Hallucination
Confident wrong answers
When an LLM confidently produces information that isn't true.
Instead of saying "I don't know," an LLM may produce a plausible-sounding but wrong answer. Made-up book titles, non-existent API methods, invented citations, fabricated statistics, imaginary dates — all hallucinations.
Root cause: the model optimizes for the most plausible next token, not for truth. When the training data contains a similar pattern but lacks the specific fact, the model guesses the most likely tokens, and the result is a fluent lie. Models can be trained to say "I don't know," but training never fully eliminates the problem.
Two flavors: factual hallucination (wrong about the world) and faithfulness hallucination (output contradicts the provided context). The second is the bane of RAG systems.
A student who never studied for the exam but, instead of leaving blanks, confidently makes up an answer for every question. Some happen to be right, most are wrong, and all are written in the same self-assured tone. In short: a student who never learned to say "I don't know."
A lawyer asked ChatGPT for case law on a matter. The model produced strong arguments plus 6 case citations. The lawyer submitted them in court. The judge tried to verify — 5 of the 6 cases didn't exist. Pure hallucination. The lawyer was fined and temporarily suspended (Mata v. Avianca, 2023, a real case).
Fix: RAG is mandatory in critical paths (pull from a real case DB), require sources for every factual/numeric claim, and keep human review of model output.
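"Require sources" only helps if something actually checks them. A minimal sketch of a verification gate, assuming you can look citations up against a trusted database; `KNOWN_CASES` and all names here are illustrative placeholders, not a real legal API:

```python
# Stand-in for a lookup against a real case-law database.
KNOWN_CASES = {
    "Brown v. Board of Education, 347 U.S. 483 (1954)",
}

def split_citations(citations, known):
    """Separate model-produced citations into verified and suspect lists."""
    verified = [c for c in citations if c in known]
    suspect = [c for c in citations if c not in known]
    return verified, suspect

# Anything in `suspect` goes to a human reviewer before it leaves the building.
```

The point of the design: the model's output never reaches the user (or the court) until every factual claim has passed the gate or been reviewed.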
- Acknowledging hallucination risk exists — every LLM has it
- Verification layer is mandatory in high-stakes domains (health, law, finance)
- User-facing info always carries a 'this is AI, verify' label
- Trusting any system that claims 'no hallucinations' — it doesn't exist
- Believing a single technique (just RAG, just prompting) fully solves it
- Manual hallucination spotting — you need an automated eval pipeline
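One crude starting point for such a pipeline is a lexical faithfulness check: flag answer sentences that share few words with the retrieved context. Real pipelines use NLI models or LLM judges; the helper name and the 0.5 threshold below are illustrative:

```python
def unsupported_claims(answer_sentences, context, min_overlap=0.5):
    """Flag answer sentences with little word overlap with the context.

    Crude lexical heuristic: a sentence whose words mostly don't appear
    in the context is a hallucination candidate for closer review.
    """
    ctx_words = set(context.lower().split())
    flagged = []
    for sentence in answer_sentences:
        words = set(sentence.lower().split())
        if not words:
            continue
        overlap = len(words & ctx_words) / len(words)
        if overlap < min_overlap:
            flagged.append(sentence)
    return flagged
```

Run it over every production answer and track the flag rate over time; a sudden jump is your signal to investigate, long before a user files a complaint.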
High temperature amplifies hallucinations
Asking for factual answers at T = 0.9 can sharply raise your hallucination rate: high temperature deliberately samples less likely tokens. For factual work use T < 0.3.
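In practice this is a one-line sampling setting. A minimal sketch, assuming an OpenAI-style chat-completions payload; the parameter names may differ for your provider, and the model name is a placeholder:

```python
def factual_request(messages, model="some-model"):
    """Build a chat request tuned for factual answers: low temperature
    keeps the model on high-probability tokens instead of creative guesses."""
    return {
        "model": model,           # placeholder; use your provider's model id
        "messages": messages,
        "temperature": 0.2,       # < 0.3 for factual work, per the rule above
    }
```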
RAG reduces, doesn't eliminate
Even well-built RAG doesn't force the model to stick to context. The model can mix in its 'own knowledge' that contradicts the docs. Use strict-mode prompts: 'answer only from the provided context; otherwise say I don't know.'
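A minimal strict-mode wrapper along those lines; the wording and refusal string are illustrative, so tune them to your stack:

```python
REFUSAL = "I don't know."

def strict_context_prompt(context, question):
    """Wrap a RAG question so the model is told to use only the given context,
    and to refuse with a fixed string when the context doesn't cover it."""
    return (
        "Answer using ONLY the context below. If the context does not "
        f"contain the answer, reply exactly: {REFUSAL}\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

A fixed refusal string also makes refusals easy to detect downstream, so you can measure how often the model actually says "I don't know" instead of improvising.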
Treating confident tone as a signal
LLMs deliver every answer in the same confident tone. The 'I don't know' answer and the wrong-but-sure answer wear the same outfit. Tone isn't a confidence signal.