WHAT A HALLUCINATION ACTUALLY IS
Large language models do not retrieve facts; they predict the next token from statistical patterns in their training data. A fabricated citation with plausible author surnames, a realistic journal title, and a well-formed DOI is produced by exactly the same process as an accurate one. The model has no internal flag distinguishing the two.
WHY LEGAL AND POLICY TEXT IS THE WORST CASE
Legal and policy citations follow rigid formats (*Smith v. Jones*, 2019, paragraph 47), which LLMs reproduce flawlessly in form while inventing the substance. That surface regularity is exactly what makes fabrications hard to spot without checking each reference against a real database.
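The gap between form and substance can be shown with a minimal Python sketch. The pattern below is a hypothetical one, written only to match the citation shape used in the example above, and the second case name is invented for illustration. Both strings pass, because a format check can only confirm that a citation looks right, not that the case exists.

```python
import re

# Hypothetical pattern for citations shaped like
# "Smith v. Jones, 2019, paragraph 47". It is an assumption made for
# illustration, not a real legal citation standard.
PARTY = r"[A-Z][A-Za-z.'-]*(?: [A-Z][A-Za-z.'-]*)*"
CITATION_PATTERN = re.compile(
    rf"^{PARTY} v\. {PARTY}, (?:19|20)\d{{2}}, paragraph \d+$"
)


def is_well_formed(citation: str) -> bool:
    """Return True if the citation matches the expected surface format."""
    return CITATION_PATTERN.match(citation) is not None


# The second case name is invented; the format check cannot tell.
print(is_well_formed("Smith v. Jones, 2019, paragraph 47"))
print(is_well_formed("Harlow v. Brightwater Holdings, 2021, paragraph 112"))
print(is_well_formed("smith vs jones 2019"))
```

The first two calls return True and the third returns False. Nothing in the check touches a case-law database, which is why a well-formed fabrication survives it.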
THE MATA PRECEDENT
In 2023, two New York lawyers were sanctioned after filing a brief in *Mata v. Avianca* citing six entirely fictitious cases generated by ChatGPT. The judge's order became the canonical warning that has since been cited in courts, regulatory filings, and now government policy reviews worldwide.
THE DELOITTE PATTERN
In 2025, Deloitte agreed to refund part of the A$440,000 fee for a report to the Australian government after fabricated academic citations and a made-up federal court quote were found in the deliverable. Consultancies and ministries are running into the same failure mode independently: AI-drafted text passing through review processes designed for human-written documents.
WHY REVIEW PROCESSES MISS IT
Editors check whether a citation supports the claim it's attached to, not whether the cited paper exists. A reviewer who recognizes the journal name and finds the surrounding argument coherent will sign off. The verification step — searching each reference in a real database — was historically unnecessary because no honest author would invent sources. That assumption no longer holds.
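The missing verification step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a production checker: `KNOWN_DOIS` is a toy stand-in for a real bibliographic or case-law database, the DOIs are placeholders, and `find_unverified` and `in_toy_index` are hypothetical names. A real deployment would replace `in_toy_index` with a lookup against whatever index the reviewer trusts.

```python
from typing import Callable, Iterable, List


def find_unverified(
    references: Iterable[str], exists: Callable[[str], bool]
) -> List[str]:
    """Return the references that the supplied lookup could not confirm."""
    return [ref for ref in references if not exists(ref)]


# Toy stand-in for a real index (assumption). The DOIs are placeholders.
KNOWN_DOIS = {"10.1000/example.001", "10.1000/example.002"}


def in_toy_index(doi: str) -> bool:
    """Look up a DOI in the toy index, ignoring case and stray whitespace."""
    return doi.strip().lower() in KNOWN_DOIS


submitted = ["10.1000/example.001", "10.1000/example.999"]
unverified = find_unverified(submitted, in_toy_index)
print(unverified)  # ['10.1000/example.999']
```

The lookup is passed in as a function so that the existence check stays separate from the question of which database is authoritative. A reference that comes back unverified is not proven fake; it is flagged for a human to chase.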
THE RETRIEVAL FIX
Retrieval-augmented generation (RAG) supplies the model with retrieved documents and restricts it to citing only those, so every quotation can be checked against the retrieved text. It substantially reduces fabricated citations, though it does not eliminate them, and it is harder to deploy than a chat box, which is why most government and consulting use of LLMs still runs on the unconstrained version.
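The grounding idea can be sketched as a post-generation check: every quotation the model attributes to a document must appear in the text that was actually retrieved for that document. The Python below is a simplified illustration, not a full RAG pipeline. The function names, the document IDs, and the sample text are all hypothetical, and matching is plain substring comparison after whitespace and case normalization.

```python
import re
from typing import Dict, List, Tuple


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return re.sub(r"\s+", " ", text).strip().lower()


def ungrounded_quotes(
    quotes: List[Tuple[str, str]], retrieved: Dict[str, str]
) -> List[Tuple[str, str]]:
    """Return (doc_id, quote) pairs whose quote is not found in the cited retrieved document."""
    failures = []
    for doc_id, quote in quotes:
        source = retrieved.get(doc_id)
        needle = normalize(quote)
        # Fail if the document was never retrieved, the quote is empty,
        # or the quote does not occur in the retrieved text.
        if source is None or not needle or needle not in normalize(source):
            failures.append((doc_id, quote))
    return failures


# Hypothetical retrieved document and model output, for illustration only.
retrieved = {
    "doc-1": "The agency must publish its findings within 30 days of the review."
}
quotes = [
    ("doc-1", "publish its findings within 30 days"),  # grounded
    ("doc-1", "publish its findings within 90 days"),  # altered wording
    ("doc-7", "no such document was retrieved"),       # unknown source
]
print(ungrounded_quotes(quotes, retrieved))
```

Here the second and third quotations are reported, one because its wording does not match the source and one because its source was never retrieved. Exact matching is deliberately strict: it will also flag honest paraphrases, which is the safer failure for legal and policy text.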