WHAT CALIBRATION MEANS
A model is calibrated when its stated probabilities match observed frequencies — when it says 70% confident, it should be right 70% of the time. Calibration is a separate property from accuracy: a model can be highly accurate but badly calibrated, or vice versa.
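One common way to put a number on this definition is expected calibration error (ECE): group predictions into confidence bins, then take the weighted average gap between stated confidence and observed accuracy in each bin. The sketch below is a minimal illustration, not a reference implementation. The function name, the equal-width binning, and the toy data are all assumptions made for the example.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Weighted mean of |mean confidence - accuracy| over equal-width bins."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("need at least one prediction")
    if any(c < 0.0 or c > 1.0 for c in confidences):
        raise ValueError("confidences must lie in [0, 1]")

    # Assign each prediction to a bin; a confidence of 1.0 goes in the top bin.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(mean_conf - accuracy)
    return ece


# Toy data (invented for illustration).
# Model A: 90% accurate, but claims 100% confidence every time.
model_a = expected_calibration_error([1.0] * 10, [True] * 9 + [False])
# Model B: only 60% accurate, but says 60% and is right 60% of the time.
model_b = expected_calibration_error([0.6] * 10, [True] * 6 + [False] * 4)

print(f"A: {model_a:.2f}")  # A: 0.10
print(f"B: {model_b:.2f}")  # B: 0.00
```

Model A is the more accurate of the two, yet Model B is the better calibrated, which is the distinction drawn above.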
WHY MODELS HEDGE BADLY
Pretrained language models are often reasonably well calibrated on raw next-token probabilities. Post-training with reinforcement learning from human feedback (RLHF), the step that makes them helpful and polite, tends to degrade that calibration. Raters tend to reward confident, fluent answers, so the model learns to sound sure even when its underlying probability is mixed.
THE AUTOMATION BIAS PRECEDENT
Long before chatbots, aviation and medical researchers documented automation bias: people defer to automated outputs even when their own judgment is correct and the evidence that the system is wrong is in plain view. The 1995 grounding of the cruise ship Royal Majesty is the canonical case: the crew kept trusting the GPS-driven position display for roughly 34 hours after the antenna cable disconnected.
THE FLUENCY HEURISTIC
Decades of psychology research show that people use fluency, meaning how easily information is processed, as a proxy for truth. A grammatical, well-structured sentence reads as more credible than a hesitant one, regardless of content. LLMs are fluency machines: almost everything they produce is grammatical and well structured, so their output triggers this heuristic constantly.
WHERE IT BITES HARDEST
The stakes scale with the consequence of acting on a wrong answer. A wrong recipe wastes dinner; a wrong drug interaction can kill a patient, and a wrong case citation can get a lawyer sanctioned. Studies of clinicians using diagnostic AI find that they over-accept confident-but-wrong suggestions and under-accept correct ones flagged as uncertain, the exact opposite of the intended workflow.
THE INTERFACE PROBLEM
Even when a model emits a hedge ('I'm not sure, but…'), interfaces often strip or bury it. Chat UIs tend to display the answer prominently and the caveat as small grey text. Downstream applications often discard the uncertainty signal entirely and pass along only the most likely output. The calibration that exists in the model rarely survives to the human.
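The failure mode described here can be sketched in a few lines. The example below is hypothetical throughout: `ModelAnswer`, both render functions, and the 0.7 threshold are names and values invented for illustration, not part of any real chat interface or library. It contrasts a downstream step that keeps only the answer text with one that carries the caveat and confidence through to what the user sees.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelAnswer:
    """Hypothetical container for a model reply and its uncertainty."""
    text: str
    confidence: float              # assumed to be a probability in [0, 1]
    caveat: Optional[str] = None   # the hedge the model emitted, if any


def render_lossy(answer: ModelAnswer) -> str:
    """The failure mode: only the most likely output is passed along."""
    return answer.text


def render_with_uncertainty(
    answer: ModelAnswer,
    low_confidence_below: float = 0.7,  # arbitrary threshold, an assumption
) -> str:
    """Put the uncertainty in front of the answer instead of dropping it."""
    parts = []
    if answer.confidence < low_confidence_below:
        parts.append(f"[Low confidence: {answer.confidence:.0%}]")
    if answer.caveat:
        parts.append(answer.caveat)
    parts.append(answer.text)
    return " ".join(parts)


reply = ModelAnswer(
    text="These two drugs are safe to combine.",
    confidence=0.55,
    caveat="I'm not sure about this.",
)

print(render_lossy(reply))
print(render_with_uncertainty(reply))
```

The design point is only that uncertainty travels in the same structure as the answer, so a later stage has to drop it deliberately instead of losing it by default.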