Agent Reliability Erodes Review Discipline

San Francisco: Simon Willison, developer and AI blogger, no longer reviews AI-written production code.

He argues that as agents grow more reliable, the habit of reviewing their output atrophies, and close review is the very skill that edge cases demand most.

Willison says the typical incident is shifting from errors the AI made to humans approving AI output that nobody fully read.

Sources: Hacker News

Background

THE IRONY OF AUTOMATION

Lisanne Bainbridge named this in 1983: the more reliable an automated system becomes, the rustier the human supervisor gets — and the supervisor is only kept in the loop precisely for the cases the automation cannot handle. Reliability erodes the very vigilance it depends on.

THE PRECEDENTS

Air France 447 (2009) is the canonical industrial case: the autopilot disengaged and a crew out of practice at manual flying mishandled the aircraft. The Boeing 737 MAX MCAS crashes (2018 and 2019) are often cited alongside it, although there the automation itself acted on faulty sensor data and the crews had little chance to diagnose it. Tesla Autopilot inattention crashes are the consumer version. The common thread: high reliability, dulled supervision, edge-case failure.

WHAT CODE REVIEW IS FOR

Review catches two distinct things: defects and drift. Defects are bugs an author missed; drift is code that works but violates an invariant the codebase depends on — a security boundary, a concurrency assumption, a data contract. Agents are getting good at the first; the second still requires a reader who holds the system in their head.
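The difference between a defect and drift is easier to see in code. The sketch below is purely illustrative: the `Doc` type, the in-memory `STORE`, the tenant names, and both functions are hypothetical. It assumes a codebase whose invariant is that every read goes through one tenant-scoped function. The new helper returns correct output for its happy-path test, so nothing looks like a bug, yet it quietly bypasses the security boundary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: int
    tenant: str
    body: str


# Hypothetical in-memory store standing in for a database.
STORE = {
    1: Doc(1, "acme", "roadmap"),
    2: Doc(2, "globex", "payroll"),
}


def get_doc(tenant, doc_id):
    """Sanctioned read path: every read is scoped to the caller's tenant."""
    doc = STORE.get(doc_id)
    if doc is None or doc.tenant != tenant:
        raise PermissionError(f"{tenant} may not read document {doc_id}")
    return doc


def export_bodies(tenant, doc_ids):
    """New helper. Works for the happy path, but it reads STORE directly
    and never consults `tenant`, so the invariant above is skipped."""
    return [STORE[i].body for i in doc_ids]


# The test that ships with the change passes:
happy_path = export_bodies("acme", [1])

# The drift: acme can now read globex's document.
leaked = export_bodies("acme", [2])

try:
    get_doc("acme", 2)
    boundary_held = False
except PermissionError:
    boundary_held = True
```

A test suite written against `export_bodies` alone would stay green. Only a reader who knows that `get_doc` is the sole sanctioned read path would flag the direct access to `STORE`.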

THE SKILL ATROPHY LOOP

Reading code closely is how engineers build the mental model that lets them review well in the first place. Skip the reading because the agent is usually right, and the model degrades. Six months in, the reviewer cannot tell good output from plausible output — the exact failure mode Willison is naming.

WHO BEARS THE LIABILITY

In practice, the human who approved the merge owns the defect, not the model vendor. License terms for coding assistants typically disclaim warranty on output, and the change-management controls audited under SOC 2 and ISO 27001 expect a named person to have reviewed and approved each change. 'The agent wrote it' is not a defense in a postmortem, and is unlikely to be one in a courtroom.

THE INCIDENT SHIFT

Early LLM coding incidents were hallucinated APIs and obviously wrong logic — the kind a reviewer catches in seconds. The newer class is subtler: plausible code that passes tests but encodes a wrong assumption about a rate limit, a permission model, or a race condition. The bug is in what the agent didn't know to ask, and the reviewer didn't think to check.
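A minimal sketch of the newer incident class, under stated assumptions: the `Throttle` class, the quota of 5 calls per minute per API key, and the three-worker deployment are all hypothetical. The throttle is correct in isolation and its unit test passes. The wrong assumption is that the quota applies per process, when in this scenario it applies per key shared across workers.

```python
import time
from collections import deque


class Throttle:
    """Sliding-window throttle: at most `limit` calls per `window` seconds."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.calls = deque()

    def allow(self):
        now = self.clock()
        # Drop timestamps that have fallen out of the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) < self.limit:
            self.calls.append(now)
            return True
        return False


def calls_admitted(throttles, attempts_each):
    """Count how many calls the given throttles let through in total."""
    return sum(t.allow() for t in throttles for _ in range(attempts_each))


# Assumed upstream quota: 5 calls per minute per API key.
QUOTA = 5


def frozen_clock():
    # Freeze time so the example is deterministic.
    return 0.0


# The unit test that came with the change: one throttle never exceeds the quota.
single = calls_admitted([Throttle(QUOTA, 60, frozen_clock)], attempts_each=10)

# Production (assumed): three workers share one API key, each with its own throttle.
workers = [Throttle(QUOTA, 60, frozen_clock) for _ in range(3)]
fleet = calls_admitted(workers, attempts_each=10)
```

Here `single` stays within the quota while `fleet` admits three times as many calls against the same key. No line of the diff is wrong on its face; the error lives in a deployment fact the agent was never told and the reviewer did not think to check.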