
AI Prose Collapses Writing Styles

New York: AI prose now triggers human-detection fatigue online.

A few models trained on the same corpus have collapsed professional prose into one recognizable style.

Writers avoiding AI patterns are now flagged as AI by audiences conditioned to the same style.

Sources: 404 Media

Background

THE MECHANISM

Large language models are trained to predict the next token, and decoding favors the most probable one. Averaged across billions of examples, the most probable choice is by definition the most common one. The model's natural attractor is the mode of its training data: not the average of human writing, but its single most common form.
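A minimal sketch of that pull toward the most common choice, assuming a made-up five-word next-token distribution (the words and probabilities are hypothetical, not measured from any model). Greedy decoding always returns the most frequent option, and a sampling temperature below 1 shifts even more probability onto it.

```python
import math

def apply_temperature(probs, temperature):
    """Rescale a next-token distribution; temperature < 1 sharpens it."""
    logits = {tok: math.log(p) / temperature for tok, p in probs.items()}
    peak = max(logits.values())  # subtract the max for numerical stability
    weights = {tok: math.exp(value - peak) for tok, value in logits.items()}
    total = sum(weights.values())
    return {tok: w / total for tok, w in weights.items()}

def greedy_choice(probs):
    """Greedy decoding: always pick the single most probable token."""
    return max(probs, key=probs.get)

# Hypothetical next-token distribution, for illustration only.
next_token_probs = {
    "delve": 0.40,
    "dig": 0.25,
    "look": 0.20,
    "rummage": 0.10,
    "spelunk": 0.05,
}

sharpened = apply_temperature(next_token_probs, temperature=0.5)

print("greedy pick:", greedy_choice(next_token_probs))
for token, p in next_token_probs.items():
    print(f"{token:8s} original={p:.2f} sharpened={sharpened[token]:.2f}")
```

The rarer words never disappear from the distribution, but the sharper it gets, the less often they are chosen.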

THE SHARED CORPUS

The major frontier labs train on overlapping slices of the same internet — Common Crawl, Wikipedia, Reddit, Stack Exchange, books from the same shadow libraries. Different architectures fed the same diet converge on similar outputs. The 'house style' is the internet's center of gravity, not any one company's voice.

RLHF NARROWS FURTHER

After pretraining, models are tuned with reinforcement learning from human feedback. A few thousand contractors rate outputs; the model learns what those raters reward. Polite hedging, balanced both-sides framing, the em-dash transition, the 'It's not just X — it's Y' construction: these are not invented by the model, they are amplified by the rating pool's preferences.

THE TELLS

Audiences trained to spot AI now flag specific tics: em-dashes between independent clauses, tricolon openers, 'delve' and 'tapestry,' the antithesis pivot ('not merely X, but Y'), reflexive caveats. None of these are AI inventions — they are register markers of formal English the model overuses.
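To make the tells concrete, here is a naive counter for three of them, written as a sketch. The pattern names and regular expressions are assumptions chosen for illustration; this is not how any real detector works, and because these markers are ordinary formal English, a nonzero count says nothing reliable about who wrote a text.

```python
import re

# Hypothetical patterns for three of the tics listed above.
TELL_PATTERNS = {
    # The em-dash character, written as an escape.
    "em_dash": re.compile("\u2014"),
    # 'delve' and 'tapestry', including common inflections.
    "delve_or_tapestry": re.compile(
        r"\b(?:delv(?:e|es|ed|ing)|tapestr(?:y|ies))\b", re.IGNORECASE
    ),
    # The antithesis pivot: 'not merely/just/only ... but' within one sentence.
    "antithesis_pivot": re.compile(
        r"\bnot (?:merely|just|only)\b[^.!?]*?\bbut\b", re.IGNORECASE
    ),
}

def count_tells(text):
    """Return how many times each pattern matches the text."""
    return {name: len(pattern.findall(text)) for name, pattern in TELL_PATTERNS.items()}

sample = "We delve into a rich tapestry \u2014 not merely a trend, but a shift."
counts = count_tells(sample)
print(counts)
```

A careful human writer can produce the same sentence, so the counter fires identically on human prose.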

THE FEEDBACK LOOP

AI output is now a meaningful share of new text on the web. Next-generation models train on a corpus partly written by previous-generation models. Researchers call this model collapse: the variance of the training distribution shrinks with each cycle, and the output tightens around the mode.
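The shrinking variance can be illustrated with a toy simulation. It assumes a one-dimensional Gaussian as a stand-in for "style", with each generation fitted only to a small sample drawn from the previous generation's fit. The function name, sample size, and generation count are hypothetical choices; real training pipelines are far more complex, so this shows the statistical effect only.

```python
import random
import statistics

def simulate_collapse(generations=300, sample_size=20, seed=0):
    """Repeatedly refit a Gaussian to samples drawn from the previous fit.

    Returns the fitted standard deviation at each generation, starting
    with the original value at index 0.
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # generation 0: stand-in for human-written text
    spreads = [sigma]
    for _ in range(generations):
        # "Publish" a finite sample of the current model's output...
        samples = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        # ...then train the next model on that sample alone.
        mu = statistics.fmean(samples)
        sigma = statistics.pstdev(samples)  # maximum-likelihood estimate
        spreads.append(sigma)
    return spreads

spreads = simulate_collapse()
print(f"spread at generation 0:   {spreads[0]:.3f}")
print(f"spread at generation {len(spreads) - 1}: {spreads[-1]:.3g}")
```

Each refit loses a little of the tails on average, and because the loss compounds, the fitted spread drifts toward zero over many generations even though no single step looks dramatic.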

THE HUMAN COST

When a style becomes the AI style, humans who naturally write that way get accused of using AI. Non-native English writers, who learned formal register from textbooks, get flagged more often than native speakers writing casually. Studies of AI-detection tools have documented false-positive rates above 50% on non-native prose.