Supporters of Marcus Endicott’s Patreon can access weekly or monthly consultations on this topic.

The Doubters

One of the four authors of the most consequential criticism ever leveled at large language models does not exist. On the published paper, presented at a computer-science conference in March 2021, her name is Shmargaret Shmitchell and her institutional affiliation is The Aether. The joke is bitter and deliberate. The author behind the pseudonym is Margaret Mitchell, who founded and co-led Google's Ethical AI team, and by the time the paper appeared in print she had been fired and had no institution left to list. A byline can encode a firing. That one does, and the strangeness of it is the cleanest way into a story that the field, and much of the journalism about the field, has tended to compress into a single event when it was in fact at least three.

The paper is "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?", and its central argument is not about performance. It is about meaning. A language model, the authors wrote, is "a system for haphazardly stitching together sequences of linguistic forms it has observed in its vast training data, according to probabilistic information about how they combine, but without any reference to meaning: a stochastic parrot." The force of the claim lies in a point about the human reader rather than the machine. We are built to read intention into language; we extend interpretive generosity to any fluent text, whether or not anything behind it understands. Fluency plus that generosity produces an illusion of comprehension. The parrot does not know what it is saying. We supply the knowing, and mistake it for the parrot's.

This was an old question wearing new clothes, and the chapter cannot be understood without seeing the lineage. A year earlier, Emily Bender and Alexander Koller had won a best-paper award for a thought experiment about an octopus. Imagine two people stranded on separate islands, passing messages through an undersea cable. A hyper-intelligent octopus taps the line, learns the statistical regularities of their conversation, and eventually impersonates one of them well enough to fool the other—right up until a genuine emergency arrives, a bear on the beach, a tool that must be improvised, and the octopus, having only ever seen the shapes of words and never the world they point at, has nothing useful to offer. Their thesis, stated plainly, was that a system trained only on form has no path to meaning. The octopus and the parrot are the same animal in different costumes, and they reach back across the book to the oldest argument in it—the quarrel between those who believed language could be captured as structure and rules and those who believed it could be learned from the raw distribution of forms. Bender was re-posing that quarrel for the age of the transformer.

Around this argument the paper assembled four practical worries that had nothing to do with metaphysics. The first was cost: training these models consumed enormous energy, and the authors noted estimates that a single large model's architecture search could emit on the order of 284 metric tons of carbon, with the further point that the people most likely to bear the consequences of a warming climate were the least likely to benefit from the models. The second was the data itself. Web-scale corpora were too vast to inspect, and what cannot be inspected cannot be vouched for; the authors called this "documentation debt," and warned that size is not diversity—a larger scrape of the internet overrepresents whoever already dominates the internet. The third was opportunity cost, the worry that chasing leaderboard gains was pulling talent and money away from harder questions about actual understanding. The fourth was the danger of fluent synthetic text loosed into the world: deception, automated disinformation, the amplification of bigotry, and again that illusion of meaning, now operating at scale.

It would be tidy to say Google banned the paper, and the tidy version is wrong. Google's stated objection was that the work had not cleared its internal review bar and had "ignored too much relevant research" on efficiency and bias mitigation; the demand was that the authors withdraw it or strip the Google names from it. The paper was not suppressed. It went to peer review and was published regardless. What broke was not the paper but the people, and they broke on different days for different reasons. Timnit Gebru—Eritrean-Ethiopian computer scientist, Stanford PhD advised by Fei-Fei Li of ImageNet fame, co-author of the landmark "Gender Shades" study that found commercial face-classification systems misread darker-skinned women at rates up to 34.7 percent against 0.8 percent for lighter-skinned men—left in early December 2020. Whether she resigned or was fired is the part the book must refuse to settle. Google's account, in an email from research chief Jeff Dean, was that she had set conditions, said she would otherwise leave, and that "we accept and respect her decision to resign from Google." Gebru's account, posted within days, was blunter: "So I've been immediately fired :-)". Both versions are documented; neither has been independently adjudicated in public; and the unresolvability is itself the point about who holds power over research inside a company that builds the thing being researched.

Mitchell's firing came later and for a different reason. After a roughly five-week lockout, Google terminated her on February 19, 2021, citing security violations after she had used automated scripts to search her own corporate email for evidence relating to Gebru's treatment. Two exits, ten weeks apart, with separate proximate causes, and then, weeks after both authors were gone, the paper itself was formally presented at the conference. Three events, three dates. Collapsing them into one tale of a censored paper is the error the careful reader should learn to catch.

It is worth naming the under-credited fourth author who was not a pseudonym. Angelina McMillan-Major was a graduate student at the University of Washington, and her own work on documenting datasets in collaboration with the communities they describe is the intellectual seed of the paper's documentation-debt argument. The recurring injustice of science writing is to let the junior name evaporate, and the record is better when it does not.

The metaphor that gave the paper its title is contested on the merits, and the book should say so rather than treat it as settled. Some researchers argue that "haphazardly… without any reference to meaning" describes the older n-gram models more snugly than it does a transformer, and that the phrase fuses two separate claims—no grounding in the world and no intention behind the words—into one slogan. Bender herself has since stressed that her real target was never the machinery but the conduct around it: the data taken without consent, the labor exploited, the corpora assembled and left undocumented. The parrot, in other words, was always partly an argument about people.

That distinction matters for placing this chapter against its neighbors, because the same season produced a second set of scaling skeptics whose worry was the opposite shape. In December 2020, the very month Gebru left Google, Dario Amodei left OpenAI, and within a year he and others would build Anthropic around the fear that capability was outrunning safety—that the danger lay ahead, in machines too powerful to control. The doubters in this chapter feared something already present and measurable: the carbon, the biased data, the exploited workers, the people misclassified and disinformed right now. One critique asks what happens if the machine becomes too capable. The other asks who is harmed by the machine we already have, and who profits. They are routinely flattened into a single figure of the "AI doomer," and the flattening is a category error—Gebru is a pointed critic of the existential-risk worldview, not its ally. Same era, adjacent skepticisms, and opposite institutional fates: one set of skeptics left to found a celebrated, well-funded lab; another was pushed out the door. The parrot they conjured did not go quiet. By the time a chatbot built on exactly the models they had warned about arrived in November 2022, their phrase had become the argument the whole industry had to answer.

=> The Reckoning

Page updated

Google Sites

Report abuse