Supporters of Marcus Endicott’s Patreon can access weekly or monthly consultations on this topic.
On the afternoon of November 30, 2022, a small team at OpenAI pushed a chat interface onto the public internet and braced for almost nothing. They called it a research preview, the cautious phrase engineers reach for when they want to ship something without promising it will hold together. The system underneath was a tuned version of GPT-3.5, a model that had existed in some form for two years; what was new was mostly the wrapper — a clean text box, a blinking cursor, and a model willing to answer in ordinary conversational English. The company framed it as a way to collect feedback, a quiet experiment that might draw a respectable trickle of curious researchers and tinkerers over the holidays. People who worked on it have since said they expected a muted response. It was a sibling, internally, to an instruction-following model the company had released earlier that year, and almost no one inside the building seems to have guessed what was coming.
What came was a kind of weather event. By OpenAI's own account, the service crossed a million users within five days — a figure Sam Altman announced on Twitter that first week, and which has been repeated so often since that it is easy to forget it was a founder's celebratory tally rather than an audited number. Two months later, analysts at the Swiss bank UBS put the total at roughly a hundred million monthly users and called it the fastest-growing consumer application they had ever tracked, a claim built on third-party traffic estimates rather than figures from OpenAI's own ledgers. Both numbers are best understood as estimates that everyone found plausible, not as facts certified by anyone with access to the servers. But the direction of travel was not in doubt. Something that was meant to be a low-key demo had become, within a season, the most talked-about piece of software on earth — written about in newspapers that had never covered a machine-learning paper, banned in school districts, debated in boardrooms, typed into by people who had no idea what a language model was and did not need to.
The strain showed in the obvious place. The systems behind ChatGPT buckled under the load, throwing up apologetic error pages and capacity warnings for months. Altman would later become known for a vivid line about his hardware giving out under demand — that the company's GPUs were "melting" — though that particular cry came in the spring of 2025, when a wave of users discovered the model could turn their photographs into illustrations and overwhelmed the image servers in a matter of days. It is too good a phrase to belong only to that episode, and it has migrated backward in the retelling onto the launch itself. The truth is duller and stranger: from almost the first hour, the demand for a free research preview outran what its makers had built to serve it, and it kept outrunning them.
It is tempting, standing at this moment, to treat it as a rupture — a thing that arrived from nowhere and split history into before and after. The whole burden of this book is that it was nothing of the kind. ChatGPT was not a bolt from the blue but the visible surfacing of a conversation that had been running, quietly and often unfashionably, for the better part of a century. The chat box that a hundred million people typed into in early 2023 was the endpoint of a chain of arguments and half-forgotten papers and stubborn careers that reaches back through the scaling experiments of the 2010s, through a Munich master's thesis on why neural networks forget, through a translation memo written in 1949, to a pair of ideas set down at the very dawn of the computer age. The eruption felt sudden because almost none of that history was visible from the surface. The people who typed "write me a poem" and watched the words appear had no reason to know that they were pulling on a thread first picked up before most of them were born.
To see where the thread begins, we have to go back to 1948, and to two men who had taken tea together during the war and argued about whether a machine could ever be said to think. In that year, working at Bell Labs, Claude Shannon published the paper that founded information theory — and, almost as an aside, built a crude statistical model of written English, predicting each letter from the ones before it, that prefigures with uncanny directness the way today's language models work. Two years later, across the Atlantic, Alan Turing proposed setting aside the unanswerable question of whether machines can think in favor of a question one could actually test: whether a machine could hold a conversation convincingly enough that a human judge could not tell it from a person. Neither man lived to see his idea bear fruit. But between Shannon's statistics and Turing's imitation game lies the whole problem this book is about — the problem of getting a machine to handle human language not by understanding it in any deep sense, but by modeling, with enough data and enough arithmetic, what comes next.
For most of the decades that followed, the people who believed this approach could work were a minority, and frequently a mocked one. The dominant view held that language was a matter of rules and grammar and meaning, and that statistics was a crude shortcut that could never capture it. The story of how the shortcut won — how a handful of heretics kept faith in counting and prediction through long stretches of indifference and outright winter, until the data and the machines finally caught up with them — is the story of how we got the demo that ate the world. ChatGPT was the moment the public was finally let into the room. The rest of this book is about the room itself, and the long, strange road that led there.