Supporters of Marcus Endicott’s Patreon can access weekly or monthly consultations on this topic.

The Schism

In the last days of December 2020, Dario Amodei walked out of OpenAI. He did it quietly. There was no open letter, no thread announcing a principled stand, none of the public theater that had come to accompany departures in an industry that increasingly conducted its quarrels in front of an audience. He had spent close to five years there, rising to vice president of research and presiding over the work that produced GPT-2 and GPT-3, the models that had made the company's name. One detailed retrospective fixes his exit at December 29; the precise date is less certain than the pattern that followed it. Over the next several weeks a cluster of his colleagues left as well. By the accounting of trade coverage at the time, at least nine other employees followed him out, most of them between December 2020 and January 2021. The people who had built OpenAI's scaling apparatus were leaving the building, and they were doing it together.

What they were leaving for took legal shape early in the new year. The company they formed was incorporated as Anthropic, PBC, a Delaware public-benefit corporation, and its California registration was filed on February 3, 2021. The founders themselves tend to date the founding to "the beginning of 2021," and January has become the convenient shorthand. The distinction between the leaving and the founding is worth preserving, because the two are easily collapsed into a single dramatic act, and the reality is quieter and more deliberate than that: a group of researchers exited one organization at the end of one year and spent the opening weeks of the next quietly assembling another.

There were seven of them at the center, and the roster reads like a reunion of the people who had formalized the modern scaling enterprise. Dario Amodei became chief executive. His sister Daniela, who had run safety and policy at OpenAI, became president, and the division of labor between them would shape the company: Dario the public face of research and risk, Daniela the architect of operations and the commercial discipline that would have to fund the research. Tom Brown, the lead engineer on GPT-3, came across; so did Sam McCandlish, a co-author of the scaling-laws work who would become chief architect, and Jared Kaplan, the Johns Hopkins theoretical physicist whose instinct for power laws had helped turn the behavior of large models into something resembling a predictable curve, and who became chief science officer. Jack Clark, OpenAI's former policy director and a onetime Bloomberg journalist, and Chris Olah, whose work tried to make the interior of a neural network legible rather than opaque, rounded out the seven. A few accounts add Ben Mann as an eighth.

The story that hardened around this exodus was a schism over safety — a band of true believers fleeing a company that had grown too commercial, too entangled with Microsoft, too willing to ship. The shape of that story is not invented. The reporters who covered Anthropic closely, among them Kevin Roose at the New York Times and Dylan Matthews at Vox, wrote that the group had worried the company was becoming too commercial, and that framing has stuck. But it is only half of what was said, and the founders themselves told it differently. In their own contemporaneous account, the departure was not primarily a denunciation. It was, in Daniela's phrasing, the chance to gather a particular group of people to make a "focused research bet" on building AI systems that were steerable, interpretable, and reliable, with humans at the center of them. Dario described the same thing as a focused bet on scaling and safety together — a conviction, formed in the wake of GPT-2 and GPT-3, that pouring more compute into these models would keep making them better, and that scaling alone, without alignment, was not safe to pursue.

He was even sharper, later, about what the move was not. Asked directly whether the group had left because it disliked the Microsoft arrangement, he answered with a single flat word: "False." The deal that secured Microsoft an exclusive license to GPT-3 in the autumn of 2020 sits in the background of the story, and it is tempting to assign it the causal weight, but the man at the center of the departure has refused to let it carry that load. What he described instead was something less cinematic and more honest about how such splits actually happen — that it is, as he put it, unproductive to keep arguing with someone else's vision when you have a different one. The temptation to read a villain into the founding, to locate the break in a loss of trust in Sam Altman, is real, but the language that supports it belongs largely to retrospectives published years afterward, not to the record of 2020 and 2021. The cleaner account, and the one the primary sources support, is that a group of people who shared a specific bet decided they wanted an organization built entirely around it.

The name they chose pointed at the orientation rather than the technology. "Anthropic" means, roughly, pertaining to human beings, and the company described itself plainly as an AI safety and research company whose purpose was to make the fundamental advances that would let it build more capable and reliable systems and then deploy them in ways that benefited people. No single founder quote canonically explains the word; it evokes a human-centered stance more than it defines one.

The bet needed money, and in May 2021 it got a first installment. Anthropic announced a $124 million Series A led by Jaan Tallinn, the Skype co-founder, with participation from James McClave, the Facebook co-founder Dustin Moskovitz, the Center for Emerging Risk Research, and the former Google chief executive Eric Schmidt, at a post-money valuation of roughly $550 million. The investor list is worth reading slowly, because several of those names — Tallinn, Moskovitz, the Swiss research center — belong to the world of effective altruism and longtermism, the intellectual milieu preoccupied with low-probability, high-stakes risks to humanity's future. That milieu is part of Anthropic's lineage and helps explain both its funding and its founding anxieties; it is also a subject that invites editorial heat from every direction, and the plainest thing to say is the most accurate: the money came substantially from people who took existential risk from AI seriously, and the company was built by people who did too.

The deeper continuity ran through the work itself. Anthropic concentrated, in one building, the people who had built the empiricist scaling tradition and then bent that tradition back on itself. The scaling laws were Kaplan and McCandlish; GPT-3 was Brown; reinforcement learning from human preferences traced to a 2017 paper Dario himself had co-authored. The schism, properly understood, was not scaling against safety. It was the wager that the two had to be pursued together, by the same people, inside a single mission.

The proof of that wager arrived at a moment the founders could not have scripted better. On November 30, 2022, OpenAI released ChatGPT, and within weeks the world discovered the chatbot. About two weeks later, on December 15, Anthropic's researchers posted a paper titled "Constitutional AI: Harmlessness from AI Feedback," with Jared Kaplan as corresponding author and more than fifty names attached. Its method replaced the human labels that governed a model's harmfulness with feedback generated by the model itself, guided by a short written set of principles — a constitution. A model would critique and revise its own responses against those principles, then learn from its own judgments of which responses were more harmless, a technique that came to be called reinforcement learning from AI feedback. The paper was careful about its own modesty: it proposed to finetune models toward harmlessness using, as it put it, "of order ten simple principles, stated in natural language," where conventional methods leaned on tens of thousands of human labels. It was not, despite the temptation to say so, a machine teaching itself with no human input; humans still supplied the judgments of helpfulness, and only harmfulness was automated. It was a hybrid, and it was explicitly an extension of the human-feedback lineage Anthropic's founders had helped invent.

The juxtaposition is the whole chapter in miniature. As the public met the chatbot and the commercial explosion began, the group that had quietly left the year before published, with no comparable fanfare, an argument for how to make such a thing behave. The schism that produced Anthropic was not an ending but a recurrence — one group deciding that another could not be trusted with the most powerful technology, the same motion that had once produced OpenAI itself. The road to ChatGPT, it turns out, kept forking at exactly the places where its builders most agreed on how dangerous the destination might be.

=> The Doubters

Page updated

Google Sites

Report abuse