Chapter 3: Literature Review

The dream of a machine that could hold up its end of a conversation is older than most of the people now having conversations with one. In 1966, at the Massachusetts Institute of Technology, Joseph Weizenbaum built a program that did little more than rearrange a person's own words into questions, and discovered to his unease that people confided in it anyway. ELIZA could not understand a sentence; it matched patterns and reflected them back, and yet the people who typed at it felt understood. Weizenbaum had stumbled onto something that would haunt the field for the next sixty years: the readiness of human beings to meet a machine halfway, to supply the mind they could not find. The tendency now bears his program's name, the ELIZA effect. It is the foundational fact of the whole enterprise, and no amount of engineering has ever made it go away.

What followed ELIZA was less a science than a craft. A Stanford psychiatrist gave the world PARRY, a program built to imitate paranoid reasoning, and the two were eventually introduced to each other over the early internet, a chatbot talking to a chatbot while researchers watched. The Turing Test, that thought experiment about whether a machine could be mistaken for a person, hardened into an annual ritual when the Loebner Prize began offering medals and prize money to whichever program could best fool a panel of judges. For three decades the prize was the field's carnival, equal parts serious benchmark and parlor trick, and the bots that competed in it were almost all built the same way: by hand, rule upon rule, a human author anticipating every turn a conversation might take and writing a response for it.

That authorial tradition reached its most refined form in the markup languages and chatbot frameworks that grew up around it. A bot like the much-decorated Mitsuku was not intelligent in any modern sense; it was the product of years of patient hand-authoring, a vast library of patterns and replies assembled by a single dedicated craftsman. The work was painstaking and genuinely skilled, closer to writing than to programming, and the best of it could carry a conversation a remarkably long way. This was the world as it stood when the literature on virtual humans was being assembled: a mature tradition of hand-built dialogue, decades deep, with its own competitions, its own celebrated authors, its own sense of where the frontier lay.

While some built the voices, others were building the faces. The ambition was never only to make a machine talk but to make it appear, to give the conversation a body and an expression. Early commercial systems put an animated face and a synthetic voice on a chatbot and called it a virtual human, and the field pressed steadily toward realism from there. The most striking work came out of New Zealand, where a former visual-effects artist who had won Academy Awards for his technical work turned to building digital faces driven not by animation curves but by simulated nervous systems: a virtual infant, for one, whose expressions emerged from a model of how a real brain might produce them. The company he co-founded promised photorealistic digital people who could look you in the eye, and for a while it looked like the future. The same instinct ran through the research world's grandest projects, including a twenty-year effort in Switzerland to reconstruct the circuitry of a mammalian brain in software, neuron by simulated neuron.

The public, meanwhile, was running its own informal version of the uncanny-valley test on social media, where computer-generated influencers accumulated millions of followers and brand deals while their audiences argued over whether they were charming or unsettling. The most famous of them passed for human often enough to make the question interesting, which was rather the point. If a synthetic face could hold an audience on a phone screen, the believability problem was being solved in public, one post at a time, without anyone needing a laboratory.

Then the ground moved. Within roughly a year and a half of all this being carefully documented, a chatbot arrived that had not been authored at all in the old sense. It had been trained, on a corpus larger than any individual could read in a thousand lifetimes, and it could discuss nearly anything without a human having written a single rule for the occasion. The craft tradition that had been refined for half a century — the patterns, the markup, the hand-built replies — was not improved by this. It was bypassed. The skill remained real and the libraries still existed, but they had become, almost overnight, a legacy art, the way letterpress printing is a legacy art: admirable, specialized, and no longer how the work gets done.

The other surprise was about venue. The expectation had been that virtual humans would step out of the screen and into mixed reality, that the conversation and the body would merge into the physical world through headsets and spatial displays. That is not where they arrived. The conversational virtual human of the new era turned up almost entirely on flat screens — on websites, on kiosks, on phones — explicitly marketed as needing no headset at all. Consumer virtual reality, for its part, stalled, and the immersive future kept receding into the long horizon where it had always lived. The convergence everyone had predicted did happen: the talking and the appearing fused into a single believable presence. It simply came through a different door than the one everyone was watching.

The reckoning has been uneven. The annual prize that once crowned the best hand-built bot held its final contest and quietly closed. The decades-long effort to simulate a brain reached its conclusion and wound down. The New Zealand company that had promised photorealistic digital people, having raised well over a hundred million dollars and signed marquee clients, slid into receivership, its surviving products quietly rebuilt on top of the very language models that had upended the field. Others adapted and prospered, throwing out their hand-authored dialogue layer and orchestrating large language models instead, keeping the beautiful faces and replacing the brittle scripts behind them. The research institutes that had spent years building clinical virtual humans began folding in the new models carefully, mindful that a system meant to help someone in distress cannot be allowed to improvise badly.

What survives all of this is the oldest thing in the story. The machines can talk now in ways their first authors never imagined, and they wear faces those early animators would have envied, but the reason any of it works is still the reason ELIZA worked in 1966. People lean in. They supply the understanding. The craft was outgrown, the venue was wrong, the companies rose and fell — and the human willingness to meet a machine halfway, which was always the real engine, never changed at all.
