Supporters of Marcus Endicott’s Patreon can access weekly or monthly consultations on this topic.

Chapter 6: Discussion

Every forecast about the future of computing is, underneath, a bet about plumbing. Not about where we are going but about which pipes the future will flow through. Around the turn of the last decade, a confident prediction took shape about how the photorealistic, fully interactive virtual human would finally arrive in our lives. It rested on a three-part wager: that game engines would migrate off the desktop and into the cloud, that a high-bandwidth last mile in the form of mobile 5G would carry the result to us, and that a new class of neural processing units would put real machine intelligence on the device in our hands. Stack those three together, the argument went, and you could deliver convincing virtual people for every application anyone could conceive. Five years on, the wager has come apart in a revealing way, because the three limbs aged at wildly different rates.

The cleanest victory belonged to the smallest and least glamorous part of the bet. The forecast that processing power would push outward to the edge, into dedicated neural chips on consumer devices, turned out to be a piece of genuine foresight. Apple's Neural Engine matured quietly across generations of its silicon. Then the trend went mainstream in a way nobody could miss: Qualcomm's Snapdragon X Elite, with a neural unit rated around forty-five trillion operations a second, launched an entire PC category whose very definition required a chip capable of at least forty TOPS. AMD and Intel followed within months, each shipping laptop processors in the same class. The on-device accelerator, an afterthought in most computers a few years earlier, became the defining hardware story of the mid-2020s. Whoever called that early deserves full credit, because it was concrete, falsifiable, and correct.

The middle limb of the bet was half right, which in forecasting is instructive in its own way. The intuition that delivering rich, real-time virtual humans would demand far more bandwidth than the typical connection then offered was sound. The specific mechanism was not. Millimeter-wave 5G, the flashy multi-gigabit promise of the era, largely failed as a mass-market technology; the physics of poor penetration and short range confined it to stadiums and a handful of dense urban hotspots, and people connected to it a fraction of a percent of the time. What actually delivered measurable gains was the unglamorous mid-band spectrum, useful and real but a long way short of the gigabit dream. The genuine surprise was fixed wireless access, which found a consumer foothold by competing with cable on price rather than by enabling any immersive frontier. And the immersive content that did ship to people reached them, overwhelmingly, over home broadband, fiber, and Wi-Fi rather than over a phone. The bandwidth instinct was directionally fine. The story about how the bandwidth would arrive was a product of its moment's enthusiasm.

The third limb simply collapsed, and it took a small graveyard of platforms down with it. The notion that the game engine itself would move into the cloud, authored and hosted and consumed there rather than on a desktop, did not become the way the world works. The flagship examples named at the time are now a roll call of the departed. Sansar, the social VR platform that was supposed to embody this cloud-native future, was sold off in 2020 and faded into irrelevance. Amazon's Sumerian, pitched as a browser-based engine for building immersive scenes, was closed to new customers in 2022 and wound down entirely the following year. Amazon's Lumberyard, another contender, survived only by being open-sourced and handed to a foundation. Part of the trouble was that the original vision had quietly folded together things that were never the same. Streaming a finished game from a distant server, as Google's Stadia did before its own abrupt shutdown in early 2023, is not the same act as authoring a world in the cloud. A subscription library of locally downloaded titles is something else again. Cloud streaming as a service did endure and even thrive in places, but the engines where games and worlds actually get built, Unity and Unreal, never left the desktop at all.

The most interesting part is what filled the gap, because the destination the forecast pointed toward was reached almost exactly, just by a different road. The photorealistic, conversational virtual human did arrive. It simply traveled through a stack nobody at the time had foregrounded. The breakthrough engine was not a cloud game platform but the large language model, which turned fluent conversation from a perennial disappointment into a solved-enough problem almost overnight. Bolt that onto real-time avatar and lip-sync rendering, with tools that drive expressive digital faces from audio, and you get a virtual person who can look you in the eye and talk back. And the screen this person appears on is, more often than not, an ordinary flat one, reached through a web browser rather than a headset. The hard part turned out to be the mind, not the pipe; once the conversation became convincing, the rendering was the comparatively tractable problem.

There is a lesson in the shape of all this for anyone in the business of predicting where technology goes. Seeing the destination is rarely the hardest thing. The genuinely difficult call is the route, because the route depends on which boring, half-built infrastructure happens to be sitting around when the demand finally shows up. The future of the virtual human did not run on exotic new pipes laid specially for it. It ran on the broadband already in the wall, the graphics card already in the machine, the flat display already on the desk, and a chip on the device that, for once, somebody had seen coming.
