China's virtual digital human industry has entered a period of rapid commercial maturation and regulatory crystallization, driven by converging forces of government policy, artificial intelligence advancement, 5G infrastructure deployment, and expanding enterprise demand. The sector's trajectory from experimental novelty to recognized component of the national digital economy can be traced through a sequence of policy instruments, market data, corporate milestones, and consumer behavior patterns that collectively describe one of the most dynamic technology-driven industries in contemporary China.
The policy architecture supporting virtual digital humans in China rests on a layered framework of national plans, sector-specific regulations, and provincial incentive programs that have accumulated steadily since 2021. The Fourteenth Five-Year Plan and 2035 Vision, approved by the National People's Congress in March 2021, formally incorporated virtual reality and augmented reality as key digital economy industries, establishing the foundational political mandate for related technology development. The State Council's "Fourteenth Five-Year" Digital Economy Development Plan, issued in January 2022, deepened this commitment by promoting the integration of AI, VR, and high-definition video technologies across social, commercial, entertainment, and exhibition applications, explicitly encouraging virtual-real interaction experiences. In October 2022, five central government departments led by MIIT jointly issued the Virtual Reality and Industry Application Integration Action Plan for 2022 through 2026, targeting key breakthroughs in 3D and immersive audiovisual technologies and enrichment of next-generation VR terminal products. These broad industrial plans were complemented by increasingly specific regulatory instruments governing the technologies that underpin virtual digital humans.
The Provisions on the Management of Deep Synthesis of Internet-based Information Services, jointly issued by the Cyberspace Administration of China, MIIT, and the Ministry of Public Security and effective January 10, 2023, defined "deep synthesis" services to include face generation, face replacement, voice synthesis, and immersive realistic scenes. The regulation required providers to add watermarks and labels to deep-synthesis content, mandated prominent labeling for AI-generated human faces and synthetic voices, and imposed real-name verification obligations on users and algorithm filing requirements on providers. These provisions were directly applicable to the creation and distribution of virtual digital human content. The Generative AI Services Interim Measures, published on July 10, 2023, by seven departments including the CAC and NDRC, extended regulatory coverage to generative AI services providing text, images, audio, and video content to the public in China, adopting a principle of balancing development with security through classification-based supervision. The Measures for Identifying AI-Generated Synthetic Content, published on March 7, 2025, and effective September 1, 2025, introduced mandatory national standard GB 45438-2025, requiring both visible "AI-generated" markers and machine-readable metadata embedded in files for all AI-generated content, directly applicable to virtual digital human video and audio.
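The dual labeling requirement described above, a visible marker plus machine-readable metadata, can be illustrated with a minimal Python sketch. The field names (`ai_generated`, `provider`, `content_id`), the marker wording, and the helper functions are hypothetical and chosen only for illustration; the actual marker text and metadata schema are those defined in GB 45438-2025, which this sketch does not reproduce.

```python
import json
from dataclasses import dataclass, asdict

# Illustrative wording only, not the marker text mandated by the standard.
VISIBLE_LABEL = "AI-generated"


@dataclass
class ImplicitLabel:
    # Hypothetical field names; the real schema is set by GB 45438-2025.
    ai_generated: bool
    provider: str
    content_id: str


def label_caption(caption: str) -> str:
    """Prepend a visible marker so viewers can see the content is synthetic."""
    return f"[{VISIBLE_LABEL}] {caption}"


def build_metadata(provider: str, content_id: str) -> str:
    """Serialize the machine-readable label as JSON for embedding in a file."""
    label = ImplicitLabel(ai_generated=True, provider=provider, content_id=content_id)
    return json.dumps(asdict(label), sort_keys=True)


def is_labeled(caption: str, metadata_json: str) -> bool:
    """Check that both the visible marker and the metadata flag are present."""
    metadata = json.loads(metadata_json)
    has_visible = caption.startswith(f"[{VISIBLE_LABEL}]")
    has_implicit = metadata.get("ai_generated") is True
    return has_visible and has_implicit


caption = label_caption("Daily market update")
metadata = build_metadata("example-provider", "clip-0001")
print(caption)
print(metadata)
print(is_labeled(caption, metadata))
```

The check deliberately requires both layers, since under the rule as summarized here a visible marker without embedded metadata, or the reverse, would not be enough.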
Two landmark draft regulations published in late 2025 and early 2026 marked a decisive shift toward sector-specific governance of virtual digital humans. The Provisional Measures on Human-like Interactive AI Services, published by the CAC on December 27, 2025, with a comment period ending January 25, 2026, was described by experts as the world's first attempt to regulate anthropomorphic AI. The draft required prominent disclosure that a user is interacting with AI, imposed a two-hour continuous use limit triggering mandatory break reminders, prohibited content encouraging suicide or self-harm and emotional manipulation, and established special protections for minors and the elderly. On April 3, 2026, the CAC published the Digital Virtual Human Information Service Management Measures in draft form, with public comments accepted through May 6, 2026. This instrument represents China's first dedicated national regulation for digital virtual humans. The draft defined "digital virtual human" as a virtual digital figure existing in a non-physical world, using computer graphics, digital image processing, or AI technologies, driven by real humans or computation, simulating human appearance with voice, behavior, interaction capabilities, or personality characteristics. The regulation established multi-department oversight with the CAC in the coordinating role and introduced provisions requiring explicit individual consent for using biometric data such as face and voice for digital human modeling, with withdrawal rights and data deletion obligations. 
The April 2026 draft also prohibited the provision of virtual family members and virtual romantic partners to minors, mandated the continuous and prominent display of a label containing the characters "数字人" (digital human) throughout any digital human service, and set penalties ranging from ten thousand to one hundred thousand yuan for violations, with fines of one hundred thousand to two hundred thousand yuan where violations cause harm to life or health.
Provincial and municipal governments have developed their own complementary policy frameworks. In August 2022, the Beijing Municipal Bureau of Economy and Information Technology issued the Beijing Digital Human Industry Innovation Development Action Plan for 2022 through 2025, China's first dedicated digital human industry policy at any level of government. The plan set targets for the city's digital human industry to exceed fifty billion yuan (500亿元) by 2025, to cultivate one to two enterprises with annual revenue exceeding five billion yuan and ten enterprises exceeding one billion yuan, and to establish ten university-enterprise co-built laboratories. The Beijing Digital Human Base, located in Chaoyang District, opened in February 2024 and had attracted forty-nine enterprises by November of that year. Shanghai issued its own Metaverse New Track Action Plan in July 2022, targeting a metaverse industry scale of three hundred fifty billion yuan (3,500亿元) by 2025 and including a Digital Human Comprehensive Enhancement Project as one of eight key initiatives. Sichuan province issued its Metaverse Industry Development Action Plan in September 2023, targeting two hundred fifty billion yuan (2,500亿元) in metaverse-related industry scale by 2025 and specifically promoting functional, service-type digital humans across consulting, education, entertainment, and healthcare.
The market size of China's virtual digital human industry varies substantially depending on the source and scope of measurement, and this variation is itself an important characteristic of the sector. iiMedia Research, the most frequently cited commercial research firm for this industry, reported in its 2024 white paper that the core market reached 20.52 billion yuan (205.2亿元) in 2023 and projected it to reach 48.06 billion yuan by 2025, while the broader "driven market" encompassing adjacent and peripheral industries reached 333.47 billion yuan in 2023 with a projection of 640.27 billion yuan for 2025. In its 2025 report, published in November of that year, iiMedia put the 2024 core market at 33.92 billion yuan and projected 93.56 billion yuan by 2030, with the driven market at 478.53 billion yuan in 2024 and projected at 1,046.86 billion yuan for 2030. The 2025 report adopted a broader "digital human" definition, dropping the qualifier "virtual," which may partly account for the 2024 actual figure already approaching the prior year's 2025 projection. IDC, using a substantially narrower definition covering only AI-driven digital human platform and SaaS annual revenue, reported the 2024 market at approximately 4.12 billion yuan (41.2亿元), representing 85.3 percent year-over-year growth, and projected it to reach 25.05 billion yuan by 2029, a compound annual growth rate of 43.5 percent for the 2024 through 2029 period. IDC's June 2025 report identified Baidu as holding the top market share at 9.8 percent, followed by Huawei Cloud at 9.7 percent and Xiaoice in third position, with the 2D digital human sub-market alone reaching 2.89 billion yuan in 2024, up 101.2 percent year over year.
The gap between IDC's conservative 4.12 billion yuan and iiMedia's 33.92 billion yuan for the same year, a more than eightfold difference, underscores how dramatically scope definitions affect the narrative, and readers should bear this distinction in mind when evaluating any market size claim about this industry.
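The growth rate quoted for IDC's forecast can be checked with a few lines of Python. The sketch below is only a sanity check, not part of either firm's methodology. It works from ratios rather than absolute values, so the currency unit cancels out. The two constants are derived from the figures quoted above: IDC's 2029 projection is roughly 6.08 times its 2024 figure, and iiMedia's 2024 core market is roughly 8.23 times IDC's 2024 figure. The function and constant names are illustrative.

```python
def cagr(growth_multiple: float, years: int) -> float:
    """Compound annual growth rate implied by an end/start multiple over `years`."""
    return growth_multiple ** (1 / years) - 1


# Ratios derived from the figures quoted in the text; units cancel out.
IDC_2029_OVER_2024 = 6.08       # IDC 2029 projection / IDC 2024 market
IIMEDIA_OVER_IDC_2024 = 8.23    # iiMedia 2024 core market / IDC 2024 market

idc_cagr = cagr(IDC_2029_OVER_2024, years=5)
print(f"IDC implied CAGR, 2024-2029: {idc_cagr:.1%}")
print(f"iiMedia core market vs IDC market, 2024: {IIMEDIA_OVER_IDC_2024:.1f}x")
```

The implied rate agrees with the 43.5 percent that IDC reports. The second ratio shows that the two 2024 estimates differ by far more than any single year of growth in either series, so the difference comes from scope.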
The industry's structure follows a vertical chain spanning upstream technology providers, midstream platform and application companies, and downstream deployment domains. Upstream providers supply the foundational technologies of computer graphics, AI-driven natural language processing, speech synthesis, computer vision, and motion capture. Companies operating at this level include iFlytek, which offers a full-stack AI virtual human platform with more than sixty emotional voice libraries; Baidu, whose Xi Ling platform integrates its ERNIE large language model with dialogue capabilities; SenseTime, which contributed to the development of customer-service digital human classification standards; and Noitom, a leading Chinese motion capture hardware company. Rendering infrastructure relies substantially on foreign engines, particularly Epic Games' Unreal Engine and Unity Technologies' platform, though Chinese cloud rendering providers including Haima Cloud and Alibaba Cloud have developed real-time rendering services that reduce end-device hardware requirements.
Midstream companies operate end-to-end digital human platforms and produce the virtual characters that populate consumer and enterprise applications. Baidu's Xi Ling platform provides one-stop creation combined with large language model dialogue, while Tencent's Zhiying tool offers SaaS-based video creation. ByteDance's Volcengine division launched digital human products across interactive, broadcast, and livestream categories. JD Cloud's Yanxi platform provides e-commerce virtual livestreamers at entry-level pricing. Kuaishou developed its Nuwa platform, which at peak capacity supported over two thousand two hundred simultaneous digital human livestreams operating around the clock. In the virtual idol segment, Bilibili acquired the team behind Luo Tianyi, China's most prominent virtual singer, who was launched in 2012 and performed at the 2022 Beijing Winter Olympics opening ceremony. Ranmai Technology created AYAYI, described as China's first hyper-realistic meta-human when she debuted in May 2021, and she subsequently became a digital brand manager for Alibaba's Tmall platform. Chuangyi Technology created Liu Yexi, a virtual beauty influencer who attracted a massive following on Douyin.
Silicon Intelligence (硅基智能), founded in 2017 in Nanjing, has emerged as a particularly consequential player in the digital human agent segment. According to the CIC/Frost & Sullivan consultancy report accompanying the company's Hong Kong IPO filing, Silicon Intelligence held 32.2 percent of the Chinese digital human agent market in 2024 by revenue, ranking first domestically and second globally. The company completed eight rounds of funding totaling approximately 870 million yuan, with Tencent as its largest external investor at 16.59 percent ownership, alongside Sequoia China, China Merchants Bank International, and other institutional backers. Its revenue grew from 223 million yuan in 2022 to 655 million yuan in 2024, with the company achieving its first adjusted net profit of 5.29 million yuan in the first half of 2025. Silicon Intelligence filed for a Hong Kong IPO on October 31, 2025, positioning itself as the potential "first digital human stock" on the Hong Kong exchange. However, the company's financial profile reveals tensions characteristic of the industry: revenue is heavily concentrated among a small number of large clients, with the top customer, believed to be a major telecommunications operator, contributing 64.4 percent of revenue in the first half of 2025, and gross margins declined from 38.5 percent in 2022 to 31.6 percent in the first half of 2025 due to competitive pricing strategies aimed at securing large enterprise accounts.
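The financial tension described above can be made concrete with a short calculation. This is a sketch using only the figures quoted in the paragraph; the variable and function names are illustrative, and the two-year growth rate is a derived number rather than one stated in the filing.

```python
def annualized_growth(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two values `years` apart."""
    return (end / start) ** (1 / years) - 1


# Figures quoted in the text (revenue in millions of yuan).
revenue_2022 = 223.0
revenue_2024 = 655.0
gross_margin_2022 = 38.5        # percent
gross_margin_h1_2025 = 31.6     # percent
top_customer_share_h1_2025 = 0.644

revenue_cagr = annualized_growth(revenue_2022, revenue_2024, years=2)
margin_change_points = gross_margin_h1_2025 - gross_margin_2022
other_customers_share = 1 - top_customer_share_h1_2025

print(f"Revenue CAGR 2022-2024: {revenue_cagr:.1%}")
print(f"Gross margin change: {margin_change_points:+.1f} percentage points")
print(f"Revenue from all other customers, H1 2025: {other_customers_share:.1%}")
```

Taken together, the numbers show revenue growing at roughly seventy percent a year while gross margin falls by about seven percentage points, and all customers other than the largest contribute just over a third of first-half 2025 revenue.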
MoFa (Shanghai) Information Technology Co., Ltd. (魔珐(上海)信息科技有限公司), founded in 2017 in Shanghai, represents the 3D digital human infrastructure segment. The company completed its C round in April 2022 at approximately 110 million US dollars led by SoftBank Vision Fund 2, bringing combined B and C round funding to approximately 130 million US dollars, with participation from Sequoia China, 5Y Capital, and Northern Light Venture Capital. Its client base of more than two hundred enterprises includes Alibaba, Tencent, CCTV, L'Oreal, and ByteDance. Xiaoice, spun off from Microsoft in July 2020 under the chairmanship of Harry Shum, operates more than three hundred thousand digital employees and completed a funding round of approximately one billion yuan in November 2022.
Consumer behavior data reveals a market where awareness is high but spending remains modest. iiMedia's 2024 survey found that more than fifty percent of surveyed enterprises had used virtual human technology, with more than thirty percent planning to do so. The most commonly encountered type was virtual livestreamers, cited by 81.4 percent of respondents. In the virtual idol segment, 92.3 percent of fans were aged nineteen to thirty, and eighty percent of consumers spent less than one thousand yuan per month on virtual idol-related purchases. More than eighty percent of respondents indicated that virtual idol endorsement would increase their purchase intent, while 68 percent learned about virtual idols primarily through short video platforms. However, demand-side data from QuestMobile published in September 2025 painted a more cautious picture, reporting that virtual livestreamer gross merchandise value remained less than one-fifth that of real human livestreamers in 2023, that average viewing duration on Douyin dropped from fifteen minutes in 2022 to five minutes in 2023, and that virtual streamer fan attrition rates exceeded forty percent on the platform. This tension between supply-side growth metrics and demand-side engagement indicators suggests that the industry's commercial proof points remain concentrated among a small number of leading platforms while the broader ecosystem of smaller operators faces adoption challenges.
Enterprise deployment of digital humans has advanced furthest in the financial services and media sectors. The China Banking Association reported in November 2023 that eleven banks had deployed virtual digital humans in remote banking operations, with five additional institutions under construction. Named deployments include SPD Bank's "Xiao Pu," developed with Baidu's Xi Ling platform and launched in December 2019, which handles more than eighty percent of call volume; China Construction Bank's "Long Zhiwei" and "Long Zhiyuan," which integrate robot vision, natural language processing, and speech AI for multi-turn customer service dialogue; and ICBC's "Gong Xiaozheng" and "Gong Xiaocheng," deployed in July 2023 for wealth management and life services. In the telecommunications sector, all three major operators have deployed digital human solutions: China Unicom's "Xiao U" was created using Baidu's Xi Ling platform, China Mobile launched its "Lingxi" AI intelligent agent in October 2024 based on its proprietary "Jiutian" large model, and China Telecom has developed smart tourism platforms incorporating digital human technology.
The media sector represents the most mature application domain for digital humans in China. Xinhua News Agency deployed the world's first AI synthetic male anchor, "Xin Xiaohao," in November 2018 in collaboration with Sogou, followed by the first AI synthetic female anchor in February 2019 and the first 3D AI synthetic anchor in May 2020. CCTV developed "Xiao C," which became the first digital human to conduct a live interview with a National People's Congress delegate, accumulating more than two million social media followers and billions of views. The "AI Wang Guan" digital twin of CCTV financial commentator Wang Guan has generated more than 780 million total plays on the CCTV video platform, providing around-the-clock financial coverage. Multiple provincial television stations have deployed their own digital anchors, including Hunan Satellite TV, Zhejiang Satellite TV, and Beijing Radio and Television, the last of which developed "Shijian Xiaoni," a multi-functional digital human serving across news broadcasting, government Q&A, and the 12345 citizen service hotline.
E-commerce has become a high-profile testing ground for digital human commercial viability. JD.com's "Caixiao Dongge," an AI digital human modeled on founder Liu Qiangdong, debuted in April 2024 and attracted more than twenty million total views within approximately one hour, generating cumulative gross merchandise value exceeding fifty million yuan. During the 618 shopping festival in 2024, JD Cloud's Yanxi platform was deployed across more than five thousand brand livestreams with 380,000 hours of digital human broadcasting, more than 400 million cumulative views, and more than five million viewer interactions. By the 618 festival in 2025, JD reported that digital human livestream costs had fallen to one-tenth of real human equivalents while operating around the clock, with conversion rates up thirty percent and performance exceeding that of eighty percent of real human livestreamers. Baidu's "Huibao Xing" platform, launched in October 2023, accumulated more than one hundred thousand digital human livestreamers, reporting conversion rate improvements of thirty-one percent and launch cost reductions of eighty percent.
The technological transformation most fundamentally reshaping the industry is the integration of large language models into digital human systems, enabling the shift from scripted, broadcast-type digital humans to interactive entities capable of real-time bidirectional conversation. The standard technical architecture now follows a pipeline of speech recognition, large language model reasoning and response generation, text-to-speech synthesis, and talking head animation for lip synchronization. Retrieval augmented generation is commonly used to supplement large language models with enterprise-specific knowledge bases. Baidu's Xi Ling 4.0, released in September 2024, reduced the entry price for 3D ultra-realistic digital human creation to 199 yuan, a reduction from tens of thousands of yuan. Generative AI technologies have compressed digital human development cycles from approximately three months to two weeks and reduced production costs by an estimated seventy percent. Cloud rendering costs have fallen from approximately eighteen yuan per minute to approximately three yuan per minute. SaaS platform tools have reduced entry costs for small and medium businesses from approximately five hundred thousand yuan for custom projects to twenty thousand yuan per year for standardized products. Open-source frameworks have further accelerated accessibility, with Silicon Intelligence releasing its DUIX framework on GitHub and Ant Group's multimodal research lab open-sourcing EchoMimic, which requires only a single reference image and audio clip to generate animated digital human video.
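The pipeline described above (speech recognition, retrieval-augmented language model response, speech synthesis, and lip-sync animation) can be sketched in Python. This is a toy illustration under stated assumptions, not any vendor's architecture. Every stage is a stub standing in for a real model, the knowledge base is a hypothetical in-memory list, and retrieval uses simple word overlap instead of embeddings. All function names are invented for this example.

```python
import re
from dataclasses import dataclass

# Hypothetical enterprise knowledge base standing in for a document store.
KNOWLEDGE_BASE = [
    "Branch opening hours are 9:00 to 17:00 on weekdays.",
    "Lost cards can be frozen in the mobile app under Card Services.",
    "Wealth management products require a risk assessment first.",
]


@dataclass
class AvatarResponse:
    text: str       # reply text produced by the language model stage
    audio: bytes    # synthesized speech (stubbed as encoded text)
    visemes: list   # one placeholder mouth-shape cue per spoken word


def tokenize(text: str) -> set:
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def recognize_speech(audio: bytes) -> str:
    """Stub ASR stage: a real system would run a speech recognition model."""
    return audio.decode("utf-8")


def retrieve(query: str, docs: list, top_k: int = 1) -> list:
    """Toy retrieval: rank documents by the number of tokens shared with the query."""
    query_tokens = tokenize(query)
    ranked = sorted(docs, key=lambda d: len(query_tokens & tokenize(d)), reverse=True)
    return ranked[:top_k]


def generate_reply(question: str, context: list) -> str:
    """Stub LLM stage: a real system would prompt a model with question and context."""
    return f"According to our records: {context[0]}"


def synthesize_speech(text: str) -> bytes:
    """Stub TTS stage: returns encoded text in place of an audio waveform."""
    return text.encode("utf-8")


def animate_lips(text: str) -> list:
    """Stub talking-head stage: emits one placeholder viseme cue per word."""
    return [f"viseme:{word}" for word in text.split()]


def respond(audio: bytes) -> AvatarResponse:
    """Run one conversational turn through the four-stage pipeline."""
    question = recognize_speech(audio)
    context = retrieve(question, KNOWLEDGE_BASE)
    reply = generate_reply(question, context)
    return AvatarResponse(reply, synthesize_speech(reply), animate_lips(reply))


turn = respond(b"What are the branch opening hours?")
print(turn.text)
```

Because each stage is a separate function, any stub can be replaced with a real model without changing the rest of the turn. That modularity is part of why adding retrieval to an existing digital human product is comparatively cheap.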
5G infrastructure provides the low-latency connectivity essential for real-time digital human interactions, with a design target that lowers transmission latency from approximately fifty milliseconds under 4G to as little as one millisecond. Edge computing combined with 5G is projected to support large-scale concurrent digital human interactions for smart city applications, while cloud rendering offloads intensive GPU computation from end-user devices, making high-quality digital humans accessible on smartphones and smart screens. China's digital economy, which reached approximately 63.2 trillion yuan in 2024 according to iiMedia, provides the broader commercial substrate within which the digital human industry operates.
The industry's competitive landscape features a dual structure of internet platform giants and specialized vertical companies. Baidu, Tencent, Alibaba, and ByteDance leverage their AI large model capabilities, cloud infrastructure, and existing user bases to provide digital human solutions across government, financial, and e-commerce services. Simultaneously, specialized companies including Silicon Intelligence, MoFa (Shanghai) Information Technology Co., Ltd., Xiaoice, and others have established positions in specific segments such as digital human agents, 3D virtual character production, and interactive entertainment. IDC's market share data, showing the top player holding only 9.8 percent of the platform market, indicates that the industry remains fragmented, with no single provider yet achieving dominance across the full spectrum of applications.
The iiMedia 2025 Digital Human City Development Index ranked Shenzhen first at 92.92, followed by Beijing at 91.21, Guangzhou at 89.28, Shanghai at 86.36, and Hangzhou at 83.28, with Wuxi in sixth position, distinguished by its manufacturing base and scenario deployment capabilities within the Yangtze River Delta region.
The strengths of China's virtual digital human industry are well documented: rapid market expansion confirmed by multiple independent research firms, an extensive and deepening policy support infrastructure, a large digitally literate consumer base of approximately 1.28 billion social media user identities, and a commercial ecosystem spanning more than one thousand two hundred substantive companies with proven deployments across banking, telecommunications, media, e-commerce, tourism, and government services. The weaknesses are equally concrete. Interaction capabilities, while dramatically improved by large language model integration, remain below human-level emotional depth and spontaneity. High-end custom 3D digital humans with professional motion capture still cost upward of one hundred thousand yuan. Dependence on foreign rendering engines and the impact of US chip export controls on training compute availability represent structural vulnerabilities. Cross-disciplinary talent combining AI, computer graphics, natural language processing, and motion capture expertise remains scarce.
Opportunities lie primarily in enterprise digital transformation, where digital humans are replacing traditional customer service functions in banking and accelerating e-commerce livestream operations at a fraction of human labor costs. Cross-border cultural export represents an emerging vector, with companies such as Wondershare developing multilingual digital human templates targeting overseas markets in more than 120 languages. Threats include the rapidly evolving regulatory landscape, which has produced four major AI-specific regulations since 2022 and two significant draft regulations within a four-month period in late 2025 and early 2026, creating compliance uncertainty for operators. Competition from overseas technology providers, including Microsoft, NVIDIA, Meta, Synthesia, and Soul Machines, adds pressure, with North America holding approximately 43 percent of the global market according to Mordor Intelligence. The proliferation of registered companies in the space, with tens of thousands of entities on enterprise registries but many lacking differentiated technology, raises concerns about market saturation and commoditization at the lower end of the value chain.
China's virtual digital human industry stands at an inflection point defined by the April 2026 CAC draft regulation, China's first dedicated national regulation for digital virtual humans and possibly the first framework of its kind anywhere. The regulation signals that Beijing views digital virtual humans as sufficiently consequential to warrant sector-specific governance rather than reliance on broader AI rules. Whether the final regulatory framework strikes a balance that protects consumers and safeguards personal data while preserving space for commercial innovation will substantially determine the industry's trajectory. The evidence suggests an industry where supply-side capabilities and government ambition have outpaced demand-side adoption in many consumer-facing applications, where commercial value is concentrated among a small number of leading platforms and enterprise deployments, and where the AIGC-driven collapse in production costs is simultaneously enabling broad accessibility and threatening differentiation. The companies best positioned for the next phase are those with proprietary large language model integration, established enterprise relationships in regulated industries, or unique technical capabilities in 3D production and multimodal interaction. The industry's maturation will ultimately be measured not by aggregate market size figures that differ several-fold depending on scope definition, but by whether interactive AI digital humans can demonstrate sustained consumer engagement, measurable enterprise value, and responsible deployment within the governance framework that China is now constructing.
[Apr 2026]