China's digital human market has been shaped most decisively not by the telecommunications operators that deploy avatars within their service networks, nor by the content platforms that host and monetize them, but by four commercial technology firms whose platforms, models, and cloud infrastructure supply the underlying capability for most digital human deployments across the Chinese economy. Baidu, Alibaba, Tencent, and ByteDance occupy this commercial core through markedly different strategic orientations, yet none has achieved dominance in the conventional sense. IDC's June 2025 market share data places Baidu first at 9.8 percent, followed by Huawei Cloud at 9.7 percent, with Tencent Cloud and the rest of the field distributed below — a leading position that nevertheless leaves more than ninety percent of the AI digital human platform market in other hands. The four giants together hold the largest combined commercial position in the sector, but the market beneath them remains structurally fragmented, populated by specialist firms, state-owned enterprises building proprietary capability, and a long tail of smaller providers. The chapter that follows treats these four companies in proportion to the scale and distinctiveness of their digital human work, anchored by the platforms and named events that demonstrate their respective approaches.
Baidu has built the most fully realized digital human platform in China and earned the corresponding market position. The Xi Ling platform was unveiled by CTO Wang Haifeng on December 27, 2021 at the Baidu Create AI Developer Conference, positioned as a one-stop production, content creation, and business configuration service spanning media, finance, retail, and entertainment. Its commercial origins, however, predate the platform's formal launch. In April 2019, Baidu announced its collaboration with Shanghai Pudong Development Bank at a "Hello Future" event in Shanghai; on July 3 of that year, Robin Li and SPD Bank Vice President Pan Weidong unveiled the bank's digital employee Xiao Pu at the Baidu AI Developer Conference; and on December 13, 2019, Pan personally awarded the digital employee badge that designated Xiao Pu as the banking industry's first digital staff member. The September 2024 release of Xi Ling lowered the entry price for a 3D hyper-realistic digital human from approximately 10,000 yuan to 199 yuan, described at launch as the lowest in the industry, and brought the platform's preset library to more than 800 avatars and 150 voice options with MCP-compliant APIs for video generation, speech synthesis, and voice cloning. At WAIC 2025, Baidu introduced NOVA, a next-generation digital human platform under the Huibo Xing umbrella with public access slated for October 2025. The roster of named deployments built on Xi Ling includes a digital Kang Hui created in collaboration with CCTV; the historical figure Diaochan, unveiled at the Baidu Cloud Intelligence Conference in early September 2023 and locally launched at the Sixth Majiayao Culture Festival in Lintao County, Gansu; the digital reporter Jing Jing at China Intellectual Property News, produced via Baidu's Qingduo technology; and the rural commerce avatar Pan Fang in Bijie, Guizhou, deployed to sell deer antler products.
A dedicated Baidu digital human industry base in Yancheng, Jiangsu, anchors the Yangtze River Delta Digital Audiovisual Industry Base, and a separate Macau e-commerce solution launched in June 2025 offered free Huibo Xing AI digital human livestreaming services to ten thousand merchants timed to the Greater Bay Area Shopping Festival.
The platform Baidu now markets most aggressively to e-commerce merchants is Huibo Xing, which by late 2025 had created over 320,000 avatars and more than 120,000 voice profiles. Its capability stack includes AI script generation drawing on Wenxin and DeepSeek models, voice cloning from a fifteen-minute recording, avatar cloning from a five-minute video at ninety-eight percent lip-sync accuracy, twenty-four-hour autonomous livestreaming with a ninety-five percent AI question-and-answer usability rate, and libraries of more than 7,800 public avatars, 3,600 livestream templates, and 3,200 voice options. IDC ranked Huibo Xing first in its e-commerce livestream digital human category. During the 2025 Double Eleven shopping festival, eighty-three percent of streaming merchants on the platform used Huibo Xing, and gross merchandise value rose ninety-one percent year over year. The most visible single demonstration arrived on June 15, 2025, when a digital twin of the entrepreneur Luo Yonghao livestreamed for more than six hours on Baidu's Youxuan e-commerce platform, generating over fifty-five million yuan in gross merchandise value before more than thirteen million viewers; the AI called its knowledge base thirteen thousand times, generated approximately ninety-seven thousand words of commentary, and coordinated more than 8,300 avatar movements. A second digital avatar of co-host Zhu Xiaomu appeared simultaneously, an industry first for dual-digital-human interactive livestreaming. The system ran on Baidu's Wenxin 4.5T model and the Xi Ling platform, with face-modeling support from Pinscreen China. Chinese media characterized the broadcast as the livestream industry's DeepSeek moment. Two days later, on June 17, Baidu announced the Mengdie Plan for top-streamer digital humans and the Fanxing Plan, pledging one hundred thousand free Huibo Xing licenses backed by a one hundred million yuan subsidy. 
The dialogue layer that distinguishes Baidu's digital humans from pipeline-only competitors is its ERNIE large language model family, which evolved from ERNIE 3.0 Titan's 260 billion parameters in 2021 through ERNIE 4.0 in October 2023 to ERNIE 5.0, unveiled at Baidu World 2025 on November 13, 2025 as a natively omni-modal mixture-of-experts model with a 2.4-trillion-parameter architecture jointly processing text, images, audio, and video. ERNIE Bot was made fully free for all users on April 1, 2025 under DeepSeek pressure, and by December 2025 the ERNIE Assistant had reached roughly two hundred million monthly active users, with the Qianfan model-as-a-service platform generating more than thirty billion yuan in AI cloud revenue in 2025.
ByteDance occupies a different position, less anchored to a single named platform than driven by aggressive iteration on the underlying models and infrastructure that digital humans require. The company's enterprise cloud division, Volcengine, offers a comprehensive Volcengine Virtual Digital Human product line spanning two-dimensional real-person avatars, three-dimensional cartoon styles, and three-dimensional hyper-realistic figures, supporting both AI-driven and motion-capture-driven operating modes with broadcasting, interactive, and perceptive subtypes. The Volcengine system supports voice cloning from ten minutes of reference audio and digital clone creation from approximately three minutes of source video with roughly one-hour processing, and a single T4 GPU supports ten concurrent 1080P streams at twenty-five frames per second with both public-cloud API and private deployment options. Named enterprise deployments include the digital employee Xiao Zhi at Xingye Securities, with e-commerce livestream digital humans flowing through the partner Zhongke Shenzhi. On May 15, 2024, Volcengine cut prices on nine enterprise Doubao models to figures roughly 99.3 percent below the prevailing industry average, an action that triggered the Chinese LLM price war whose effects rippled through digital human pricing across the sector. ByteDance's most consequential single technical contribution to the digital human market is OmniHuman-1, released in February 2025, which generates lifelike human videos from a single image plus motion inputs and supports portrait, half-body, and full-body proportions; OmniHuman 1.5 followed shortly afterward, and the open-source LatentSync 1.5 lip-sync framework was released in March 2025. On the speech side, ByteDance published its Seed-TTS research on June 4, 2024 with claims of speech virtually indistinguishable from human recordings, and upgraded the system to Doubao-Seed-TTS 2.0 in October 2025 with enhanced emotional expressiveness. 
The Doubao consumer assistant, originally branded Skylark when launched through Volcengine on August 6, 2023, evolved through Doubao-1.5-pro in January 2025 to Doubao-Seed-2.0 on February 14, 2026, with Pro, Lite, and Mini variants. Daily active users surpassed one hundred million by December 2024; by March 2025 the service was processing 12.7 trillion tokens daily; and by February 2026 QuestMobile recorded 227 million users in the native AI generation category, placing Doubao ahead of its nearest competitor by roughly one hundred million users. Seed 2.0 ranked sixth on the LMSYS Chatbot Arena and third on Vision Arena at release. At the consumer surface layer, ByteDance offers the Douyin AI Avatar tool in beta to creators with at least 500,000 followers, the AI camera application Xinghui produced by the company's Flow department, and the Jichuang creative production platform launched around October 2023 on Ocean Engine, which provides AI script generation, digital human creation, one-click video production, and integrated e-commerce functions. The continuity from foundational research through enterprise cloud services to consumer creative tools reflects an integrated strategy in which digital humans are one application of a broader investment in multimodal generative AI.
ByteDance's most consequential exit from a single line of business in the digital human space was the April 19, 2024 sale of the virtual idol group A-SOUL to Yuehua Entertainment for thirty million yuan. The group, debuted by ByteDance's gaming subsidiary NUVERSE and Yuehua in November 2020, had generated monthly revenue of 3.15 million yuan by November 2021 and reached substantial cultural prominence before the May 2022 announcement that member Carol would enter livestream hibernation, an event linked in leaked reporting to mistreatment claims and a compensation structure in which a performer received approximately 0.60 yuan from each 138 yuan monthly captain subscription after Bilibili's fifty percent platform cut and the company's revenue allocations. Operations transferred to Yuehua's Nice Future subsidiary; the group now operates as a three-member ensemble under YH Entertainment and performed on the CCTV 2025 Online Spring Festival Gala on January 22, 2025. The sale signaled ByteDance's narrowing of its digital human investments toward platform infrastructure and away from owned virtual idol IP, a decision that contrasts with Tencent's continued cultivation of Xing Tong and other branded virtual characters.
Alibaba's contribution to the digital human market is distributed across foundational model releases, cloud rendering infrastructure, deployments through Ant Group, and a small number of culturally significant set-piece events. The most important sustained technical contribution is the Tongyi Qianwen family, launched in April 2023 in invitation-only form, opened to the public in September 2023 after CAC approval, and renamed simply "Tongyi" in May 2024, with Qwen as the international model brand. The release cadence has been aggressive: Qwen-7B was released with open weights in August 2023, followed by Qwen2 in June 2024, Qwen2.5 in September 2024 across sizes from 0.5 billion to 72 billion parameters, Qwen3 on April 28, 2025 with mixture-of-experts variants up to 235 billion total and 22 billion active parameters trained on 36 trillion tokens across 119 languages under an Apache 2.0 license, and Qwen3.5 and Qwen3.5-Plus in February 2026. By January 2026, the Qwen family had accumulated more than seven hundred million downloads on Hugging Face, having surpassed Meta's Llama in October 2025 to become the world's most downloaded model family, with more than 180,000 community-built derivatives. Alibaba has committed approximately fifty-three billion dollars over three years to cloud and AI infrastructure, and its AI products grew at triple-digit rates for eight consecutive quarters through the first half of 2025. On the speech side, CosyVoice 2.0, released in December 2024, achieved 150-millisecond streaming latency and mean opinion scores of 5.53, described as comparable to professional studio recordings; Alibaba released the system fully open-source under the Apache 2.0 license, accelerating its adoption across the broader ecosystem of digital human builders.
DAMO Academy contributed EMO, launched on the Tongyi App on April 25, 2024 to generate singing or talking videos from a single photo plus audio with no 3D modeling required, alongside DreamTalk, OmniTalker for real-time talking heads, MotionShop for character animation, and the open-sourced 3D digital human tooling now embedded across the Tongyi research stack.
The most visible single deployment of Alibaba's digital human capability came not through Alibaba Group itself but through its affiliate Ant Group, which built the Digital Torch Bearer Nongchaoer for the Hangzhou Asian Games. Through an Alipay mini-program, more than one hundred million participants from over 130 countries created three-dimensional digital avatars of themselves, and at the opening ceremony on September 23, 2023 a giant digital human composed of these avatars co-lit the main torch alongside swimmer Wang Shun on a 185-by-20-meter screen, marking the first digital torch-lighting in Asian Games or Olympic history. The figure reappeared at the closing ceremony to extinguish the flame. The system was built on the Web3D engine Galacean and incorporated AI facial recognition, motion capture, and blockchain components. Alibaba Cloud separately deployed the digital sign language interpreter Xiaomo, accessible via the Alipay mini-app, primarily for the Asian Para Games in late October of that year, a project developed over roughly two years using a dataset of twenty-five thousand signs. Tmall's commercial deployments of digital humans include Dongdong, the cloud-technology-driven ambassador launched in February 2022 for the Beijing Winter Olympics, and the hyper-realistic meta-human AYAYI, debuted on Xiaohongshu by Ranmai Technology on May 20, 2021 in partnership with Aww Inc. AYAYI became Tmall's super brand digital manager and Alibaba's first virtual employee in September 2021 and accumulated brand collaborations with Louis Vuitton, Guerlain, Tiffany, Burberry, Prada, Kiehl's, Porsche, and others before her cultural prominence faded in late 2022, as production costs of approximately 8,000 to 15,000 yuan per second of video constrained scalability.
Alibaba's e-commerce platforms have also formalized digital human governance: the Alibaba International Station digital human livestream registration system launched on January 17, 2024 requires Chinese Gold Suppliers to use approved digital human service providers and display "Digital Human Live" labels, while Taobao's "high-quality digital human livestream room" standard had qualified more than one hundred merchants by late 2025 according to Taobao University materials. Ant Group's broader digital human work includes the Lingjing Digital Human Platform introduced in April 2024 and the AQ health application soft-launched on June 16, 2025, which deploys AI doctor avatars of approximately two hundred renowned specialists from Grade-A tertiary hospitals and was serving more than 110,000 users daily at peak.
Tencent's role in the digital human market is the most distributed of the four and reflects the company's broader strategic preference for ecosystem partnerships and cultural-institutional engagement rather than dominant platform ownership. Tencent Zhiying, fully launched as an AI creation assistant on March 30, 2023, offers digital human broadcasting from text or audio across more than forty official avatar templates in 2D and 3D forms, with photo-customized avatar packages priced at 3,999 yuan annually, video-customized avatars at 7,999 yuan annually, and a professional membership tier at 698 yuan annually. The Tencent Cloud AI Digital Human service spans broadcasting and interactive sub-platforms across 3D realistic, 3D semi-realistic, 3D cartoon, 2D real person, and 2D cartoon avatar styles; small-sample production introduced on April 25, 2023 generates a highly realistic digital human from three minutes of video and one hundred sentences of speech in twenty-four hours, starting at approximately one thousand yuan. Named enterprise customers include Ping An Puhui, Shenzhen Metro, the National Museum of China, Singapore Changi Airport, FAW-Volkswagen, and CITIC Jiantou Securities, and Tencent Cloud co-released the Cultural-Tourism Industry Large-Model Standard with CAICT in September 2024. The Xiaowei Digital Humans system serves as Tencent Cloud's branded character platform, with the specialized derivative Yu Ling functioning as a virtual sign-language interpreter with a vocabulary exceeding 1.6 million signs at greater than ninety percent accuracy. Tencent Yuanqi, the zero-code AI agent platform built on Hunyuan, entered internal testing in mid-2024 and distributes digital personas across WeChat Official Accounts, Mini Programs, QQ, WeChat Customer Service, and Tencent Cloud. 
AvaMo, an AI avatar video generation service launched on May 13, 2025, addresses the Japanese market with TTS optimized for Japanese voices, reducing video production costs by up to ninety-eight percent.
The company's most distinctive footprint, however, lies in cultural heritage and museum partnerships, anchored by the SSV Digital Culture Lab. Tencent developed Jiayao, the first official Digital Dunhuang Cultural Ambassador, in June 2022 through the Tencent IEG Cultural Heritage Digital Creative Technology Joint Laboratory, with game-engine-based physically based rendering and global dynamic lighting; the figure was inspired by the mythical Kalavinka bird from Mogao Cave 360 and performs Dunhuang dances at events including the Dunhuang Mid-Autumn Night and the China Network Audiovisual Annual Ceremony. Ai Wenwen and Tong Gujin launched at the National Museum of China in July 2022 on the museum's 110th anniversary, developed by the SSV Digital Culture Lab with Tencent Cloud Xiaowei for multimodal interaction, Fantuo Digital for motion and expression capture, Shandong University of Art and Design for image design, and Beijing University of Posts and Telecommunications for the cloud exhibition platform; by 2024 Ai Wenwen had been through three upgrade phases and drew on a knowledge base of more than 1.4 million artifacts. Tencent's Hunyuan large language model, unveiled at the Global Digital Ecosystem Conference in September 2023, evolved through Hunyuan Turbo with heterogeneous mixture-of-experts in September 2024, the open-sourced Hunyuan-Large with 389 billion total and 52 billion active parameters in November 2024, Hunyuan T1 deep-reasoning in March 2025, Hunyuan 2.0 in December 2025 with 406 billion total and 32 billion active parameters and a 256K-token context window, and Hunyuan 3.0 in internal testing as of the March 2026 earnings call. 
The Hunyuan3D series across versions 1.0 through 3.0 generates three-dimensional models from text, images, or sketches with physically based rendering textures and smart topology optimization, accumulating more than three million community downloads on Hugging Face and integrations with more than 150 enterprises in mainland China — material directly relevant to digital human character modeling. The Yuanbao consumer assistant launched on app stores on May 30, 2024, was embedded directly as a contact within WeChat in April 2025, and by mid-2025 was handling more than ninety percent of question-based queries within WeChat search, reaching roughly 41.6 million monthly active users in the second quarter of that year.
Tencent's investment in Silicon Intelligence represents a substantial indirect position in the digital human market: a 16.59 percent stake makes Tencent the company's largest external investor alongside Sequoia China and China Merchants Bank International. Silicon Intelligence held 32.2 percent of China's digital human agent market by revenue in 2024, with revenue growing from 223 million yuan in 2022 to 655 million yuan in 2024, and filed for a Hong Kong Stock Exchange listing on October 31, 2025. Tencent has also continued to cultivate proprietary virtual idol IP. Xing Tong, who originated as an NPC in QQ Dance in 2017, was developed into a standalone IP in May 2018 and launched as a VTuber project on February 1, 2021, accumulating brand partnerships with Levi's, Li-Ning, and Make Up For Ever, performances on CCTV and Jiangsu Satellite TV, and a slot on the 2026 Shenzhen Satellite TV Spring Festival Gala. The Unlimited Kings Team, a boy band derived from Honor of Kings and active since 2019, has appeared on the cover of GQ and partnered with Givenchy. Tencent Music Entertainment launched TMELAND, China's first virtual music festival, on December 31, 2021 in partnership with XVERSE, drawing 1.1 million participants across QQ Music, Kugou Music, QQ, WeChat, and Tencent Video, and supporting up to 100,000 simultaneous virtual avatars across a virtual space exceeding 130,000 square meters. NExT Studios within Tencent's IEG umbrella has driven foundational digital human research since the August 2018 Shekou TGDC presentation by Xie Weibo on Project Siren, the photorealistic real-time digital human modeled on actress Bingjie Jiang and demonstrated at GDC 2018 with Epic Games, Cubic Motion, 3Lateral, and Vicon. NExT's pipelines include xFaceBuilder for film-grade face production, xMoCap for AAA-grade motion capture, and the QuickSilverX Engine for character rendering.
A notable internal asymmetry shapes Tencent's commercial footprint. In June 2024, WeChat Video Channels banned digital human livestreaming outright, classifying both AI-driven and human-driven virtual avatars as non-authentic livestream content under revised operational rules and becoming the first major Chinese platform to do so explicitly. The decision excluded digital humans from WeChat Video Channels tipping, mini-program e-commerce, and WeChat Pay integration, separating the company's enterprise digital human services from its dominant consumer surface even as Tencent continued to draft the Digital Human Financial Application Construction Guide alongside ICBC, China Construction Bank, Ping An Bank, and WeBank.
The four companies converge on the same broad market but approach it from distinct strategic positions. Baidu has built the most dedicated and commercially mature digital human platform, anchored by Xi Ling and Huibo Xing and validated by independent market share measurements that placed it first in 2024. ByteDance has pursued a technology-led strategy in which advances in foundation models and generative video research feed an enterprise cloud product line, consumer creative tools, and a price-war stance that has shaped the economics of the entire sector. Alibaba has contributed essential underlying infrastructure, particularly in speech synthesis and open-weight model distribution, and has lent its scale to set-piece cultural deployments through Ant Group while letting more conventional commercial uses develop within the Tmall and Taobao ecosystems. Tencent has prioritized cultural-institutional partnerships, standards-setting, and indirect equity positions alongside its enterprise cloud services, with the WeChat platform's restrictive stance on digital human livestreaming creating a notable internal tension between the company's vendor offerings and its dominant consumer surface. 
The April 2026 CAC draft Administrative Measures for Digital Virtual Human Information Services touches each of these positions differently: the prohibition on digital humans circumventing facial or voice recognition systems implicates authentication and KYC features inside Alipay and Tencent's financial products and Baidu's Apollo autonomous driving platform; the protection of motion capture performers' rights speaks directly to A-SOUL's history, Xing Tong's operations, and AYAYI's production model; the ban on virtual intimate relationships and addictive services for minors constrains the consumer-companion segment in which all four have nascent products; and the explicit call to "establish and improve the digital virtual human technology standards system" implicitly legitimizes the GY/T 411-2024 broadcasting standard and the broader regulatory ecosystem in which Tencent has played a leading drafting role. The combined commercial weight of these four firms is substantial, yet the IDC measurement showing the market leader at 9.8 percent makes clear that even the concentrated top of China's digital human industry remains fragmented. The structure suggests a market in which the dominant commercial vendors set the technological pace and supply the foundational tooling on which most other deployments depend, while specialist firms, telecommunications operators, content platforms, and a long tail of smaller providers continue to compete for the remaining ninety percent of revenue distributed below the top.
[May 2026]