Supporters of Marcus Endicott’s Patreon can access weekly or monthly video consultations on this topic.

ByteDance

ByteDance (字节跳动) has been reported as building an end-to-end digital human stack that spans model research, product integration, and enterprise deployment across Douyin, CapCut, Jimeng AI, and Volcengine. The reported technical scope covers single-image-plus-audio human video generation, long-sequence audio-driven human video, lip-sync at phoneme level, micro-expression control, and 3D head and avatar asset creation, with named projects including OmniHuman (and OmniHuman-1.5), LatentSync (1.5 and 1.6), InfinityHuman, MimicTalk, and Goku (悟空). Reported releases and updates throughout 2025 describe OmniHuman-1 as generating digital humans from a single photo and audio clip, OmniHuman-1.5 as advancing realism and emotional expression and being surfaced through Seedance on CapCut Web, and LatentSync as an end-to-end lip-sync framework that was jointly open-sourced with Beijing Jiaotong University and later updated to version 1.6. Academic and lab linkages were also reported, including collaboration with HKU on Goku and joint work with Zhejiang University on InfinityHuman and MimicTalk, alongside a described internal “Intelligent Creation Digital Human team” and “Intelligent Creation Lab” supporting multiple ByteDance business lines and commercialization needs, including scenarios framed as scalable creation of large numbers of avatars.

https://bytedance.com

Volcengine (火山引擎) has been reported as the enterprise-facing foundation layer where ByteDance packages digital human creation, animation, and deployment capabilities for external customers. Reports frame the offering as covering customizable avatar design, voice cloning, low-latency animation and interaction for livestreaming and video, and integration with large language models for enterprise applications such as virtual broadcasting, customer service, education, and marketing video production. A prominent 2025 thread reported Volcengine closed-testing a next-generation digital human platform called “Chimera,” described as built by ByteDance’s Intelligent Creation Digital Human team and backed by Volcengine large-model and audio-video capabilities, with stated functions including digital human generation, outfit swapping, and video translation and a potential pricing model by API calls or per-video output. Separate reporting also tied Volcengine to culturally themed deployments, including an intangible cultural heritage digital human “Feifei” (非非) said to rely on the Doubao large model, and a university case in which Shenzhen Technology University’s digital human “Run Xiaozhi” (润晓知) was reported as jointly built with Volcengine.

https://www.volcengine.com

Doubao (豆包) is a large language model-based AI assistant developed by ByteDance and launched in August 2023, positioned as ByteDance's primary consumer-facing generative AI product in China. Built on ByteDance's proprietary Skylark model family, Doubao offers a broad suite of capabilities including conversational AI, text generation, image generation, AI-generated video, and a digital human video creation feature marketed under the Creator Star product line. It is available as a web application and mobile app and has grown rapidly to become one of the most widely used AI assistant applications in China, competing directly with Baidu's Ernie Bot and Alibaba's Tongyi Qianwen. The underlying model infrastructure is also made available to developers and enterprises through the Doubao API on the Volcengine cloud platform, which enables commercial applications including AIGC content pipelines, digital human livestreaming systems, and custom large model deployments. ByteDance has positioned Doubao aggressively on price, cutting API costs substantially in 2024 to accelerate enterprise and developer adoption across industries including e-commerce, media production, and education.

https://www.doubao.com

BytePlus (字节跳动云服务) has been reported as the international cloud and solutions layer that commercializes similar building blocks as packaged services, especially for media and commerce workflows. Coverage characterizes its portfolio as including real-time voice synthesis, virtual digital humans, and livestreaming tooling, positioned to support use cases such as video dubbing, virtual presenters, intelligent assistants, and interactive shopping guides in e-commerce, customer service, education, and entertainment. Expansion activity was reported through partnerships and deployments across Asian markets, including mentions of Thailand and Hong Kong, with digital human capability presented as part of a broader multimodal AI stack intended for production use.

https://www.byteplus.com

Douyin (抖音) has been reported as both a high-volume distribution surface for digital-human content and a platform that increasingly enforces operational constraints to prevent deception, impersonation, and “unmanned” commerce. Reports about 2025 governance describe requirements that virtual-human personas be registered in advance, that virtual-human livestreams carry prominent labeling, and that operators complete real-name registration and verification, with interaction required to be real-person-driven and in real time rather than purely AI-driven. Platform policy was also reported as explicitly prohibiting unmanned livestreaming, and enforcement was reported at scale: more than 170,000 noncompliant live rooms (digital-human, recorded, or looped) handled over a period given variously as since 2024 or as the prior year, more than 30,000 accounts permanently banned, 44,000 related videos reviewed and handled in a single case, and speaking restrictions applied to 298 violating accounts. Commentary attributed to Luo Yongxiang described a specific risk pattern in which digital humans run large numbers of accounts to impersonate real-person livestreams, and was paired with claims of improved AI-host recognition. Merchant-side reports, meanwhile, described bans shortly after going live and broader doubts about the reliability of AI digital-human commerce operations.
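
The reported governance requirements can be read as a simple checklist. The sketch below is illustrative only: the class, field names, and messages are hypothetical and do not correspond to any Douyin API or official rule text. It encodes the four reported conditions (advance persona registration, prominent labeling, operator real-name verification, and real-person-driven real-time interaction) and lists which ones a given livestream would fail.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VirtualHumanLivestream:
    # Hypothetical fields, one per requirement described in the reporting.
    persona_registered_in_advance: bool
    prominently_labeled: bool
    operator_real_name_verified: bool
    real_person_driven_in_real_time: bool


def compliance_issues(stream: VirtualHumanLivestream) -> List[str]:
    """Return the reported requirements this stream fails, in a fixed order."""
    checks = [
        (stream.persona_registered_in_advance, "persona not registered in advance"),
        (stream.prominently_labeled, "livestream not prominently labeled"),
        (stream.operator_real_name_verified, "operator not real-name verified"),
        (stream.real_person_driven_in_real_time,
         "interaction not real-person-driven in real time"),
    ]
    return [issue for passed, issue in checks if not passed]


# An "unmanned" room: registered, labeled, and verified, but purely AI-driven.
unmanned = VirtualHumanLivestream(True, True, True, False)
print(compliance_issues(unmanned))
```

Under this reading, an unmanned room fails the last check even when registration, labeling, and verification are all in place, which matches the reported prohibition on unmanned livestreaming.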

https://www.douyin.com

Jichuang (即创) has been reported as an OceanEngine (oceanengine.com) marketing-technology product that turns ByteDance’s digital human and AIGC capabilities into standardized enterprise workflows, especially for e-commerce and promotional content. Reporting frames it as a one-stop intelligent creation and management platform where users can generate scripts, select digital human models, apply voice synthesis, and output short marketing videos, and it is presented as accessible to brand operators and marketing teams without specialist 3D or animation skills. In 2025 it was reported that Douyin launched “Jichuang” as a one-stop AI intelligent creation platform and that a “Jichuang digital human,” reported as based on the underlying model of Julang Engine (巨量引擎, OceanEngine’s Chinese name), could capture Douyin trending topics and produce adapted materials in 3–5 minutes using functions described as AI inspiration, AI scripts, and one-click auto editing. Operational guidance reported around the same period emphasized the compliance-gated nature of live deployment on Douyin. It also described common configurations in local-life live rooms, such as pop-up frequency, single-item and loop pop-ups, intelligent replies, and reply intervals, as well as a go-live method that uses WeCam to feed a digital-human window as camera input and was framed as improving anti-ban capability. These accounts appeared alongside reports and claims about impersonation risks and fast cloning from short audio plus a video segment.

https://aic.oceanengine.com

Jimeng AI (即梦AI), known internationally as Dreamina, is ByteDance's generative AI creation platform, developed by the CapCut/Jianying (剪映) team and operated by Shenzhen Lianmeng Technology Co., Ltd. (深圳市脸萌科技有限公司). The service entered closed beta in March 2024 under the original name Jianying Dreamina and has since grown into a multimodal creation suite covering text-to-image, image-to-video, voice generation, lip-sync, action imitation, and a dedicated digital human module, accessible through a consumer web and mobile app and through an enterprise API distributed via ByteDance's Volcengine (火山引擎) cloud platform. The digital human stack is organized around two principal modes. The Quick Mode (快速模式) takes a single portrait image plus an audio clip and returns a synchronized talking-head video, while the Master Mode (大师模式), launched on March 7, 2025, is driven by ByteDance's proprietary OmniHuman-1 model and accepts portrait, half-body, or full-body inputs, including non-photorealistic source images such as anime characters and 3D cartoons, producing lifelike video in which the figure speaks, sings, plays instruments, and moves naturally in accordance with an input audio track. A complementary Action Imitation (动作模仿) feature, also released in March 2025, transfers motion captured in a reference video onto a still image of a person. The underlying lip-sync engine was later upgraded to a 1.5 release, and a successor model, OmniHuman 1.5, further expanded controllability over expression, gesture, and whole-body kinematics. ByteDance positions the digital human stack as a low-cost replacement for traditional video shoots in scenarios including character design, presentation, online education, livestream commerce, and short-form social video, and offers a template gallery with more than one hundred avatar styles, including a TikTok-oriented variant aimed at cross-border marketing.

In line with the Cyberspace Administration of China's regulatory framework for synthetic media, the platform applies a content review mechanism and labels all digital human outputs with an AI-generated watermark. By the first quarter of 2026, Jimeng AI had reached approximately 13.52 million monthly active users and 5.59 million downloads, placing it among the most widely used domestic Chinese platforms for generative digital human video, and in December 2025 it launched a micro-short drama initiative, Dimensional Folding (次元折叠), built around its digital human and video generation pipeline.
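
The difference between the two digital human modes described for Jimeng AI can be sketched as a lookup from input framing to the modes that accept it. This is an illustrative reading of the description only, not the Jimeng or Volcengine API: the enum, the mode keys, and the assumption that Quick Mode accepts nothing beyond portrait framing are all hypothetical.

```python
from enum import Enum
from typing import List


class Framing(Enum):
    PORTRAIT = "portrait"
    HALF_BODY = "half_body"
    FULL_BODY = "full_body"


# Assumed mapping based on the description: Quick Mode takes a single
# portrait image, Master Mode takes portrait, half-body, or full-body inputs.
ACCEPTED_FRAMINGS = {
    "quick": {Framing.PORTRAIT},
    "master": {Framing.PORTRAIT, Framing.HALF_BODY, Framing.FULL_BODY},
}


def modes_for(framing: Framing) -> List[str]:
    """List the modes whose described inputs include this framing."""
    return [mode for mode, accepted in ACCEPTED_FRAMINGS.items() if framing in accepted]


print(modes_for(Framing.PORTRAIT))
print(modes_for(Framing.FULL_BODY))
```

Both modes also require an audio clip as the driving signal, so the sketch distinguishes them only by the framing of the source image.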

https://jimeng.jianying.com

TikTok functions as a major global distribution and experimentation environment for virtual beings, including virtual humans, avatars, AI-generated characters, and synthetic personalities, particularly within short-form video culture where algorithmic discovery enables rapid audience scaling. On TikTok, virtual beings are deployed across entertainment, marketing, and cultural contexts through fully virtual influencers, hybrid human–AI personas, stylized avatars, and AI-assisted character performances that rely on face-tracking, voice synthesis, motion capture, and generative visual tools. In the China context, the broader ecosystem around TikTok, including its mainland counterpart Douyin, has been instrumental in normalizing virtual idols, branded digital spokescharacters, and narrative-driven synthetic personas, allowing creators, studios, and companies to test engagement, monetization, and parasocial interaction models at scale. The platform’s emphasis on trends, remixing, and participatory formats suits virtual beings particularly well, because synthetic characters can be rapidly iterated, localized, and embedded into challenges, livestream commerce, and promotional campaigns. This reinforces TikTok’s role as a key infrastructure layer for the visibility, social acceptance, and commercial viability of virtual beings.

https://www.tiktok.com/

(TikTok is owned by ByteDance, a Chinese technology company founded in Beijing in 2012; however, TikTok operates as an international platform with headquarters outside mainland China and is legally separate from ByteDance’s China-only platform Douyin, which serves the domestic Chinese market.)
