ByteDance (字节跳动) has been reported as building an end-to-end digital human stack that spans model research, product integration, and enterprise deployment across Douyin, CapCut, Jimeng AI, and Volcengine. The reported technical scope covers human video generation from a single image plus audio, long-sequence audio-driven human video, phoneme-level lip-sync, micro-expression control, and 3D head and avatar asset creation. Named projects include OmniHuman (and OmniHuman-1.5), LatentSync (1.5 and 1.6), InfinityHuman, MimicTalk, and Goku (悟空). Reported releases and updates throughout 2025 describe OmniHuman-1 as generating digital humans from a single photo and an audio clip, and OmniHuman-1.5 as advancing realism and emotional expression and being surfaced through Seedance on CapCut Web. LatentSync is described as an end-to-end lip-sync framework that was jointly open-sourced with Beijing Jiaotong University and later updated to version 1.6. Academic and lab linkages were also reported, including collaboration with HKU on Goku and joint work with Zhejiang University on InfinityHuman and MimicTalk. Reports also describe an internal “Intelligent Creation Digital Human team” and “Intelligent Creation Lab” that support multiple ByteDance business lines and commercialization needs, including scenarios framed as scalable creation of large numbers of avatars.
Volcengine (火山引擎) has been reported as the enterprise-facing foundation layer where ByteDance packages digital human creation, animation, and deployment capabilities for external customers. Reports frame the offering as covering customizable avatar design, voice cloning, low-latency animation and interaction for livestreaming and video, and integration with large language models for enterprise applications such as virtual broadcasting, customer service, education, and marketing video production. A prominent 2025 thread reported that Volcengine was closed-testing a next-generation digital human platform called “Chimera,” described as built by ByteDance’s Intelligent Creation Digital Human team and backed by Volcengine’s large-model and audio-video capabilities. Its stated functions include digital human generation, outfit swapping, and video translation, with a potential pricing model based on API calls or per-video output. Separate reporting also tied Volcengine to culturally themed deployments, including an intangible cultural heritage digital human “Feifei” (非非) said to rely on the Doubao large model, and a university case in which Shenzhen Technology University’s digital human “Run Xiaozhi” (润晓知) was reported as jointly built with Volcengine.
Doubao (豆包) is a large language model-based AI assistant developed by ByteDance and launched in August 2023, positioned as ByteDance's primary consumer-facing generative AI product in China. Built on ByteDance's proprietary Skylark model family, Doubao offers a broad suite of capabilities including conversational AI, text generation, image generation, AI-generated video, and a digital human video creation feature marketed under the Creator Star product line. It is available as a web application and mobile app and has grown rapidly to become one of the most widely used AI assistant applications in China, competing directly with Baidu's Ernie Bot and Alibaba's Tongyi Qianwen. The underlying model infrastructure is also made available to developers and enterprises through the Doubao API on the Volcengine cloud platform, which enables commercial applications including AIGC content pipelines, digital human livestreaming systems, and custom large model deployments. ByteDance has positioned Doubao aggressively on price, cutting API costs substantially in 2024 to accelerate enterprise and developer adoption across industries including e-commerce, media production, and education.
BytePlus (字节跳动云服务) has been reported as the international cloud and solutions layer that commercializes similar building blocks as packaged services, especially for media and commerce workflows. Coverage characterizes its portfolio as including real-time voice synthesis, virtual digital humans, and livestreaming tooling, positioned to support use cases such as video dubbing, virtual presenters, intelligent assistants, and interactive shopping guides in e-commerce, customer service, education, and entertainment. Expansion activity was reported through partnerships and deployments across Asian markets, including mentions of Thailand and Hong Kong, with digital human capability presented as part of a broader multimodal AI stack intended for production use.
Douyin (抖音) has been reported as both a high-volume distribution surface for digital-human content and a platform that increasingly enforces operational constraints to prevent deception, impersonation, and “unmanned” commerce. Reports about 2025 governance describe four requirements: virtual-human personas must be registered in advance, virtual-human livestreams must carry prominent labeling, operators must complete real-name registration and verification, and interaction must be driven by a real person in real time rather than purely by AI. Platform policy was also reported as explicitly prohibiting unmanned livestreaming. Enforcement activity was reported at scale: over 170,000 noncompliant live rooms using digital humans or recorded or looped footage were handled (the period is given variously as since 2024 or over the prior year), over 30,000 accounts were permanently banned, 44,000 related videos were reviewed and handled in one case, and speaking restrictions were applied to 298 violating accounts. Commentary attributed to Luo Yongxiang framed a specific risk pattern in which digital humans run large numbers of accounts to impersonate real-person livestreams, and paired it with claims of improved AI-host recognition. Merchant-side reports, meanwhile, described bans shortly after going live and broader doubts about the reliability of AI digital-human commerce operations.
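The reported governance requirements above can be read as a simple checklist. The following is a minimal sketch in Python under that reading only; the `VirtualHostStream` class, its field names, and the `compliance_gaps` function are hypothetical names chosen for illustration and do not correspond to any Douyin API or published rule text.

```python
from dataclasses import dataclass


@dataclass
class VirtualHostStream:
    """Hypothetical record of a virtual-human livestream (illustrative fields only)."""
    persona_registered: bool   # persona registered with the platform in advance
    labeled_as_virtual: bool   # stream carries a prominent virtual-human label
    operator_verified: bool    # operator completed real-name registration and verification
    real_person_driven: bool   # interaction is driven by a real person in real time


def compliance_gaps(stream: VirtualHostStream) -> list[str]:
    """Return the reported requirements that the stream fails to meet."""
    checks = [
        (stream.persona_registered, "persona not registered in advance"),
        (stream.labeled_as_virtual, "missing prominent virtual-human label"),
        (stream.operator_verified, "operator not real-name verified"),
        (stream.real_person_driven, "interaction not real-person-driven in real time"),
    ]
    return [message for ok, message in checks if not ok]


# An "unmanned" stream: registered and labeled, but with no live human operator.
unmanned = VirtualHostStream(
    persona_registered=True,
    labeled_as_virtual=True,
    operator_verified=True,
    real_person_driven=False,
)
print(compliance_gaps(unmanned))
```

Under this reading, a stream that satisfies every labeling and registration step would still be flagged if no real person is driving the interaction, which matches the reported prohibition on unmanned livestreaming.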
Jichuang (即创) has been reported as an OceanEngine (巨量引擎, oceanengine.com) marketing-technology product that operationalizes ByteDance’s digital human and AIGC capabilities into standardized enterprise workflows, especially for e-commerce and promotional content. Reporting frames it as a one-stop intelligent creation and management platform where users can generate scripts, select digital human models, apply voice synthesis, and output short marketing videos, and it is presented as accessible to brand operators and marketing teams without specialist 3D or animation skills. In 2025 it was reported that Douyin launched “Jichuang” as a one-stop AI intelligent creation platform and that a “Jichuang digital human,” reported as based on OceanEngine’s underlying model, could capture Douyin trending topics and produce adapted materials in 3–5 minutes using functions described as AI inspiration, AI scripts, and one-click auto editing. Operational guidance reported around the same period emphasized that live deployment on Douyin is gated by compliance requirements. The same guidance described common configurations in local-life live rooms, such as pop-up frequency, single-item and loop pop-ups, intelligent replies, and reply intervals. It also described a go-live method that uses WeCam to feed a digital-human window as camera input, which was framed as improving anti-ban capability. Alongside this guidance came reports and claims about impersonation risks and fast cloning from short audio plus a video segment.