Part VII, "Enabling Technologies," surveys the full technical stack behind realistic digital humans and virtual beings. It moves from the foundations of facial animation and rigging (blendshapes, RigLogic, JALI, face swapping), through the major production toolchains such as Epic's MetaHuman and NVIDIA's Audio2Face/Omniverse pipeline, into modern rendering and 3D standards (Gaussian splatting avatars, SMPL body modeling, USD scene description). From there it broadens into the generative-media layer (AI voice, video, dubbing, and image generation) and then the cognitive core: large language models, NLP, reasoning techniques such as Tree of Thought, retrieval-augmented generation (RAG), and experimental cognitive architectures that draw on Jungian psychology and abstract state machines. The final chapters connect these pieces to agency and the physical world. They examine how RPA and LLMs combine into autonomous agents, how avatars serve as human-computer interfaces (holograms, biometric integration, projection robots), and what compute infrastructure sits underneath (NVIDIA DGX, neuromorphic supercomputers, brain-research projects), and they close with the question of potential machine consciousness. In short, it is a layered tour from the surface of a digital face down through its "mind" and out to the hardware and embodiment that bring it to life.
PART VII — Enabling Technologies
Chapter 28. Faces, Rigging, and Animation
Neural Face Rigging: A Novel Paradigm in Facial Animation and Retargeting
The Evolution of Facial Animation: A Deep Dive into RigLogic
Face Rigging in 3D Animation: Techniques, Challenges, and Future Trends
JALI Research: Advancing Facial Animation Through Innovative Software
Advancements and Applications of Speech Graphics' SG Com in Real-Time Facial Animation
Digital Domain's Charlatan: Revolutionizing Digital Facial Manipulation
Algorithmic Foundations and Challenges of Face Swapping Technology
Chapter 29. The MetaHuman and NVIDIA Stacks
Digital Human Creation: An Analysis of NVIDIA's Audio2Face and Related Tools
Real-time Lip-syncing and Facial Animation in Unity using NVIDIA Omniverse Audio2Face
Guide to Integrating ElevenLabs API with NVIDIA Omniverse Audio2Face
Navigating the Evolving Landscape of Audio2Face: Key Challenges and Potential Solutions
Unreal Engine Pixel Streaming Plugin versus NVIDIA Omniverse Audio2Face
Pixel Streaming and WebRTC: Similarities and Differences in Real-Time Content Delivery
NVIDIA Containers: Enabling GPU Acceleration in Containerized Environments
Chapter 30. Rendering, Avatars, and 3D Standards
Relightable Gaussian Codec Avatars: Advancing Realistic Real-time Avatar Rendering
The Evolution of Avatars Through Gaussian Splatting: A Technological Leap Forward
Universal Scene Description: An Emerging Standard for Scalable Digital Human Creation
Advancing Embodied Intelligence Through Foundation Models for 3D Humans
Creating and Fine-Tuning Life-sized 3D Digital Humans: A Step-by-Step Guide
The Evolution and Intersection of Character Sheets and 3D Avatars
Chapter 31. Voice, Video, and Generative Media
Evolution and Future Prospects of Speech Emotion Recognition Technology
The Rise of AI Video Generators: Transforming Content Creation
How Open-Source AI Video Tools Control Character Identity, Motion, and Speech Today
The Convergence of AI Video Dubbing and Virtual Beings Technologies
Integrating Natural Language Processing with Visual Narrative for Multimodal Communication
Chapter 32. Language Models, NLP, and Cognitive Architecture
From Words to Worlds: The Remarkable Evolution of Large Language Models
NLP Techniques in Large Language Models: Foundations, Advanced Methods, and Future Directions
Advances and Applications of Deep Natural Language Processing
Advancements in Large Language Model Reasoning: Techniques, Applications and Future Directions
Retrieval-Augmented Generation: Enhancing Language Models with External Knowledge
Vector Databases in Retrieval-Augmented Generation (RAG) Systems
Abstract State Machines and Large Language Models in Multimodal Cognitive Architectures
Incorporating Jungian Psychology into Multimodal Cognitive Architectures
The Intersection of Metaphor, Cognition, and Artificial Intelligence
Uncovering the Inner Workings of Claude 3.5 Haiku Through Circuit Tracing
Multimodal Models and the Evolution of Virtual Beings: A New Frontier in Human-AI Interaction
Chapter 33. Agents, Automation, and Embodiment
Integrating RPA and Large Language Models in the Creation of Autonomous Digital Entities
RPA and Digital Humans Converge in Agentic AI to Transform Business
The Synergy Between Hyper-Realistic Avatars and Large Language Models
Digital Humans as Interface: Revolutionizing Human-Computer Interaction
Integrating Biometric Identification with Virtual Beings Transforms Human-Digital Interaction
Conversational AI Holograms: Reshaping Business Interactions and Operations
Rear Projection Robots: Merging Virtual Expressiveness With Physical Robotics
The Dual Journeys of Digital Humans: Navigating the Virtual and Physical Realms
Chapter 34. Infrastructure and Compute
Advancing AI Development: An Overview of NVIDIA's DGX Series Supercomputers
DeepSouth: A Neuromorphic Supercomputer for Studying Brain Computation and Advancing AI
Integrating Neuroscience and AI: The Blue Brain Project and INAIT
Assessing the Potential for Consciousness in AI Through Neuroscientific Indicators