Research archive
Papers.
The research behind character technology: rigging, deformation, muscles and CFX, facial capture, retargeting, and motion synthesis. Papers from SIGGRAPH, DigiPro, SCA and the studio labs, alongside conference and engine talks from GDC, FMX, Annecy, Unreal, Houdini and Maya, all connected so you can trace how each idea grew out of the ones before it.
Most built upon
Interactive Skeleton Techniques for Enhancing Motion Dynamics in Key Frame Animation
101 descendants in the lineage
Busiest year
2023
78 entries published
Entries per year
1972 to 2026
Topic trends
Chart: entries per year by topic, 1972–2026, with landmark work marked.
The venues behind 676 of the entries, from SIGGRAPH and the academic circuit to the engine and studio events. Journals and preprints have no fixed home; rotating conferences are pinned to a recent or frequent host.
2026
14 entries
-
Presents a fully automated pipeline for wrapping and cleaning up dozens of FACS expression scans using WrapAI to build silver-level likeness rigs in minutes.
Facial / Skinning
-
Covers updates to Weta's muscle-driven APFS facial system and Bodyopt full-body skin-deformation network for performance-captured Avatar characters.
Facial / Muscles / Retargeting
- CANRig: Cross-Attention Neural Face Rigging with Variable Local Control · Eurographics · Disney Research
Cross-attention neural network for face rigging enabling variable local control over expression and pose for production face pipelines.
Abstract
CANRig is a fully automated neural facial rigging method that lets artists deform a 3D face mesh by manipulating sparse control handles with continuous, user-defined locality at runtime. It formulates deformation as a cross-attention between control handles and mesh vertices, where the attention matrix is masked by per-vertex influence weights derived from a user-specified radius and falloff, enabling smooth transitions from precise local edits to broad global changes. A shape-preserving workflow built from zero-preserving MLPs, an added base-shape attention column, and a shape-preserving loss guarantees that previous edits are maintained during iterative, non-destructive editing and yields zero-error inversion of captured performances. For high-resolution live-action meshes the method predicts local patch blendshape weights instead of per-vertex deltas, and the authors demonstrate posing, clip editing, dialogue replacement, and expression transfer across both feature animation and visual effects pipelines.
Related Neural Face Rigging for Animating and Retargeting Facial Meshes in the Wild · Example-Based Facial Rigging · Reusable Facial Rigging and Animation: Create Once, Use Many · Dynamic 3D Avatar Creation from Hand-Held Video Input
How to read this
- Category
- Method: neural face rigging with variable local control
- Contributions
- CANRig, a fully automated neural facial rig that deforms a face mesh via sparse control handles with continuous, user-defined locality at runtime
- A cross-attention formulation between handles and vertices, masked by per-vertex influence weights from a user-specified radius and falloff, spanning local to global edits
- A shape-preserving workflow (zero-preserving MLPs, base-shape attention column, shape-preserving loss) giving non-destructive iterative editing and zero-error inversion of captured performances, with local patch-blendshape prediction for high-res live-action meshes
- Context
- Continues neural rig-approximation work such as Accurate Face Rig Approximation with Deep Differential Subspace Reconstruction (Song et al. 2020), adding artist-controllable locality via attention. Builds on: Accurate Face Rig Approximation with Deep Differential Subspace Reconstruction
- Correctness
- Capabilities (posing, clip editing, dialogue replacement, expression transfer) are demonstrated across feature-animation and VFX pipelines; the zero-error inversion and shape preservation are architectural guarantees, so the practical question is generalization and handle-placement sensitivity rather than reconstruction error.
- Clarity
- Dense but well-structured; a first pass conveys the handle-and-locality idea, while a second pass is needed for the masked cross-attention and shape-preserving components.
- How to read it
- Read first for how locality is expressed via the masked attention weights; second pass on the zero-preserving MLPs and shape-preserving loss if you need the non-destructive-editing guarantees.
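The locality mechanism described for CANRig above (attention between control handles and vertices, masked by a radius-and-falloff influence weight, with an extra base-shape column) can be illustrated with a toy sketch. This is not the authors' network: hand-written distance-based scores stand in for learned attention, and the names `influence` and `deform`, the polynomial falloff, and the product-form base-shape weight are all assumptions made for illustration.

```python
import math

def influence(dist, radius, falloff=2.0):
    """Per-vertex influence in [0, 1]: 1 at the handle, 0 at or beyond the radius."""
    if dist >= radius:
        return 0.0
    return (1.0 - dist / radius) ** falloff

def deform(vertices, handles, radius, falloff=2.0):
    """Move each vertex by a masked, normalized mix of handle displacements.

    vertices: list of (x, y, z) tuples.
    handles:  list of (position, displacement) pairs, each an (x, y, z) tuple.
    A zero-displacement "base shape" column (hypothetical form: the product of
    1 - influence over all handles) absorbs the remaining weight, so vertices
    outside every handle's radius do not move at all.
    """
    deformed = []
    for v in vertices:
        masked = [influence(math.dist(v, pos), radius, falloff) for pos, _ in handles]
        base = math.prod(1.0 - m for m in masked)  # weight of the base-shape column
        total = base + sum(masked)                 # always > 0 by construction
        offset = [0.0, 0.0, 0.0]
        for m, (_, disp) in zip(masked, handles):
            for k in range(3):
                offset[k] += (m / total) * disp[k]
        deformed.append(tuple(v[k] + offset[k] for k in range(3)))
    return deformed

# One handle at the origin pulled up by one unit, with a radius of 1.
verts = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (2.0, 0.0, 0.0)]
moved = deform(verts, [((0.0, 0.0, 0.0), (0.0, 1.0, 0.0))], radius=1.0)
```

Shrinking `radius` confines the edit to vertices near the handle, while growing it turns the same handle into a broad, global control, which is the local-to-global behaviour the entry describes.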
Facial / Rigging / ML Deformation
-
Embark Studios outlines their Houdini and USD-based procedural character pipeline shared across Arc Raiders and The Finals, balancing manual and automated rigging workflows.
Rigging / Skinning
- talk · High-Fidelity on PC and Mobile: The 'Delta Force' Character Production Pipeline (Presented by Tencent Games) · GDC · Industrial
Details Team Jade's cross-platform character art pipeline for Delta Force, covering asset workflows that deliver AAA fidelity on both PC and mobile targets.
Skinning / Rigging
-
Examines how Pixar's Hoppers unified modeling, animation, simulation, and lighting to achieve felted cloth characters with physical realism and stylized appeal.
CFX / Skinning
-
Animation director at Illogic Studios demonstrates feature-quality rigging and cross-department animation workflow for a commercial wolf character at production scale.
Rigging / Skinning
-
Describes a reinforcement-learning locomotion controller that trains robot enemies to self-discover terrain-adaptive walking and combat movement in physics simulation.
Motion Synthesis
-
Covers building a Vicon Valkyrie-based performance capture facility with custom file-management APIs to support complex multi-actor stunt and dialogue capture pipelines.
Retargeting / Facial
- MotionBricks: Scalable Real-Time Motions with Modular Latent Generative Model and Smart Primitives · SIGGRAPH · Industrial
MotionBricks is a real-time generative motion framework: a modular latent backbone models over 350,000 clips in one network, and a smart-primitive interface supports plug-and-play authoring of navigation and object interaction. It reports 15,000 FPS at 2 ms latency, applies zero-shot to new downstream tasks, and is demonstrated on a Unitree G1 humanoid robot.
Abstract
Despite transformative advances in generative motion synthesis, real-time interactive motion control remains dominated by traditional techniques. In this work, we identify two key challenges in bridging research and production: 1) Real-time scalability: Industry applications demand real-time generation of a vast repertoire of motion skills, while generative methods exhibit significant degradation in quality and scalability under real-time computation constraints, and 2) Integration: Industry applications demand fine-grained multi-modal control involving velocity commands, style selection, and precise keyframes, a need largely unmet by existing text- or tag-driven models. Moreover, a systematic motion design interface for generative models remains absent. To overcome these limitations, we introduce MotionBricks: a large-scale, real-time generative framework with a two-fold solution. First, we propose a large-scale modular latent generative backbone tailored for robust real-time motion generation, effectively modeling a dataset of over 350,000 motion clips with a single model. Second, we introduce smart primitives that provide a unified, robust, and intuitive interface for authoring both navigation and object interaction. Notably, MotionBricks applies to new downstream tasks in a zero-shot manner, where no finetuning or task-specific tagging is required. Applications can be designed in a plug-and-play manner like assembling bricks without expert animation knowledge, enabling an accessible interface for applications in animation and robotics. Quantitatively, we show that MotionBricks produces state-of-the-art motion quality on open-source and proprietary datasets of various scales, while also achieving a real-time throughput of 15,000 FPS with 2ms latency. We demonstrate the flexibility and robustness of MotionBricks in a complete production-level animation demo, covering navigation and object-scene interaction across various styles with a unified model. 
To showcase our framework's application beyond animation, we deploy MotionBricks on the Unitree G1 humanoid robot to demonstrate its flexibility and generalization for real-time robotic control.
Related Character Motion Synthesis by Topology Coordinates · Mode-Adaptive Neural Networks for Quadruped Motion Control · Motion Retargeting for Crowd Simulation · SMPL: A Skinned Multi-Person Linear Model
How to read this
- Category
- Method / system: a real-time generative motion framework (modular latent backbone plus a smart-primitive authoring interface) for animation and robotics
- Contributions
- A modular latent generative backbone built on motion in-betweening that models a dataset of over 350,000 motion clips with a single model and runs far above real time (reported 15,000 FPS at 2ms latency).
- Smart primitives (smart locomotion and smart object) that give a unified, plug-and-play interface for authoring navigation and object interaction without animation graphs or per-task tagging, and that apply zero-shot to new downstream tasks.
- An end-to-end production-level demonstration spanning a UE5 animation demo (locomotion, acrobatics, object-scene interaction) and real-world whole-body control deployed on a Unitree G1 humanoid robot.
- Context
- A generative successor to control-driven runtime motion methods, from motion matching and phase-functioned or learned controllers to physics-based RL and diffusion motion models, aimed squarely at the real-time, controllable, production-scale regime those earlier lines could not satisfy all at once. Builds on: Motion Matching and The Road to Next-Gen Animation · Phase-Functioned Neural Networks for Character Control · Learned Motion Matching · DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills · Human Motion Diffusion Model · DeepPhase: Periodic Autoencoders for Learning Motion Phase Manifolds
- Correctness
- Quality is reported as state of the art on open-source and proprietary datasets of various scales; the headline 15,000 FPS and 2ms numbers are throughput and latency for the neural backbone under the authors' own setup, and the robot and UE5 results are qualitative production demonstrations rather than controlled user studies, so generality beyond the shown scenarios should be read with that in mind.
- Clarity
- The two framed challenges (real-time scalability and fine-grained integration) and the bricks metaphor make the high-level design easy to follow; the structured latent backbone and the in-betweening formulation are the parts that reward a careful second read.
- How to read it
- First pass for the two production challenges and the smart-primitive interface idea; second pass on the structured modular latent design and the in-betweening backbone if you build, extend, or benchmark real-time motion-control systems.
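The two headline MotionBricks numbers above measure different things, and a little arithmetic shows how they relate. The sketch below applies Little's law (items in flight = throughput × latency) under the assumption, not stated in the abstract, that both figures describe the same configuration. The function names and the 60 FPS display target are illustrative choices.

```python
def frames_in_flight(throughput_fps, latency_ms):
    """Little's law: frames being generated concurrently = rate x time in system."""
    return throughput_fps * (latency_ms / 1000.0)

def frame_budget_share(latency_ms, target_fps):
    """Fraction of one display frame's time budget consumed by the given latency."""
    budget_ms = 1000.0 / target_fps
    return latency_ms / budget_ms

# Reported figures: 15,000 FPS throughput and 2 ms latency.
concurrent = frames_in_flight(15000, 2)   # about 30 frames per 2 ms window
share = frame_budget_share(2, 60)         # about 12% of a 60 FPS frame budget
```

Under that assumption, roughly 30 frames are produced per latency window, so the throughput figure reflects batched or parallel generation rather than the speed of a single character stream. The 2 ms latency is the number that matters for one interactive character.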
Motion Synthesis / Retargeting
-
SOMA introduces a canonical body topology and universal rig, reducing cross-model adapter complexity from O(M^2) to O(M) via inverse-LBS pose solving and FK optimization.
Abstract
Parametric human body models are foundational to human reconstruction, animation, and simulation, yet they remain mutually incompatible: SMPL, SMPL-X, MHR, Anny, and related models each diverge in mesh topology, skeletal structure, shape parameterization, and unit convention, making it impractical to exploit their complementary strengths within a single pipeline. We present SOMA, a unified body layer that bridges these heterogeneous representations through three abstraction layers. Mesh topology abstraction maps any source model's identity to a shared canonical mesh in constant time per vertex. Skeletal abstraction recovers a full set of identity-adapted joint transforms from any body shape, whether in rest pose or an arbitrary posed configuration, in a single closed-form pass, with no iterative optimization or per-model training. Pose abstraction inverts the skinning pipeline to recover unified skeleton rotations directly from posed vertices of any supported model, enabling heterogeneous motion datasets to be consumed without custom retargeting. Together, these layers reduce the $O(M^2)$ per-pair adapter problem to $O(M)$ single-backend connectors, letting practitioners freely mix identity sources and pose data at inference time. The entire pipeline is fully differentiable end-to-end and GPU-accelerated via NVIDIA-Warp.
How to read this
- Category
- Method: a unifying interoperability layer for parametric body models
- Contributions
- A canonical body topology with mesh topology abstraction that maps any source model's identity to a shared canonical mesh in constant time per vertex
- Skeletal abstraction recovering identity-adapted joint transforms from any body shape (rest or posed) in a single closed-form pass, with no iterative optimization or per-model training
- Pose abstraction that inverts skinning (inverse-LBS) to recover unified skeleton rotations from posed vertices, collapsing the O(M^2) per-pair adapter problem to O(M) single-backend connectors
- Context
- It sits on top of and bridges established parametric body models (SMPL, SMPL-X, the Momentum Human Rig (MHR), and Anny) and motion archives like AMASS, aiming to make their complementary strengths usable in one pipeline. Builds on: SMPL: A Skinned Multi-Person Linear Model · Expressive Body Capture: 3D Hands, Face, and Body from a Single Image · AMASS: Archive of Motion Capture as Surface Shapes · MHR: Momentum Human Rig · Human Mesh Modeling for Anny Body
- Correctness
- The approach assumes the heterogeneous source models (differing in topology, skeleton, shape parameterization, and units) can be reconciled through a shared canonical representation and an invertible skinning pipeline; readers should note it is demonstrated on the specific supported models listed, and fidelity of the inverse-LBS pose recovery and closed-form joint solve under extreme shapes or out-of-distribution poses is the natural thing to scrutinize.
- Clarity
- Reasonably accessible at a high level (the three-layer abstraction framing reads cleanly); a first pass conveys the unifying idea, but a second pass is needed to follow the closed-form skeletal solve and the inverse-skinning formulation.
- How to read it
- First pass: read the abstract and fix the three abstraction layers (mesh, skeletal, pose) and the O(M^2) to O(M) framing in mind. Do a second pass on the inverse-LBS pose solving and the closed-form FK/joint-transform recovery if you intend to implement retargeting or mix motion datasets; check which models are actually supported before assuming coverage.
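The O(M^2) to O(M) framing in the SOMA entry above is the familiar hub-and-spoke pattern: each model gets one connector to a shared canonical form instead of one adapter per pair of models. The sketch below shows the pattern only. The model names and unit scales are hypothetical placeholders, and a single unit conversion stands in for the much richer topology, skeleton, and pose abstractions the paper describes.

```python
# Hypothetical models that differ only in unit convention (scale factor to meters).
UNITS_TO_METERS = {"rig_m": 1.0, "rig_cm": 0.01, "rig_mm": 0.001}

def to_canonical(model, lengths):
    """Connector into the shared canonical representation (meters here)."""
    return [x * UNITS_TO_METERS[model] for x in lengths]

def from_canonical(model, lengths):
    """Connector out of the canonical representation into a model's own units."""
    return [x / UNITS_TO_METERS[model] for x in lengths]

def convert(src, dst, lengths):
    """Any-to-any conversion composed from two single-model connectors."""
    return from_canonical(dst, to_canonical(src, lengths))

def pairwise_adapter_count(m):
    """Directed adapters needed when every ordered pair gets its own converter."""
    return m * (m - 1)

def hub_connector_count(m):
    """Connectors needed when every model talks only to the canonical hub."""
    return m

height_m = convert("rig_cm", "rig_m", [170.0])[0]
```

With five models, the pairwise approach needs 20 directed adapters while the hub needs 5 connectors, and adding a sixth model costs one new connector rather than ten new adapters.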
Retargeting / Skinning / Rigging
-
Lead Creatures TD Jono Dysart details creature development pipelines for Vecna and the Mind Flayer across 1,185 VFX shots, including skin, muscle, and cloth systems.
Muscles / CFX / Skinning
- talk · The Hidden Ones: Real-Time AI Generation of Kung Fu Motions (Presented by Tencent Games AI) · GDC · Industrial
Introduces a markerless motion-capture pipeline paired with a dedicated generation model for real-time synthesis of fluid Kung Fu transitions in a fighting game.
Motion Synthesis / Retargeting
-
Presents VISVISE's full-stack pipeline covering 3D structural sketch generation, auto-retopology, auto-rigging, high-precision auto-skinning, and motion in-betweening.
Rigging / Skinning / Motion Synthesis
2025
39 entries
- talk · AI in Maya: Autodesk CEO and Animation Product Manager Demo MotionMaker, FaceAnimator and More · AU · Autodesk
Autodesk University 2025 live demo by CEO Andrew Anagnost and Product Manager Lance Thornton showing Maya's MotionMaker locomotion synthesis, FaceAnimator, and additional AI-driven character animation tools.
Abstract
In this Autodesk University 2025 live demo, CEO Andrew Anagnost and product manager Lance Thornton show MotionMaker fully integrated into Maya generating plausible human locomotion from just a couple of spatial keyframes via live neural-network inference, trained on a large set of human motion data so it reproduces weight shifts, arm swing, and walk-to-run transitions. They demonstrate art-directing the result by adding a keyframe to redirect the character around a corner and regenerating on the fly. The session also previews FaceAnimator, which learns a character's facial and lip motion from existing animated sequences of a show and then generates face animation directly from an input audio file, and teases Autodesk Assistant scene editing in Maya driven by MCP servers.
Related Meet MotionMaker: New AI Animation Tool In Maya · Audio-Driven Facial Animation by Joint End-to-End Learning of Pose and Emotion · Dog Code: Human to Quadruped Embodiment Using Shared Codebooks · Mode-Adaptive Neural Networks for Quadruped Motion Control
How to read this
- Category
- Production talk / product demo (AI animation tools in Maya)
- Contributions
- Demonstrates MotionMaker generating plausible human locomotion from a couple of spatial keyframes via live neural-network inference
- Shows art-directing the result by adding keyframes to redirect the character and regenerating on the fly
- Previews FaceAnimator learning facial/lip motion from a show's existing sequences to drive face animation from audio, and teases MCP-driven Autodesk Assistant scene editing
- Context
- An Autodesk University product demo extending the MotionMaker AI animation direction (the earlier Meet MotionMaker introduction) into an integrated Maya feature set spanning locomotion synthesis, audio-driven face animation, and assistant-based scene editing. Builds on: Meet MotionMaker: New AI Animation Tool In Maya
- Correctness
- Vendor demo, not peer-reviewed; outputs depend on the underlying training data (large human-motion sets for locomotion, a show's own sequences for FaceAnimator) and are shown in curated live demos, so generalization and edit fidelity beyond the demo should be treated cautiously.
- Clarity
- Highly accessible; a single viewing conveys the capabilities and intended workflow.
- How to read it
- Watch for the keyframe-to-motion interaction model and how art direction is layered onto generated output; treat as a capability preview rather than a method, with no second pass needed for theory.
Motion Synthesis / Facial
-
2D generative prior guides joint placement and skinning weight prediction for diverse skeleton and mesh configurations.
Abstract
Despite the growing accessibility of skeletal motion data, integrating it for animating character meshes remains challenging due to diverse configurations of both skeletons and meshes. Specifically, the body scale and bone lengths of the skeleton should be adjusted in accordance with the size and proportions of the mesh, ensuring that all joints are accurately positioned within the character mesh. Furthermore, defining skinning weights is complicated by variations in skeletal configurations, such as the number of joints and their hierarchy, as well as differences in mesh configurations, including their connectivity and shapes. While existing approaches have made efforts to automate this process, they hardly address the variations in both skeletal and mesh configurations. In this paper, we present a novel method for the automatic rigging and skinning of character meshes using skeletal motion data, accommodating arbitrary configurations of both meshes and skeletons. The proposed method predicts the optimal skeleton aligned with the size and proportion of the mesh as well as defines skinning weights for various mesh-skeleton configurations, without requiring explicit supervision tailored to each of them. By incorporating Diffusion 3D Features (Diff3F) as semantic descriptors of character meshes, our method achieves robust generalization across different configurations.
Related One Model to Rig Them All: Diverse Skeleton Rigging with UniRig · SkinMixer: Blending 3D Animated Models · Mobilizing Mocap, Motion Blending, and Mayhem: Rig Interoperability for Crowd Simulation on Incredibles 2 · Automatic Rigging and Animation of 3D Characters
How to read this
- Category
- Method: automatic rigging and skinning
- Contributions
- Automatic rigging and skinning of character meshes from skeletal motion data across arbitrary skeleton and mesh configurations
- Predicts an optimal skeleton aligned to mesh size and proportions, positioning joints inside the mesh
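- Predicts skinning weights without explicit supervision tailored to each mesh-skeleton configuration, guided by a 2D generative prior (the abstract names Diffusion 3D Features, Diff3F, as the semantic descriptors)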
- Context
- Builds on neural rigging for articulated characters such as RigNet, extending it to handle variation in both skeletal hierarchies and mesh connectivity by leveraging a 2D generative prior. Builds on: RigNet: Neural Rigging for Articulated Characters
- Correctness
- The central claim is that a 2D generative prior can guide joint placement and weight prediction without explicit supervision across diverse configurations; readers should check what range of skeleton hierarchies and mesh topologies the method was actually evaluated on and how failure cases (badly proportioned or non-humanoid meshes) are handled.
- Clarity
- Likely accessible at the conceptual level; a first pass conveys the prior-guided idea, a second pass is needed for the joint-placement and weight formulation.
- How to read it
- First pass for the problem framing (joint-and-mesh configuration variety) and the role of the 2D prior; do a second pass if you need the actual skeleton-prediction and weight-inference formulation, and inspect the qualitative results to judge generalization.
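The abstract above notes that a skeleton's scale must match the mesh before joints can sit inside it. The sketch below is a naive baseline for that first step only, a uniform scale-and-translate fit by height. It is not the paper's learned method, which predicts per-bone adjustments. The function name, the y-up convention, and the height-matching rule are all assumptions.

```python
def fit_skeleton_to_mesh(joints, vertices):
    """Uniformly scale and shift joints so skeleton height spans the mesh height.

    joints, vertices: lists of (x, y, z) tuples, with y as the up axis (assumed).
    x and z are scaled about the origin; y is rebased onto the mesh's lowest point.
    """
    j_min = min(j[1] for j in joints)
    j_max = max(j[1] for j in joints)
    v_min = min(v[1] for v in vertices)
    v_max = max(v[1] for v in vertices)
    if j_max == j_min:
        raise ValueError("skeleton has zero height; cannot derive a scale")
    scale = (v_max - v_min) / (j_max - j_min)
    return [(x * scale, (y - j_min) * scale + v_min, z * scale) for x, y, z in joints]

# A 2-unit-tall skeleton fitted to a 4-unit-tall mesh doubles in size.
skeleton = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.25, 2.0, 0.0)]
mesh = [(-1.0, 0.0, 0.0), (1.0, 4.0, 0.0)]
fitted = fit_skeleton_to_mesh(skeleton, mesh)
```

A uniform fit like this cannot correct proportions (for example, long arms on a short torso), which is the gap that motivates predicting bone lengths per mesh.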
Rigging / Skinning
- ATLAS: Decoupling Skeletal and Shape Parameters for Expressive Parametric Human Modeling · ICCV · Academic · 8 cites
Parametric body model trained on 600k scans that grounds the mesh in an explicit skeletal basis, decoupling bone length and shape from soft-tissue deformation for independent anatomical control.
Abstract
Parametric body models offer expressive 3D representation of humans across a wide range of poses, shapes, and facial expressions, typically derived by learning a basis over registered 3D meshes. However, existing human mesh modeling approaches struggle to capture detailed variations across diverse body poses and shapes, largely due to limited training data diversity and restrictive modeling assumptions. Moreover, the common paradigm first optimizes the external body surface using a linear basis, then regresses internal skeletal joints from surface vertices. This approach introduces problematic dependencies between internal skeleton and outer soft tissue, limiting direct control over body height and bone lengths. To address these issues, we present ATLAS, a high-fidelity body model learned from 600k high-resolution scans captured using 240 synchronized cameras. Unlike previous methods, we explicitly decouple the shape and skeleton bases by grounding our mesh representation in the human skeleton. This decoupling enables enhanced shape expressivity, fine-grained customization of body attributes, and keypoint fitting independent of external soft-tissue characteristics.
Related SUPR: A Sparse Unified Part-Based Human Representation · Expressive Body Capture: 3D Hands, Face, and Body from a Single Image · STAR: Sparse Trained Articulated Human Body Regressor · SoftSMPL: Data-driven Modeling of Nonlinear Soft-tissue Dynamics for Parametric Humans
How to read this
- Category
- Method: parametric human body model
- Contributions
- ATLAS, a high-fidelity parametric body model learned from 600k high-resolution scans captured with 240 synchronized cameras
- Explicitly decouples shape and skeleton bases by grounding the mesh representation in the human skeleton
- Enables fine-grained control of body height and bone lengths plus keypoint fitting independent of soft-tissue characteristics
- Context
- Sits in the SMPL lineage of learned parametric body models and shares the biomechanical-grounding motivation of From Skin to Skeleton, but reverses the usual surface-first then skeleton-regression order to decouple skeleton from soft tissue. Builds on: From Skin to Skeleton: Towards Biomechanically Accurate 3D Digital Humans · SMPL: A Skinned Multi-Person Linear Model
- Correctness
- The key assumption is that decoupling skeletal and shape bases gives independent anatomical control without sacrificing expressivity; this rests on a large but studio-captured scan set (240 cameras, 600k scans), so readers should consider how demographic coverage of that capture affects generalization beyond the captured population.
- Clarity
- Abstract is clear on motivation and design; the model fitting and basis construction will need a second, formulation-focused pass.
- How to read it
- First pass for the decoupling argument and why surface-first regression is problematic; second pass for the basis learning and fitting math if you plan to use or compare against the model, and note the capture-rig scale as both a strength and a coverage caveat.
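The dependency that ATLAS is said to remove can be seen in a one-dimensional toy. In the surface-first scheme a joint is regressed from surface points, so changing soft tissue moves the joint. In a skeleton-first scheme the joint is a parameter and the surface is built around it. Everything below (the two-point regressor, the function names, the numbers) is an illustrative assumption, not the ATLAS formulation.

```python
def surface_first_joint(back, front):
    """Regress a joint as the mean of two surface points (a one-row linear regressor)."""
    return 0.5 * (back + front)

def skeleton_first_surface(joint, back_offset, front_offset):
    """Place surface points around a fixed joint using soft-tissue offsets."""
    return joint - back_offset, joint + front_offset

# Surface-first: adding soft tissue at the front drags the regressed joint forward.
joint_slim = surface_first_joint(-1.0, 1.0)
joint_heavy = surface_first_joint(-1.0, 2.0)

# Skeleton-first: the same soft-tissue change leaves the joint where it was.
joint = 0.0
surface_slim = skeleton_first_surface(joint, 1.0, 1.0)
surface_heavy = skeleton_first_surface(joint, 1.0, 2.0)
```

In the first scheme the joint shifts from 0.0 to 0.5 even though no bone changed. In the second the joint stays at 0.0 and only the surface moves, which is the kind of independent control over skeleton and soft tissue that the entry describes.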
Muscles / ML Deformation
-
Framestore Houdini Connect breakdown of crafting Appa's realistic fur groom and Momo's creature design for Netflix's live-action Avatar: The Last Airbender, plus storm and water FX.
Abstract
Framestore breaks down its character and effects work on Netflix's live-action Avatar: The Last Airbender, focusing on grounding the sky bison Appa with a realistic fur groom and designing the creature Momo (a blend of flying lemur, bat and monkey referenced against fennec foxes and monkeys). The team describes using Houdini across the show for Appa's fur, digi-doubles, cloth dynamics and muscle simulation, plus a large storm sequence requiring water, rain and cloud volumetric simulations. They emphasize Houdini's node-based flexibility for blending fluid and particle simulations together with fur simulation for cleaner integration. Animation exploration tests such as walk cycles and grabbing actions were used to balance Momo's oversized cartoon ears against believable motion.
Related Simulating the Perfect Groom for a Bovine Biker | Untold Studios | FMX HIVE 2023 · Creating a Photorealistic Hyena · Creatures in Houdini | Ahmed Gharraph | FMX 2019 · Framestore Creatures & Houdini | Framestore | Character FX & Crowds Production Talks
How to read this
- Category
- Production talk / breakdown
- Contributions
- Demonstrates grounding the sky bison Appa with a realistic fur groom and creature design for Momo (lemur, bat and monkey blend)
- Shows Houdini used across the show for fur, digi-doubles, cloth dynamics and muscle simulation, plus a large storm sequence with water, rain and cloud volumetrics
- Highlights node-based blending of fluid, particle and fur simulation for cleaner integration, and animation tests to balance Momo's oversized ears against believable motion
- Context
- A studio breakdown in the lineage of Framestore's photoreal-creature work (for example the Stags & Stripes FMX talk), applied here to Netflix's live-action Avatar: The Last Airbender. Builds on: Stags & Stripes, Creating Photoreal Characters | Framestore | FMX HIVE Europe 2021
- Correctness
- Studio practice, not peer-reviewed; results are production-proven on a shipped show, and choices are driven by art direction and pipeline constraints rather than controlled evaluation.
- Clarity
- Accessible breakdown aimed at practitioners; a single watch conveys the workflow, no formal formulation to revisit.
- How to read it
- Watch once for the Houdini workflow decisions (fur groom on Appa, multi-solver blending, the storm volumetrics) and the animation-vs-design tradeoff on Momo; rewatch only segments relevant to a specific pipeline problem you face.
CFX
-
Maya Learning Channel tutorial by rigging expert Matthew Tucker introducing the Bifrost Modular Rigging Framework, covering modules, pins, and transforms for building procedural node-based animation rigs.
Abstract
Matthew Tucker introduces Autodesk's Bifrost rigging module system, a fully procedural compound-based framework for building modular character rigs inside Maya. Each rig part is a self-contained module that encapsulates joint and control hierarchies plus transformation logic, connected through standardized input, output, and parent ports. Pins act as guide transforms that define the rest pose and drive the pivot matrix, and because the system is procedural, pins can be moved after animation without breaking the rig or requiring a rebuilt bind pose for skin clusters. Transforms share a core type with automatic propagation, animation is driven via operator matrices, and custom behavior such as inverse kinematics is authored in the user animation compound, with the included arm module offered as a reference example.
Related Bifrost Rigging in Maya - Part 2: The Module Interface · Bifrost Rigging in Maya - Part 3: Creating a Custom Module · Deliver Faster Rigging and Animation with AI · Group Based Rigging of Realistically Feathered Wings
How to read this
- Category
- Production talk / tool tutorial
- Contributions
- Introduces Autodesk's Bifrost Modular Rigging Framework, a fully procedural compound-based system for building modular character rigs in Maya
- Explains modules as self-contained units encapsulating joint and control hierarchies plus transform logic, connected via standardized input, output and parent ports
- Shows pins as guide transforms defining the rest pose, movable after animation without breaking the rig or rebuilding bind poses, with the arm module as a reference example
- Context
- A vendor tutorial on Maya's node-based Bifrost rigging system, part of the broader move toward procedural, reusable rig construction; first in a three-part series.
- Correctness
- Studio/vendor practice, not peer-reviewed; the procedural pins-after-animation claim is demonstrated in-tool rather than benchmarked, so treat workflow benefits as illustrative of the framework's intent.
- Clarity
- Accessible introductory tutorial; one watch conveys the framework and vocabulary (modules, pins, ports).
- How to read it
- Watch once to learn the core concepts and terminology before parts 2 and 3; focus on the module-and-port model and what pins buy you, since later parts assume it.
Rigging
-
Covers the Bifrost rigging module interface in Maya, explaining input and output types, parent-child relationships, and how modules interact with Maya scenes for reusable rig components.
Abstract
Matthew Tucker walks through the Bifrost rigging module interface in Maya, creating a new graph in the Bifrost Graph Editor and explaining its left-to-right data flow through inputs, outputs, and parent ports that link parent and child modules. Using the parameter editor he covers the module name and state parameters, showing how an FK chain behaves under the animation, rest relative, and rest absolute states and how a flattened versus direct hierarchy affects inherited transformations. He then demonstrates diagnostics mode for visualizing pin, control, and joint orientations via display transforms, and uses profile evaluation with the Maya profiler to measure how optimized a variable FK module is.
Related Bifrost Rigging in Maya - Part 1: Introduction and Framework · Bifrost Rigging in Maya - Part 3: Creating a Custom Module · Digital Humans: Inside Epic's MetaHuman Creator · Deliver Faster Rigging and Animation with AI
How to read this
- Category
- Production talk / tool tutorial
- Contributions
-
- Walks through the Bifrost rigging module interface in Maya, with left-to-right data flow through input, output and parent ports linking parent and child modules
- Explains module name and state parameters and how an FK chain behaves under animation, rest relative and rest absolute states, plus flattened versus direct hierarchy effects
- Demonstrates diagnostics mode for visualizing pin, control and joint orientations, and profile evaluation with the Maya profiler to measure module optimization
- Context
- Second part of Matthew Tucker's Bifrost rigging series, building directly on the framework concepts introduced in Part 1. Builds on: Bifrost Rigging in Maya - Part 1: Introduction and Framework
- Correctness
- Studio/vendor practice, not peer-reviewed; performance claims come from in-tool profiling of example modules rather than independent evaluation.
- Clarity
- Accessible but more detailed than Part 1; one careful watch conveys the interface, with rewatching useful for the state-parameter behavior.
- How to read it
- Watch after Part 1; focus on the module states (animation vs rest relative vs rest absolute) and the diagnostics and profiler workflow, rewatching the state section if you intend to author modules.
Rigging
-
Walks through building and publishing a custom spline rig module inside the Bifrost Modular Rigging Framework, demonstrating how to author reusable procedural rig components for character pipelines.
Abstract
Matthew Tucker walks through building a custom spline rig module inside Maya's Bifrost Modular Rigging Framework, starting from the editable template module and parenting it to a root module. He demonstrates authoring base and end controls with pins, then using iterate nodes to procedurally generate any number of evenly spaced mid controls and joints, driving their placement with interpolate transform matrix, divide/increment/decrement nodes, and dynamically built name strings. He builds a spline through the controls with a resample matrix chain, editing its decomposed-matrix internals to manually set the forward axis and up vector so the spline orientation matches the controls, then drives the joint chain along it. The module is cleaned up, parameters like joint count and mid-control count are promoted to the top of the graph, and the optimized module is published for reuse.
Related Bifrost Rigging in Maya - Part 2: The Module Interface · Bifrost Rigging in Maya - Part 1: Introduction and Framework · Digital Humans: Inside Epic's MetaHuman Creator · Deliver Faster Rigging and Animation with AI
How to read this
- Category
- Production talk / tool tutorial
- Contributions
-
- Walks through building a custom spline rig module in Maya's Bifrost Modular Rigging Framework, starting from the editable template module parented to a root module
- Procedurally generates evenly spaced mid controls and joints with iterate nodes, interpolate transform matrices, divide/increment/decrement nodes and dynamically built name strings
- Builds and orients a spline via a resample matrix chain (editing decomposed-matrix internals for forward axis and up vector), then promotes parameters and publishes the optimized module for reuse
- Context
- Third and most hands-on part of the Bifrost rigging series, applying the interface and concepts from Parts 1 and 2 to author a reusable procedural module. Builds on: Bifrost Rigging in Maya - Part 2: The Module Interface
- Correctness
- Studio/vendor practice, not peer-reviewed; the workflow is a demonstrated authoring recipe, so correctness is about following the construction steps rather than a validated result.
- Clarity
- Detailed step-by-step build; benefits from pausing and following along rather than a single passive watch.
- How to read it
- Treat as a follow-along build after Parts 1 and 2; focus on the iterate-node procedural generation and the spline orientation fix (forward axis and up vector via decomposed matrices), pausing to replicate steps if you are authoring your own modules.
Rigging
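The procedural placement of evenly spaced mid controls described in this entry can be sketched outside Bifrost in plain Python. This is a minimal illustration, not the tutorial's graph: it interpolates positions only (the tutorial interpolates full transform matrices), and the function names and the `spline_mid_NN_ctrl` naming pattern are hypothetical.

```python
def lerp(a, b, t):
    """Linearly interpolate between two 3D points a and b."""
    return tuple(ai + (bi - ai) * t for ai, bi in zip(a, b))


def mid_controls(base, end, count, prefix="spline_mid"):
    """Return (name, position) pairs for `count` evenly spaced mid controls.

    The divide step is t = (i + 1) / (count + 1), so the mid controls never
    coincide with the base or end control.
    """
    controls = []
    for i in range(count):
        t = (i + 1) / (count + 1)
        # Name string built per iteration (hypothetical naming pattern).
        name = f"{prefix}_{i + 1:02d}_ctrl"
        controls.append((name, lerp(base, end, t)))
    return controls


controls = mid_controls((0.0, 0.0, 0.0), (0.0, 8.0, 0.0), 3)
for name, position in controls:
    print(name, position)
```

Changing `count` regenerates the whole set, which is the property the iterate-node approach is after: the number of controls is a parameter rather than hand-built structure.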
-
Three-stage pipeline combining multi-query mesh recovery, neural inverse kinematics, and 2D-informed refinement to produce biomechanically plausible human poses from monocular video.
Abstract
Recent advancements in 3D human pose estimation from single-camera images and videos have relied on parametric models, like SMPL. However, these models over-simplify anatomical structures, limiting their accuracy in capturing true joint locations and movements, which reduces their applicability in biomechanics, healthcare, and robotics. Biomechanically accurate pose estimation, on the other hand, typically requires costly marker-based motion capture systems and optimization techniques in specialized labs. To bridge this gap, we propose BioPose, a novel learning-based framework for predicting biomechanically accurate 3D human pose directly from monocular videos. BioPose includes three key components: a Multi-Query Human Mesh Recovery model (MQ-HMR), a Neural Inverse Kinematics (NeurIK) model, and a 2D-informed pose refinement technique. MQ-HMR leverages a multi-query deformable transformer to extract multi-scale fine-grained image features, enabling precise human mesh recovery. NeurIK treats the mesh vertices as virtual markers, applying a spatial-temporal network to regress biomechanically accurate 3D poses under anatomical constraints. To further improve 3D pose estimations, a 2D-informed refinement step optimizes the query tokens during inference by aligning the 3D structure with 2D pose observations.
Related SFV: Reinforcement Learning of Physical Skills from Video · AddBiomechanics: Automating model scaling, inverse kinematics, and inverse dynamics from human motion data through sequential optimization · From Skin to Skeleton: Towards Biomechanically Accurate 3D Digital Humans · Reconstructing Humans with a Biomechanically Accurate Skeleton
How to read this
- Category
- Method: biomechanical 3D pose estimation
- Contributions
-
- BioPose, a learning-based framework predicting biomechanically accurate 3D human pose directly from monocular video
- Combines a Multi-Query Human Mesh Recovery model (MQ-HMR) using a multi-query deformable transformer, a Neural Inverse Kinematics (NeurIK) model treating mesh vertices as virtual markers, and a 2D-informed pose refinement step
- Aims to recover anatomically constrained joint locations without marker-based motion capture
- Context
- Responds to the anatomical over-simplification of parametric models like SMPL and shares the biomechanical-accuracy goal of From Skin to Skeleton, but targets estimation from a single camera rather than capture rigs. Builds on: From Skin to Skeleton: Towards Biomechanically Accurate 3D Digital Humans
- Correctness
- The premise is that mesh vertices can act as virtual markers feeding a neural IK under anatomical constraints to approximate marker-based accuracy from monocular video; readers should be cautious about how close to true marker-based ground truth this gets and on which datasets and motion types it was validated.
- Clarity
- Pipeline is clearly staged (three components), so a first pass conveys the architecture; the deformable-transformer and NeurIK details need a second pass.
- How to read it
- First pass for the three-stage pipeline and the virtual-markers idea; second pass on NeurIK and the anatomical constraints if you care about biomechanical fidelity, and check the evaluation protocol against marker-based references before trusting accuracy claims.
Retargeting / Muscles
-
BlendSim shifts physics-based animation from discrete frame-by-frame solves to continuous interpolation trajectories built on blendshapes, representing each vertex path with continuous parametric Bezier splines that have variable keyframe times.
Abstract
BlendSim shifts physics-based animation from discrete frame-by-frame solves to continuous interpolation trajectories built on blendshapes, representing each vertex path with continuous parametric Bezier splines that have variable keyframe times. Because this mesh animation representation is continuous and fully differentiable, it can be optimized to follow the laws of physics under constraints, while projective dynamics decouples the optimization into local parallelizable steps and a global quadratic step for efficiency. The method supports constraints such as collisions and cyclic motion and is compatible with modern animation workflows and file formats such as glTF.
Related An Implicit Physical Face Model Driven by Expression and Style · Gravity Preloading for Maintaining Hair Shape Using the Simulator as a Closed-Box Function · Clean Cloth Inputs: Removing Character Self-Intersections with Volume Simulation · Hyper-Reduced Projective Dynamics
How to read this
- Category
- Method: physics-based animation
- Contributions
-
- BlendSim, shifting physics-based animation from discrete frame-by-frame solves to continuous interpolation trajectories built on blendshapes
- Represents each vertex path with continuous, differentiable parametric Bezier splines with variable keyframe times, optimizable to follow physics under constraints
- Uses projective dynamics to decouple the solve into parallelizable local steps and a global quadratic step, supporting collisions and cyclic motion and exporting to glTF
- Context
- Built on Projective Dynamics, recasting that fast simulation framework in a continuous, blendshape-and-spline trajectory representation rather than per-frame state. Builds on: Projective Dynamics: Fusing Constraint Projections for Fast Simulation
- Correctness
- The key idea is that a continuous, differentiable spline-on-blendshapes representation can be optimized to satisfy physical laws and constraints; readers should keep in mind that continuous trajectory parameterization trades temporal resolution and keyframe placement against the range of dynamics it can faithfully represent, and check which constraint types and motions were demonstrated.
- Clarity
- Conceptually crisp framing (continuous vs discrete), but the spacetime projective-dynamics formulation warrants a careful second pass.
- How to read it
- First pass for the continuous-trajectory reframing and why differentiability matters; second pass on the spacetime projective-dynamics local/global decomposition if you intend to implement, and note the glTF and animation-workflow compatibility as a practical hook.
Facial / CFX
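The trajectory representation summarized in this entry, a per-vertex Bezier spline with variable keyframe times, can be illustrated with a small sketch. It assumes cubic segments and a linear map from time to the curve parameter within each keyframe interval; the paper's exact parameterization may differ, and all names here are hypothetical.

```python
from bisect import bisect_right


def bezier(p0, p1, p2, p3, u):
    """Evaluate a cubic Bezier curve at u in [0, 1] (Bernstein form)."""
    w = ((1 - u) ** 3, 3 * u * (1 - u) ** 2, 3 * u ** 2 * (1 - u), u ** 3)
    return tuple(
        w[0] * a + w[1] * b + w[2] * c + w[3] * d
        for a, b, c, d in zip(p0, p1, p2, p3)
    )


def evaluate(key_times, segments, t):
    """Position of one vertex at time t.

    key_times: increasing keyframe times; spacing need not be uniform.
    segments:  one (p0, p1, p2, p3) control-point tuple per keyframe interval.
    """
    # Clamp to the animated range, then find the interval containing t.
    t = min(max(t, key_times[0]), key_times[-1])
    i = min(bisect_right(key_times, t) - 1, len(segments) - 1)
    # Assumption: time maps linearly to the curve parameter in each interval.
    u = (t - key_times[i]) / (key_times[i + 1] - key_times[i])
    return bezier(*segments[i], u)


# Two segments with unequal durations (0.25 and 0.75 time units).
key_times = [0.0, 0.25, 1.0]
segments = [
    ((0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0), (1.0, 0.0, 0.0)),
    ((1.0, 0.0, 0.0), (1.0, -1.0, 0.0), (2.0, -1.0, 0.0), (2.0, 0.0, 0.0)),
]
print(evaluate(key_times, segments, 0.125))
```

Because the position is a smooth function of the control points and keyframe times, both can be treated as optimization variables, which is the property the entry highlights.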
- talk CFX: Muscle and Soft Tissue with Otis | Houdini 21 | Kai Stavginski | SIGGRAPH HIVE 2025 Houdini SideFX
Introduces the Otis GPU-accelerated dynamics solver in Houdini 21, delivering FEM-quality muscle, fascia, and fat simulation at interactive speeds for character FX pipelines.
Muscles / CFX / Skinning
-
Artist-focused guide to Chaos Cloth simulation in Unreal Engine, building a production-ready pipeline for both real-time and cinematic cloth sequences.
CFX / Skinning
-
Technical presentation on integrating Chaos Cloth into AAA game production pipelines, covering simulation fidelity, performance budgets, and artist-facing tooling.
CFX / Skinning
-
Multi-tiered strategy for choreographing Moana 2 hair and cloth motion using performance categorization, visual planning, and iterative refinement.
Abstract
In Walt Disney Animation Studios’ "Moana 2", the main characters undergo significant evolution from the original film, displaying a broad range of emotions while interacting with diverse environmental conditions. Crafting the hair and cloth motion to consistently support and enhance the complex character performances presented challenges for the character team, particularly when combined with nuanced art direction. To this end, we developed a multi-tiered and strategic approach for the choreography of the hair and cloth which includes three main areas of emphasis: Performance Categorization, Continuity through Visual Planning, and Iterative Refinement. Integral to this process were extensive and detailed drawovers, to ensure that hairstyles and costumes reacted consistently. This strategy allowed us to frontload the work through careful planning, enabling technical animation to execute on art direction of shots efficiently. We illustrate the overall approach with specifics for hair with Moana and Maui, and with Matangi for cloth.
Related Scriptable Character FX Solution · Simulating Wind Effects on Cloth and Hair in Disney's Frozen · Art-Directing Asha's Braids in Disney's Wish · Anisotropic Elastoplasticity for Cloth, Knit and Hair Frictional Contact
How to read this
- Category
- Production paper / pipeline methodology
- Contributions
-
- A multi-tiered strategy for choreographing hair and cloth motion to support complex character performances in Disney's Moana 2
- Organizes the work around Performance Categorization, Continuity through Visual Planning, and Iterative Refinement, using detailed drawovers to keep hairstyles and costumes reacting consistently
- Frontloads planning so technical animation can execute art direction efficiently, illustrated via Moana and Maui hair and Matangi cloth
- Context
- A Disney production paper extending the studio's art-directed hair lineage (for example Hierarchical Controls for Art-Directed Hair) to coordinated hair-and-cloth choreography on a feature sequel. Builds on: Hierarchical Controls for Art-Directed Hair at Disney
- Correctness
- Studio practice, not peer-reviewed; results are production-proven on a shipped film, and the methodology is a process framework whose effectiveness is shown through specific shots rather than controlled comparison.
- Clarity
- Accessible and process-oriented; one read conveys the three-area strategy without heavy formalism.
- How to read it
- Read once for the planning framework (categorization, visual planning via drawovers, iterative refinement) and the worked examples; useful as a process reference rather than a technique to reimplement, so a single pass usually suffices.
CFX
-
Foundry introduces the UsdSuperLayer node in Katana, providing a procedural on-demand approach that prevents full stage recomposition when editing large character and shading graphs.
Abstract
Katana’s [Foundry 2025a] strength is its procedural node graph architecture, which provides dynamic and scalable workflows; studios are no strangers to scenes containing tens of thousands of nodes. The layer-based nature of Universal Scene Description (USD) [Pixar Animation Studios 2025] presents a challenge for the node-based procedural approach. At large scales, the current procedural method of creating a USD layer per node is inadvisable because modifying an early node could trigger a full USD Stage recomposition and reprocessing of USD layer-generating functions. This extended abstract presents an insight into the journey towards our solution: the UsdSuperLayer node type, which provides a “procedural on-demand” approach that enables efficient management of thousands of primitives and extensive shading graphs represented as nodes.
Related Universal Scene Description: Open Source Release · A Deep Dive into Universal Scene Description and Hydra · Forging a New Animation Pipeline with USD · Using USD in Pixar's Digital Backlot
How to read this
- Category
- Method / pipeline tooling (extended abstract)
- Contributions
-
- Introduces the UsdSuperLayer node type in Katana to reconcile USD's layer-based composition with Katana's procedural node graph
- Provides a procedural on-demand approach that avoids full USD Stage recomposition when an early node is modified
- Targets efficient management of thousands of primitives and extensive shading graphs represented as nodes
- Context
- Bridges Pixar's Universal Scene Description with Foundry's Katana node-graph architecture, addressing the cost of the prior layer-per-node procedural method at scale. Builds on: Universal Scene Description: Open Source Release
- Correctness
- Presented as an extended abstract describing a solution journey; the central claim is that on-demand composition prevents costly full-stage recomposition at large node counts, but as a short abstract it gives limited quantitative detail, so readers should look for follow-up material for measured performance.
- Clarity
- Short and accessible if you know USD composition and Katana; one read conveys the problem and the idea, with depth limited by the abstract format.
- How to read it
- Read once for the scaling problem (layer-per-node recomposition) and the UsdSuperLayer concept; if you work with USD-in-Katana pipelines, seek the fuller talk or paper for the implementation and performance numbers the abstract omits.
Rigging
-
Blends 3D target shapes into 2D fabric panels via custom UVs to retarget cloth simulations toward artist-directed draping looks, used in Inside Out 2 and Elio.
Abstract
Cloth draping is a prevalent tailoring process that gives 3D form to sewn 2D panels of fabric. However, when dressing animated characters, artists often prefer to model garments with delineated spatial structures and clean silhouettes at the cost of diminishing the presence of folds and wrinkles produced by draping. To reconcile stylization and realism, this work describes a new approach for directing cloth draping that accommodates 3D shaping and 2D pattern making simultaneously. Our key contribution is a method that generates custom UVs blending the distortion induced by 3D shapes into 2D fabric panels. As a result, we can retarget cloth simulations to compute physically plausible draping deformations that smoothly transition to prescribed 3D forms. To assist the garment design, we also propose a flattening tool that constructs low-resolution UV panels amenable to 2D manipulation. We showcase our results with a series of garment assets and cloth animations from Pixar feature films Inside Out 2 (2024) and Elio (2025).
Related Art-Directed Costumes at Pixar: Design, Tailoring, and Simulation in Production · Mixing Yarns and Triangles in Cloth Simulation · Untangling Cloth · Multi-Resolution Isotropic Strain Limiting
How to read this
- Category
- Method: a cloth-draping / simulation-retargeting technique
- Contributions
-
- A method that generates custom UVs blending the distortion of 3D target shapes into 2D fabric panels, so simulations retarget toward artist-directed draping.
- Physically plausible draping deformations that smoothly transition into prescribed 3D forms, reconciling stylization with realism.
- A flattening tool that builds low-resolution UV panels amenable to 2D manipulation to assist garment design.
- Context
- Sits in Pixar's production cloth pipeline and builds directly on the tailoring work of Waggoner et al. (Revamping the Cloth Tailoring Pipeline at Pixar), extending 2D pattern making toward simultaneous 3D shaping. Builds on: Revamping the Cloth Tailoring Pipeline at Pixar
- Correctness
- Demonstrated on garment assets and cloth animations from Inside Out 2 and Elio rather than a broad benchmark, so it is validated as production-proven craft; a reader should keep in mind it targets directed/stylized looks and its generality beyond Pixar's pipeline is not the claim.
- Clarity
- Accessible at a conceptual level; a first pass conveys the blended-UV idea, and a second pass covers the UV-blending and retargeting formulation.
- How to read it
- Focus on how the custom UVs encode the 3D-into-2D distortion and how retargeting consumes them; a second pass on the flattening tool pays off if you care about the artist-facing workflow.
CFX
- talk Dressed to Quest: Achieving Dynamic, Efficient Characters in 'Kingdom Come: Deliverance II' GDC Industrial
Warhorse Studios presents character systems for KCD2 reducing memory, CPU, and draw calls via material atlasing, ID mapping, UV decals, and layered armor with blood, grime, and damage states.
CFX / Skinning
-
Practical advice on incrementally adopting Motion Matching in existing animation state-machine projects, demonstrating responsive pose selection with minimal data overhead.
Motion Synthesis / Retargeting
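The pose selection at the core of Motion Matching is, in its simplest form, a nearest-neighbour search over per-frame feature vectors. The sketch below assumes a brute-force search and a hypothetical three-value feature layout (forward speed, turn rate, foot phase); it is not taken from the talk, and production systems typically add trajectory features and acceleration structures.

```python
def match_pose(database, query, weights):
    """Return the index of the database entry with the lowest weighted
    squared distance to the query feature vector."""

    def cost(features):
        return sum(w * (f - q) ** 2 for w, f, q in zip(weights, features, query))

    return min(range(len(database)), key=lambda i: cost(database[i]))


# Hypothetical features per animation frame: (forward speed, turn rate, foot phase).
database = [
    (0.0, 0.0, 0.0),  # idle
    (1.5, 0.0, 0.2),  # walk
    (3.0, 0.5, 0.6),  # run while turning
]
query = (1.4, 0.1, 0.2)  # what the character currently wants to do
best = match_pose(database, query, weights=(1.0, 1.0, 1.0))
print(best)
```

The weights control which features dominate the match, which is usually where tuning effort goes when the search is bolted onto an existing state machine.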
- talk Exploring Facial Rigging Techniques in KineFX/APEX | Houdini 21 | Carlos Valcarcel | SIGGRAPH HIVE 2025 Houdini SideFX
Project Outback phase two: contrasts realistic versus stylized facial rig construction in KineFX, demonstrating multiple facial rigging techniques and their APEX implementation.
Facial / Rigging
-
Quick-start guide showing how to use MotionMaker in Maya 2026.1 to transform simple motion paths into smooth, lifelike character locomotion cycles for bipeds and quadrupeds.
Abstract
This quick-start tutorial walks through MotionMaker in Maya 2026.1, showing how to generate character locomotion from a drawn path instead of extensive keyframing or mocap. The presenter imports a default biped, keys a path locator across the scene at various frames, and clicks Generate Motion to calculate and cache the animation. It then demonstrates editing the path to route around obstacles using the AutoSpeedRamp function, walking up a hill by raising final frames, and adding a jump via right-click action tags on the movement bar. The final section retargets the generated motion onto a HumanIK MayaBot character through the Character Control tab using the MoMo Standard Biped as the source.
Related Meet MotionMaker: New AI Animation Tool In Maya · MotionBuilder: Essentials Characterization, Retargeting and Baking Animations · Motion Builder Characterization and Retargeting Tutorial · Physically Based Motion Transformation
How to read this
- Category
- Production talk / tool quick-start (MotionMaker in Maya 2026.1)
- Contributions
-
- Demonstrates generating biped and quadruped locomotion from a drawn path locator instead of extensive keyframing or mocap, via Generate Motion and caching.
- Shows editing the path to route around obstacles using AutoSpeedRamp, walking up a hill by raising final frames, and adding a jump through right-click action tags.
- Retargets the generated motion onto a HumanIK MayaBot using the MoMo Standard Biped as source via the Character Control tab.
- Context
- A hands-on companion to the MotionMaker introduction (Meet MotionMaker), grounding Maya's AI locomotion feature in concrete UI steps. Builds on: Meet MotionMaker: New AI Animation Tool In Maya
- Correctness
- Vendor tutorial, not peer-reviewed; the workflow is product-demonstrated on default rigs, so a reader should treat the smooth results as curated demos and expect to validate behavior on their own characters and edge cases.
- Clarity
- Very accessible; a single watch-through conveys the workflow since it is step-by-step UI guidance.
- How to read it
- Follow it as a click-along while in Maya; focus on the path keying, action tags and the retargeting tab, and revisit only the retargeting section if applying to custom rigs.
Motion Synthesis
-
Anny is a scan-free, open-source parametric body model driven by anthropometric phenotype parameters (gender, age, height, weight) and calibrated from WHO population statistics.
Abstract
Parametric body models provide the structural basis for many human-centric tasks, yet existing models often rely on costly 3D scans and learned shape spaces that are proprietary and demographically narrow. We introduce Anny, a simple, fully differentiable, and scan-free human body model grounded in anthropometric knowledge from the MakeHuman community. Anny defines a continuous, interpretable shape space, where phenotype parameters (e.g. gender, age, height, weight) control blendshapes spanning a wide range of human forms--across ages (from infants to elders), body types, and proportions. Calibrated using WHO population statistics, it provides realistic and demographically grounded human shape variation within a single unified model. Thanks to its openness and semantic control, Anny serves as a versatile foundation for 3D human modeling--supporting millimeter-accurate scan fitting, controlled synthetic data generation, and Human Mesh Recovery (HMR). We further introduce Anny-One, a collection of 800k photorealistic images generated with Anny, showing that despite its simplicity, HMR models trained with Anny can match the performance of those trained with scan-based body models. The Anny body model and its code are released under the Apache 2.0 license, making Anny an accessible foundation for human-centric 3D modeling.
How to read this
- Category
- Method / open resource: a parametric human body model
- Contributions
-
- Anny, a fully differentiable, scan-free parametric body model grounded in MakeHuman anthropometric knowledge with a continuous, interpretable shape space.
- Phenotype parameters (gender, age, height, weight) controlling blendshapes across ages and body types, calibrated using WHO population statistics.
- Anny-One, a collection of 800k photorealistic images, used to show HMR models trained on Anny can match scan-based body models.
- Context
- Positions itself against scan-based, proprietary, demographically narrow parametric models (the SMPL lineage), substituting anthropometric/MakeHuman priors for learned scan shape spaces.
- Correctness
- Validated on scan fitting, synthetic data generation and Human Mesh Recovery, with the headline claim that it matches scan-based models on HMR; readers should note the shape space is anthropometrically parameterized rather than learned from scans, so its realism is bounded by the WHO/MakeHuman priors.
- Clarity
- Accessible framing with interpretable parameters; a first pass conveys the model, and a second pass covers the blendshape construction and calibration details.
- How to read it
- Read the abstract and shape-space definition first; make a second pass over the calibration and the Anny-One HMR comparison if you plan to use it for training data or fitting.
Rigging / Skinning / Retargeting
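The idea of phenotype parameters controlling blendshapes can be illustrated with a generic linear blendshape evaluation. This sketch assumes each parameter maps directly to a single blendshape weight, which is a simplification for illustration; Anny's actual mapping from phenotype to weights is not described in this entry, and the tiny two-vertex mesh and delta values are made up.

```python
def apply_blendshapes(base, deltas, weights):
    """Return base + sum_k weights[k] * deltas[k], per vertex and coordinate."""
    out = [list(v) for v in base]
    for name, w in weights.items():
        for vi, d in enumerate(deltas[name]):
            for c in range(3):
                out[vi][c] += w * d[c]
    return [tuple(v) for v in out]


# Hypothetical two-vertex "mesh" and per-parameter vertex offsets.
base = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
deltas = {
    "height": [(0.0, 0.0, 0.0), (0.0, 0.5, 0.0)],
    "weight": [(0.25, 0.0, 0.0), (0.25, 0.0, 0.0)],
}
# Assumption: phenotype values in [0, 1] are used directly as weights.
phenotype = {"height": 1.0, "weight": 0.5}
shape = apply_blendshapes(base, deltas, phenotype)
print(shape)
```

Since the output is linear in the weights, the evaluation is differentiable with respect to the parameters, which is what makes this kind of model usable for fitting.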
- HumanRig: Learning Automatic Rigging for Humanoid Character in a Large Scale Dataset CVPR Academic 14 cites
Introduces the first large-scale humanoid rigging dataset of 11,434 T-posed meshes and a transformer-based framework fusing skeleton priors with 3D mesh features for automatic rigging.
Abstract
With the rapid evolution of 3D generation algorithms, the cost of producing 3D humanoid character models has plummeted, yet the field is impeded by the lack of a comprehensive dataset for automatic rigging, a pivotal step in character animation. Addressing this gap, we present HumanRig, the first large-scale dataset specifically designed for 3D humanoid character rigging, encompassing 11,434 meticulously curated T-posed meshes adhered to a uniform skeleton topology. Capitalizing on this dataset, we introduce an innovative, data-driven automatic rigging framework, which overcomes the limitations of GNN-based methods in handling complex AI-generated meshes. Our approach integrates a Prior-Guided Skeleton Estimator (PGSE) module, which uses 2D skeleton joints to provide a preliminary 3D skeleton, and a Mesh-Skeleton Mutual Attention Network (MSMAN) that fuses skeleton features with 3D mesh features extracted by a U-shaped point transformer. This enables a coarse-to-fine 3D skeleton joint regression and a robust skinning estimation, surpassing previous methods in quality and versatility. This work not only remedies the dataset deficiency in rigging research but also propels the animation industry towards more efficient and automated character rigging pipelines.
Related RigNet: Neural Rigging for Articulated Characters · Automatic Rigging and Animation of 3D Characters · Pose and Skeleton-aware Neural IK for Pose and Motion Editing · MoRig: Motion-Aware Rigging of Character Meshes from Point Clouds
How to read this
- Category
- Dataset + method: automatic humanoid rigging
- Contributions
-
- HumanRig, presented as the first large-scale humanoid rigging dataset of 11,434 curated T-posed meshes sharing a uniform skeleton topology.
- A data-driven rigging framework with a Prior-Guided Skeleton Estimator (PGSE) that uses 2D skeleton joints to seed a 3D skeleton.
- A Mesh-Skeleton Mutual Attention Network (MSMAN) fusing skeleton and U-shaped point-transformer mesh features for coarse-to-fine joint regression and skinning.
- Context
- Extends neural auto-rigging beyond GNN-based methods such as RigNet, targeting the complex AI-generated meshes that earlier approaches struggled with. Builds on: RigNet: Neural Rigging for Articulated Characters
- Correctness
- Reported to surpass prior methods in quality and versatility, but the uniform skeleton topology and T-pose assumption define the scope, so a reader should keep in mind generalization to non-standard skeletons, non-T-poses, or non-humanoid shapes is outside the dataset's design.
- Clarity
- Reasonably accessible architecture story; a first pass conveys the two-module pipeline, a second pass is needed for the PGSE/MSMAN details.
- How to read it
- Skim the dataset construction and the PGSE-then-MSMAN pipeline first; make a second pass over the attention fusion and skinning estimation if implementing or comparing against RigNet.
Rigging / ML Deformation
- Interactive Facial Animation: Enhancing Facial Rigs With Real-Time Shell And Contact Simulation SCA 2 cites
Layers real-time thin-shell simulation and contact onto an existing facial rig, with no anatomical priors or extra artist work, adding plausible skin dynamics and self-contact that run interactively on consumer hardware.
Abstract
Augments a facial animation rig with real-time shell simulation and contact handling, without anatomical priors or additional artist intervention. Layered over an existing rig, it adds physically plausible skin dynamics and self-contact at the lips and eyelids, runs interactively on consumer hardware, and generalises across realistic, stylized and fantastical characters.
Related High Fidelity Facial Animation Capture and Retargeting with Contours · Position Based Dynamics · Enriching Facial Blendshape Rigs with Physical Simulation · Fast Contact Determination for Intersecting Deformable Solids
How to read this
- Category
- Method: real-time physics augmentation of facial rigs
- Contributions
-
- Augments an existing facial rig with real-time thin-shell simulation and contact handling, with no anatomical priors or additional artist intervention.
- Adds physically plausible skin dynamics and self-contact at the lips and eyelids, layered over the rig.
- Runs interactively on consumer hardware and generalizes across realistic, stylized and fantastical characters.
- Context
- Builds on physics-based face modeling such as Phace, but trades anatomical priors for a lightweight shell-plus-contact layer on top of any existing rig. Builds on: Phace: Physics-based Face Modeling and Animation
- Correctness
- Demonstrated to add plausible dynamics and self-contact interactively across character styles; the no-anatomy, shell-based approximation is the key assumption, so a reader should keep in mind it adds surface dynamics rather than true volumetric tissue simulation.
- Clarity
- Accessible motivation; a first pass conveys what it adds and where, and a second pass covers the shell and contact formulation.
- How to read it
- Focus on how the shell and contact model couples to an arbitrary rig and the real-time budget; make a second pass over the contact handling if you care about lip and eyelid fidelity.
Facial / CFX
-
Framestore Head of Groom and Creature FX presents a full creature FX pipeline from muscle simulation through tissue and skin to fur coat dynamics in KineFX.
Abstract
Framestore's head of groom and creature FX gives an end-to-end tour of building digital animals in Houdini, framing CFX as the last department before render and distinguishing it from FX and from grooming. She emphasizes preparation and reusable HDA templates that handle scaling, colliders, pre-roll, time remapping and farm publishing, then walks through the standard simulation stack: muscle tension and impact computed via stretching lines and a custom SOP solver, tetrahedron attachment with delta-mesh correction, and solving in Vellum, FEM or third-party solvers. She covers three fat approaches (target-based, attachment-based, and fascia-and-collision), tetrahedron creation with solidembed versus tetconform plus FEM validate for inverted tets, frame-by-frame recreated constraints for skin sliding, FEM-driven wrinkles, and cloth draping with Marvelous Designer. The hair section uses guide-deform to glue guides to characters and back to the groom, custom two-channel wind forces, and FLIP-based water interaction, finishing with procedural shot sculpting and career advice on VEX and Python.
Related Stags & Stripes, Creating Photoreal Characters | Framestore | FMX HIVE Europe 2021 · The Making of CG Animals · Simulating the Perfect Groom for a Bovine Biker | Untold Studios | FMX HIVE 2023 · Creating a Photorealistic Hyena
How to read this
- Category
- Production talk / creature FX pipeline breakdown
- Contributions
-
- End-to-end tour of building digital animals in Houdini, framing CFX as the last department before render and distinguishing it from FX and grooming.
- Walks the simulation stack: muscle tension and impact via stretching lines and a custom SOP solver, tetrahedron attachment with delta-mesh correction, and solving in Vellum, FEM or third-party solvers.
- Covers three fat approaches, tet creation (solidembed vs tetconform plus FEM validate), skin sliding via recreated constraints, FEM wrinkles, Marvelous Designer cloth, and a hair section with guide-deform, custom wind and FLIP water interaction.
- Context
- A practitioner sequel to the speaker's prior photoreal-creature work (Stags & Stripes), translating that craft into a KineFX/Houdini CFX workflow. Builds on: Stags & Stripes, Creating Photoreal Characters | Framestore | FMX HIVE Europe 2021
- Correctness
- Studio practice, not peer-reviewed; the techniques are production-proven at Framestore, so treat the recipes (reusable HDA templates, solver choices) as battle-tested conventions rather than validated, generalizable results.
- Clarity
- Dense but accessible to Houdini practitioners; a single attentive pass conveys the stack, but rewatching individual stages pays off when implementing them.
- How to read it
- Watch for the order of the simulation stack and the reusable-HDA philosophy; revisit the specific stages (muscle, fat, skin sliding, hair) you actually need to build.
CFX
-
Introduces MotionMaker, Maya 2026.1's AI-driven locomotion synthesis tool that generates biped and quadruped motion from a few keyframes or a guide path, trained on dedicated mocap datasets.
Abstract
This talk introduces MotionMaker, a generative locomotion tool built into Maya and powered by Autodesk AI that combines machine learning, motion capture and keyframing to synthesize believable biped and quadruped motion. Animators create a character, set keyframes in space or draw a motion path, then hit generate to produce base motion that can be refined in MotionMaker's dedicated editor with action tags such as jump or sit. The system retargets standard biped or quadruped motion onto custom characters using HIK or custom retargeting solutions, and layers further edits like head turns via animation layers. It also covers path modes for locking motion to timing, speed ramping to move characters faster between points, and motion scale to convey weight for large characters.
Related Generate animation with AI using Motion Maker in Maya 2026.1 · Autodesk MotionBuilder 2022 · MotionBuilder: Essentials Characterization, Retargeting and Baking Animations · Motion Builder Characterization and Retargeting Tutorial
How to read this
- Category
- Production talk / tool introduction (MotionMaker, Maya 2026.1)
- Contributions
-
- Introduces MotionMaker, a generative locomotion tool in Maya powered by Autodesk AI combining machine learning, motion capture and keyframing for biped and quadruped motion.
- Demonstrates generating base motion from keyframes or a drawn path, refining in a dedicated editor with action tags such as jump or sit, and layering edits like head turns via animation layers.
- Covers retargeting onto custom characters via HIK or custom solutions, plus path modes, speed ramping and motion scale to convey weight.
- Context
- Frames a learned in-betweening/locomotion approach in the lineage of work like Robust Motion In-Betweening, packaged as an artist-facing Maya feature. Builds on: Robust Motion In-Betweening
- Correctness
- Vendor introduction, not peer-reviewed; capabilities are product-demonstrated and the underlying mocap training data is described but not evaluated, so a reader should treat motion quality claims as marketing-curated.
- Clarity
- Very accessible overview aimed at animators; one pass conveys the feature set and intent.
- How to read it
- Watch once for what the tool can and cannot do (paths, action tags, retargeting, weight); pair with the quick-start tutorial for the actual workflow.
Motion Synthesis / Retargeting
-
Mesh-free character rig for Elio's implicit-surface liquid supercomputer, giving animators full fidelity control over a rigged shader while preserving renderability.
Abstract
We present a novel character rigging solution developed for OOOOO, a liquid supercomputer in Pixar’s Elio. OOOOO’s design and desired movement necessitated reimagining our conventional way of articulating characters, and she became Pixar’s first mesh-free character rig. We developed a system that allowed our animators full-fidelity control over what is essentially a rigged shader while ensuring downstream renderability [Luo et al. 2025]. The system’s architecture supports a hierarchical arrangement of implicit surface primitives and operators, allowing for complex transformations while preserving normal animation paradigms and offering unprecedented flexibility in character animation.
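The "hierarchical arrangement of implicit surface primitives and operators" can be pictured with a small signed-distance-field sketch. This is only an illustration of the general idea, not Pixar's system: the function names (`sphere`, `translate`, `smooth_union`, `build_character`), the blend width `k`, and the single "droplet offset" control are all assumptions made for the example.

```python
import math

def sphere(center, radius):
    """Primitive: signed distance to a sphere (negative inside)."""
    def sdf(p):
        return math.dist(p, center) - radius
    return sdf

def translate(child, offset):
    """Operator: move a child field by sampling it at the inverse-translated point."""
    def sdf(p):
        return child(tuple(pi - oi for pi, oi in zip(p, offset)))
    return sdf

def smooth_union(a, b, k):
    """Operator: blend two fields with a polynomial smooth minimum of width k."""
    def sdf(p):
        da, db = a(p), b(p)
        h = max(k - abs(da - db), 0.0) / k
        return min(da, db) - h * h * k * 0.25
    return sdf

def build_character(droplet_offset):
    """Hypothetical two-node hierarchy: a body blended with a movable droplet."""
    body = sphere((0.0, 0.0, 0.0), 1.0)
    droplet = translate(sphere((0.0, 0.0, 0.0), 0.4), droplet_offset)
    return smooth_union(body, droplet, k=0.3)

# The "animation control" here is just the droplet offset.
field = build_character((1.2, 0.0, 0.0))
print(field((0.0, 0.0, 0.0)))  # negative: inside the body
print(field((3.0, 0.0, 0.0)))  # positive: outside the surface
```

Because the surface is only ever sampled as a function, posing the character means changing operator parameters rather than moving vertices, which is one way to read the "rigged shader" description.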
Related Eyes Without a Face: Integrating Detached Facial Features into Pixar's Character Pipeline · Shaping the Elements: Curvenet Animation Controls in Pixar's Elemental · Making Souls: Methods and a Pipeline for Volumetric Characters · Mobilizing Mocap, Motion Blending, and Mayhem: Rig Interoperability for Crowd Simulation on Incredibles 2
How to read this
- Category
- Method / production case study: rigging an implicit-surface character
- Contributions
-
- A mesh-free character rigging solution for OOOOO, a liquid supercomputer in Pixar's Elio, described as Pixar's first mesh-free rig.
- A system giving animators full-fidelity control over what is essentially a rigged shader while preserving downstream renderability.
- An architecture supporting a hierarchical arrangement of implicit-surface primitives and operators, enabling complex transformations within normal animation paradigms.
- Context
- Draws on implicit-surface skin deformation work such as Implicit Skinning (Vaillant et al.), repurposing implicit primitives from skinning into a full animatable rig. Builds on: Implicit Skinning: Real-Time Skin Deformation with Contact Modeling
- Correctness
- Demonstrated on a single bespoke Elio character rather than a general system, so it is production-proven for that case; a reader should keep in mind the approach is tailored to a liquid/implicit-surface design and its reuse on other characters is not the claim.
- Clarity
- Accessible as a case study; a first pass conveys the rigged-shader concept, a second pass clarifies the primitive/operator hierarchy.
- How to read it
- Focus on how implicit primitives and operators map to familiar animation controls while staying renderable; a second pass pays off mainly if you build implicit-surface rigs yourself.
Rigging
-
Meta's MHR combines a decoupled skeleton/shape paradigm with a modern rig and pose-corrective system, the culmination of nine years of anatomically inspired body model work.
Abstract
We present MHR, a parametric human body model that combines the decoupled skeleton/shape paradigm of ATLAS with a flexible, modern rig and pose corrective system inspired by the Momentum library. Our model enables expressive, anatomically plausible human animation, supporting non-linear pose correctives, and is designed for robust integration in AR/VR and graphics pipelines.
How to read this
- Category
- Method: a parametric human body model and rig
- Contributions
-
- MHR (Momentum Human Rig), a parametric body model combining the decoupled skeleton/shape paradigm of ATLAS with a flexible modern rig and pose-corrective system.
- Support for expressive, anatomically plausible animation including non-linear pose correctives, inspired by the Momentum library.
- A design intended for robust integration in AR/VR and graphics pipelines.
- Context
- Continues the parametric body-model lineage from SMPL, decoupling skeleton from shape (after ATLAS) and adding a modern rig with non-linear correctives. Builds on: SMPL: A Skinned Multi-Person Linear Model
- Correctness
- The abstract states design goals (expressive, anatomically plausible, robustly integrable) more than measured results, so a reader should look to the full paper for quantitative validation and treat the anatomical-plausibility framing as a stated aim pending evidence.
- Clarity
- The abstract is terse; a first pass gives the positioning, but the formulation requires the full paper.
- How to read it
- Read for how the skeleton/shape decoupling and pose correctives differ from SMPL; a second pass into the full paper is needed for the rig math and any evaluation.
Rigging / Skinning / Retargeting
-
Neural method for transferring facial deformations across characters, generalizing expression retargeting to unseen identities and rig topologies.
Abstract
This work addresses generating facial blendshapes and reference animations for a new 3D character in production settings where expressions and animations already exist on a predefined template character. The authors propose Neural Facial Deformation Transfer (NFDT), a data-driven method that transfers facial expressions from a template character to new target characters given only the target's neutral shape. They introduce a data generation strategy that automatically builds a large training dataset of paired template and target shapes in the same expression, then train a topology-agnostic decoder-only transformer adapted from the Shape Transformer to perform the transfer in high fidelity. NFDT operates without a rig inversion step, generalizes across varying mesh topologies and to humanoid creatures, and outperforms prior facial expression transfer methods in quantitative evaluations and a user study.
Related Transferring Facial Expressions to Different Face Models · Facial Retargeting with Automatic Range of Motion Alignment · Neural Face Rigging for Animating and Retargeting Facial Meshes in the Wild · Local Anatomically-Constrained Facial Performance Retargeting
How to read this
- Category
- Method: neural facial expression / deformation transfer
- Contributions
-
- Neural Facial Deformation Transfer (NFDT), a data-driven method that transfers facial expressions from a template character to new targets given only the target's neutral shape.
- A data generation strategy that automatically builds a large training set of paired template/target shapes in the same expression.
- A topology-agnostic decoder-only transformer (adapted from the Shape Transformer) that transfers without a rig inversion step and generalizes across mesh topologies and to humanoid creatures.
- Context
- Extends facial performance retargeting beyond the authors' prior local anatomically-constrained retargeting, moving to a topology-agnostic, rig-inversion-free neural transfer. Builds on: Local Anatomically-Constrained Facial Performance Retargeting
- Correctness
- Reported to outperform prior expression-transfer methods in quantitative evaluations and a user study; the key dependency is the automatically generated paired-shape data, so a reader should keep in mind transfer quality is bounded by that synthetic data and by how well targets resemble the training distribution.
- Clarity
- Accessible problem framing; a first pass conveys the transfer setup, and a second pass is needed for the transformer architecture and the data-generation strategy.
- How to read it
- Focus on the neutral-shape-only input, the rig-inversion-free design, and the data-generation strategy; take a second pass over the Shape Transformer adaptation if you plan to implement it.
Facial / Retargeting / ML Deformation
-
Large autoregressive transformer generates skeletons and skinning weights for diverse 3D asset categories in a single unified framework.
Abstract
The rapid evolution of 3D content creation, encompassing both AI-powered methods and traditional workflows, is driving an unprecedented demand for automated rigging solutions that can keep pace with the increasing complexity and diversity of 3D models. We introduce UniRig, a novel, unified framework for automatic skeletal rigging that leverages the power of large autoregressive models and a bone-point cross-attention mechanism to generate both high-quality skeletons and skinning weights. Unlike previous methods that struggle with complex or non-standard topologies, UniRig accurately predicts topologically valid skeleton structures thanks to a new Skeleton Tree Tokenization method that efficiently encodes hierarchical relationships within the skeleton. To train and evaluate UniRig, we present Rig-XL, a new large-scale dataset of over 14,000 rigged 3D models spanning a wide range of categories. UniRig significantly outperforms state-of-the-art academic and commercial methods, achieving a 215% improvement in rigging accuracy and a 194% improvement in motion accuracy on challenging datasets. Our method works seamlessly across diverse object categories, from detailed anime characters to complex organic and inorganic structures, demonstrating its versatility and robustness.
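To make the idea of turning a skeleton hierarchy into a token sequence concrete, here is a minimal sketch. The abstract does not specify the actual Skeleton Tree Tokenization, so this is a generic depth-first scheme chosen for illustration only: the `POP` marker, the joint names, and the `tokenize` / `detokenize` helpers are all assumptions, and real systems would also need to encode joint positions.

```python
def tokenize(tree, root):
    """Depth-first serialization: emit a joint on entry and POP on exit."""
    tokens = []

    def visit(joint):
        tokens.append(joint)
        for child in tree.get(joint, []):
            visit(child)
        tokens.append("POP")

    visit(root)
    return tokens

def detokenize(tokens):
    """Rebuild the parent map; a stack tracks the current ancestor chain."""
    parents, stack = {}, []
    for tok in tokens:
        if tok == "POP":
            stack.pop()
        else:
            parents[tok] = stack[-1] if stack else None
            stack.append(tok)
    return parents

# Hypothetical toy skeleton: joint name -> list of child joints.
skeleton = {"hips": ["spine", "leg_l", "leg_r"], "spine": ["head"]}
tokens = tokenize(skeleton, "hips")
print(tokens)
print(detokenize(tokens))
```

The round trip matters for an autoregressive model: any sequence with balanced entries and `POP` tokens decodes to a valid tree, so hierarchy is carried by token order rather than by a separate adjacency structure.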
Related S3: Neural Shape, Skeleton, and Skinning Fields for 3D Human Modeling · ASMR: Adaptive Skeleton-Mesh Rigging and Skinning via 2D Generative Prior · MoRig: Motion-Aware Rigging of Character Meshes from Point Clouds · NeuroSkinning: Automatic Skin Binding for Production Characters with Deep Graph Networks
How to read this
- Category
- Method: a unified autoregressive auto-rigging framework
- Contributions
-
- UniRig, a single autoregressive-transformer framework generating both skeletons and skinning weights via a bone-point cross-attention mechanism
- A Skeleton Tree Tokenization scheme that encodes hierarchy and yields topologically valid skeletons across diverse, non-standard topologies
- Rig-XL, a large-scale dataset of over 14,000 rigged 3D models spanning many categories
- Context
- Extends learning-based auto-rigging beyond RigNet (Xu et al. 2020), recasting skeleton plus skinning prediction as autoregressive sequence generation in the style of large transformer models. Builds on: RigNet: Neural Rigging for Articulated Characters
- Correctness
- Reported gains in rigging and motion accuracy are measured against academic and commercial baselines on challenging datasets, but results lean on the new Rig-XL training set, so generalization to categories or topologies under-represented there should be checked.
- Clarity
- Likely accessible at the idea level; a first pass conveys the tokenization-plus-cross-attention story, with a second pass needed for the tokenizer and training formulation.
- How to read it
- First pass on the abstract and the Skeleton Tree Tokenization figure to grasp how hierarchy becomes a token sequence; do a second pass on the bone-point cross-attention and Rig-XL composition if you care about reproducibility or coverage.
Rigging / Skinning / ML Deformation
- talk Project Outback: KineFX/APEX Rigging Workflows | Carlos Valcarcel | KineFEST 2025 Houdini SideFX
Explores the rigging framework built for Project Outback inside KineFX and APEX, covering customized rig workflows and future animation techniques under investigation.
Rigging / Skinning
- Pseudo-Collisions: A Method for Preventing Fur-Skin Intersections Without Physical Simulation DigiPro DreamWorks 0 cites
Lightweight pseudo-collision method preventing fur-skin intersection without full simulation, enabling scalable creature grooming at production scale.
Abstract
We introduce Pseudo-Collisions (PC), a numerical and time independent method to reduce the collisions between short fur and the skin of an asset as it deforms during animation. Typically, solving these intersections in a standard workflow would require a lot of time to set up and simulate each strand or their guide curves. With our solution, this can be achieved with a light post-process on top of the applied fur. This is achieved by adjusting the transformation matrix used to place the strands on the animated skin so that its rotation component smoothly lifts the hair where necessary. The algorithm dynamically adjusts the orientation of the animated strands to reduce the intersections using different pieces of information including the direction of the strands, the placement matrices for each strand on the rest and the animated skin meshes, as well as the change in the surface’s curvature due to the deformation. Pseudo-Collisions was successfully used during the production of Mufasa: The Lion King [Disney© 2024] and the set of parameters made available to the users to modulate the results of the correction is also presented.
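The core move described above, rotating a strand's placement so the hair lifts off the skin, can be sketched geometrically. This is a simplified stand-in and not the published algorithm: it ignores curvature change and rest-pose matrices, and the `lift_strand` function, the `min_lift_deg` parameter, and the minimum-elevation rule are assumptions made for the example.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    n = math.sqrt(dot(v, v))
    return tuple(x / n for x in v)

def lift_strand(direction, normal, min_lift_deg):
    """Rotate a strand direction toward the skin normal until it clears
    a minimum elevation angle above the tangent plane (hypothetical rule)."""
    d, n = normalize(direction), normalize(normal)
    elevation = math.asin(max(-1.0, min(1.0, dot(d, n))))
    deficit = math.radians(min_lift_deg) - elevation
    axis = cross(d, n)
    # Already lifted enough, or degenerate (strand parallel to the normal).
    if deficit <= 0.0 or dot(axis, axis) < 1e-12:
        return d
    k = normalize(axis)
    kxd = cross(k, d)
    c, s = math.cos(deficit), math.sin(deficit)
    # Rodrigues' rotation; the k*(k.d) term vanishes since k is perpendicular to d.
    return tuple(d[i] * c + kxd[i] * s for i in range(3))

skin_normal = (0.0, 0.0, 1.0)
flat = lift_strand((1.0, 0.0, 0.0), skin_normal, 30.0)
buried = lift_strand((math.cos(-0.35), 0.0, math.sin(-0.35)), skin_normal, 30.0)
print(flat, buried)
```

Since the correction depends only on the current frame's direction and normal, it has no state carried between frames, which matches the "time independent" framing in the abstract.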
Related Creatures in Houdini | Ahmed Gharraph | FMX 2019 · Framestore Creatures & Houdini | Framestore | Character FX & Crowds Production Talks · Hair and Fur in an Evolving Pipeline · Cloth and Skin Deformation with a Triangle Mesh Based Convolutional Neural Network
How to read this
- Category
- Method: a fur-skin intersection-reduction technique (production)
- Contributions
-
- Pseudo-Collisions, a numerical, time-independent method that reduces short-fur penetration of deforming skin without per-strand or guide-curve simulation
- A correction that adjusts each strand's placement-matrix rotation using strand direction, rest and animated placement matrices, and surface curvature change
- An exposed, user-tunable parameter set, validated on the production of Mufasa: The Lion King
- Context
- Sits alongside production fur-motion systems such as DreamWorks' Skunk, offering a lightweight post-process alternative to simulating fur strands or guides. Builds on: Skunk: DreamWorks Fur Motion System
- Correctness
- Because it is a time-independent geometric correction rather than a physical simulation, it targets short fur and reduces (not provably eliminates) intersections; behavior on long fur or large deformations, and temporal coherence, are the things to watch.
- Clarity
- Practitioner-oriented and accessible; a first pass conveys the trick, a second pass clarifies the matrix-adjustment math.
- How to read it
- Read for the placement-matrix rotation idea and the parameter knobs; one careful pass on the correction formula suffices unless you intend to reimplement it.
CFX
-
Under-the-hood look at MetaHuman Animator's latest real-time facial capture capabilities and the updated Live Link pipeline for on-set performance transfer.
Facial / Retargeting
-
HSMR: first end-to-end image-to-SKEL regressor, jointly recovering the biomechanically accurate skeleton and body mesh from a single photograph.
Abstract
HSMR is the first end-to-end approach for reconstructing 3D humans from a single image by estimating the parameters of the biomechanically accurate SKEL model. A transformer is trained to estimate SKEL parameters from image inputs. Due to the lack of training data, the authors build a pipeline to produce pseudo ground truth model parameters and implement a training procedure that iteratively refines them. HSMR achieves competitive performance on standard benchmarks and significantly outperforms prior methods in settings with extreme 3D poses and viewpoints, while producing more realistic joint rotation estimates by respecting anatomical joint limits. Accepted as CVPR 2025 Oral.
Related BioPose: Biomechanically-accurate 3D Pose Estimation from Monocular Videos · SMPL: A Skinned Multi-Person Linear Model · BOSS: Bones, Organs and Skin Shape Model · 3D Mesh Pose Transfer Based on Skeletal Deformation
How to read this
- Category
- Method: monocular human reconstruction with a biomechanical skeleton
- Contributions
-
- HSMR, the first end-to-end image-to-SKEL regressor recovering a biomechanically accurate skeleton and body mesh from a single image
- A transformer that estimates SKEL parameters, trained via a pipeline that produces pseudo ground-truth and iteratively refines it
- Stronger results on extreme poses and viewpoints with more realistic, anatomically constrained joint rotations
- Context
- Builds directly on the SKEL biomechanical body model (Keller et al. 2023), bringing it into the single-image human mesh recovery setting. Builds on: From Skin to Skeleton: Towards Biomechanically Accurate 3D Digital Humans
- Correctness
- Validated on standard benchmarks (competitive) plus extreme-pose and viewpoint settings (clear gains), but accuracy depends on iteratively refined pseudo ground-truth, so estimates inherit any bias in that bootstrapped supervision.
- Clarity
- An accepted oral, so the narrative is likely clear; a first pass gives the pipeline, a second pass covers the pseudo-GT refinement loop.
- How to read it
- Focus first on how SKEL parameters are regressed and how anatomical joint limits are enforced; second pass on the pseudo-ground-truth generation and refinement procedure if you plan to retrain or trust the joint angles.
Skinning / Muscles / Retargeting
-
Rest-shape optimization that makes the input pose an equilibrium state, giving sag-free simulations of discrete elastic rods under gravity.
Abstract
This paper introduces a rest shape optimization framework that produces sag-free simulations of discrete elastic rods under gravity. Rather than adjusting stiffness or applying compensating forces, the method optimizes per-element rest shape parameters so that the input configuration is an equilibrium state of the simulated rod. The resulting formulation is compatible with existing discrete elastic rod solvers and avoids visually undesirable initial sagging for hair, cables, and similar thin elastic structures.
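The principle of choosing rest shapes so that the input pose is already in equilibrium can be shown on a much simpler system than a discrete elastic rod. The sketch below uses a vertical chain of point masses and linear springs, where the rest lengths have a closed-form solution. It is an analogue under stated assumptions, not the paper's formulation: the function names, the uniform `mass` and `stiffness`, and the pinned top node are all choices made for the example.

```python
def sag_free_rest_lengths(ys, mass, stiffness, gravity=9.81):
    """Rest lengths that make the given hanging pose an equilibrium.

    ys[0] is the pinned top node; ys decreases down the chain.
    Spring i must carry the weight of every free node at or below it.
    """
    n = len(ys) - 1  # number of springs (and of free nodes)
    rest = []
    for i in range(1, n + 1):
        length = ys[i - 1] - ys[i]
        supported_weight = (n - i + 1) * mass * gravity
        rest.append(length - supported_weight / stiffness)
    return rest

def net_forces(ys, rest, mass, stiffness, gravity=9.81):
    """Vertical net force on each free node (positive is up)."""
    n = len(ys) - 1
    tension = [stiffness * ((ys[i - 1] - ys[i]) - rest[i - 1])
               for i in range(1, n + 1)]
    forces = []
    for i in range(1, n + 1):
        pull_up = tension[i - 1]
        pull_down = tension[i] if i < n else 0.0
        forces.append(pull_up - pull_down - mass * gravity)
    return forces

pose = [0.0, -1.0, -2.0, -3.0]          # the artist's input pose
naive = [1.0, 1.0, 1.0]                 # rest lengths copied from the pose
optimized = sag_free_rest_lengths(pose, mass=0.1, stiffness=100.0)

print(net_forces(pose, naive, 0.1, 100.0))      # nonzero: the chain will sag
print(net_forces(pose, optimized, 0.1, 100.0))  # ~0: the pose is at rest
```

Note that stiffness is left untouched: only the rest lengths shrink slightly, which mirrors the abstract's point about not adjusting stiffness or adding compensating forces.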
Related Gravity Preloading for Maintaining Hair Shape Using the Simulator as a Closed-Box Function · Simulation-Ready Hair Capture · Efficient and Stable Approach to Elasticity and Collisions for Hair Animation · Discrete Elastic Rods
How to read this
- Category
- Method: rest-shape optimization for sag-free elastic rods
- Contributions
-
- A rest-shape optimization framework that makes the input configuration an equilibrium state of a discrete elastic rod under gravity
- Per-element rest-shape parameter optimization that avoids tweaking stiffness or adding compensating forces
- A formulation compatible with existing discrete elastic rod solvers, removing initial sag for hair, cables and similar thin structures
- Context
- Targets the sag-free initialization problem for the Discrete Elastic Rods model (Bergou et al. 2008), in the lineage of rest-state/inverse-elasticity preconditioning for production simulation. Builds on: Discrete Elastic Rods
- Correctness
- The equilibrium guarantee is for the modeled rod under gravity; results assume the optimized rest shapes remain valid as the rod subsequently deforms, so check behavior under large dynamic motion and other external forces.
- Clarity
- Focused and formulation-heavy; a first pass conveys the goal, but the optimization derivation needs a second, careful pass.
- How to read it
- Skim to confirm it is a rest-shape (not force or stiffness) approach, then do a second pass on the optimization objective and how it plugs into an existing DER solver if you maintain a hair or cable pipeline.
CFX
-
A triangulation-agnostic surface network deforms neutral facial meshes into industry-standard FACS poses, using 2D supervision on unlabeled meshes to scale beyond scarce professional-rigged data.
Abstract
In this paper, we present RigAnyFace (RAF), a scalable neural auto-rigging framework for facial meshes of diverse topologies, including those with multiple disconnected components. RAF deforms a static neutral facial mesh into industry-standard FACS poses to form an expressive blendshape rig. Deformations are predicted by a triangulation-agnostic surface learning network augmented with our tailored architecture design to condition on FACS parameters and efficiently process disconnected components. For training, we curated a dataset of facial meshes, with a subset meticulously rigged by professional artists to serve as accurate 3D ground truth for deformation supervision. Due to the high cost of manual rigging, this subset is limited in size, constraining the generalization ability of models trained exclusively on it. To address this, we design a 2D supervision strategy for unlabeled neutral meshes without rigs. This strategy increases data diversity and allows for scaled training, thereby enhancing the generalization ability of models trained on this augmented data. Extensive experiments demonstrate that RAF is able to rig meshes of diverse topologies on not only our artist-crafted assets but also in-the-wild samples, outperforming previous works in accuracy and generalizability.
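For readers unfamiliar with what a FACS blendshape rig does once the poses exist, the standard linear evaluation is sketched below. This shows only how predicted poses would be consumed, not how RAF predicts them; the pose names (`jawOpen`, `browRaise`), the two-vertex mesh, and the `evaluate_blendshapes` helper are assumptions made for the example.

```python
def evaluate_blendshapes(neutral, poses, weights):
    """Linear blendshape rig: neutral + sum of weight * (pose - neutral)."""
    result = [list(v) for v in neutral]
    for name, w in weights.items():
        for vi, (pose_v, neutral_v) in enumerate(zip(poses[name], neutral)):
            for axis in range(3):
                result[vi][axis] += w * (pose_v[axis] - neutral_v[axis])
    return [tuple(v) for v in result]

# Hypothetical two-vertex "face" and two FACS-like poses.
neutral = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
poses = {
    "jawOpen":   [(0.0, -1.0, 0.0), (1.0, 0.0, 0.0)],
    "browRaise": [(0.0, 0.0, 0.0), (1.0, 0.5, 0.0)],
}

mixed = evaluate_blendshapes(neutral, poses, {"jawOpen": 0.5, "browRaise": 1.0})
print(mixed)
```

Each pose is stored as a full mesh and applied as a delta from the neutral, so poses that move different vertices combine without interfering.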
Related Creating an Actor-Specific Facial Rig from Performance Capture · Neural Face Rigging for Animating and Retargeting Facial Meshes in the Wild · Dynamic 3D Avatar Creation from Hand-Held Video Input · Dynamic Facial Asset and Rig Generation from a Single Scan
How to read this
- Category
- Method: neural facial auto-rigging scaled with unlabeled data
- Contributions
-
- RigAnyFace, a triangulation-agnostic surface network that deforms a neutral facial mesh into FACS blendshape poses across diverse topologies, including disconnected components
- An architecture conditioned on FACS parameters that efficiently handles meshes with multiple disconnected parts
- A 2D supervision strategy for unlabeled neutral meshes that scales training beyond a small artist-rigged ground-truth subset
- Context
- Follows automatic facial rig/asset generation work such as Dynamic Facial Asset and Rig Generation from a Single Scan (Li et al. 2020), addressing the scarcity of professionally rigged 3D faces. Builds on: Dynamic Facial Asset and Rig Generation from a Single Scan
- Correctness
- Generalization is demonstrated across topologies, but the approach leans on a limited artist-rigged set augmented by 2D supervision, so the 2D signal is the key assumption and its fidelity bounds the realism of out-of-distribution rigs.
- Clarity
- Method-paper density; a first pass conveys the data-scaling motivation, and a second pass is needed for the network and the 2D supervision losses.
- How to read it
- Read first for why 2D supervision is used and how disconnected components are handled; second pass on the surface-learning architecture and supervision losses if you weigh it against artist-built rigs.
Facial / Rigging
-
DreamWorks rig system for stylized rubber-hose limb animation, enabling exaggerated cartoon deformation on production characters.
Abstract
The DreamWorks Curvy-Limb rig allows animators to pose complex curves without sacrificing intuitive usability. The system adds additional flexibility to character limbs (arms and legs) and digits (fingers and toes) while retaining the standard arm, leg, and digit controls with which animators are already familiar. Animators can activate the extra controls as needed on the fly without having to blend into any special "states." The robustness of the Curvy-Limb has proven to be instrumental to the character performances of multiple DreamWorks Animation productions, including Ruby Gillman, Teenage Kraken, Trolls Band Together, and The Wild Robot.
Related Shaping the Elements: Curvenet Animation Controls in Pixar's Elemental · How the Rig Design Impacts the Animation Process · Premo: Powerful Character Rigging, Fast Animation · LibEE: A Multithreaded Dependency Graph for Character Animation
How to read this
- Category
- Production rig system: stylized curvy-limb deformation
- Contributions
-
- The DreamWorks Curvy-Limb rig, letting animators pose limbs and digits as complex curves for exaggerated rubber-hose deformation
- Extra flexibility controls layered on top of standard arm, leg and digit controls, with no special state to blend into
- On-the-fly activation of the curvy controls, proven across multiple DreamWorks productions
- Context
- Builds on stretchable/twistable skeletal deformation ideas such as Jacobson et al. 2011, adapting them to an animator-facing production control scheme. Builds on: Stretchable and Twistable Bones for Skeletal Shape Deformation
- Correctness
- This is studio practice rather than a peer-reviewed method; robustness is argued from shipped films (Ruby Gillman, Teenage Kraken; Trolls Band Together; The Wild Robot), so claims are production-proven rather than benchmarked.
- Clarity
- Accessible and workflow-focused; a single read conveys the control model and animator ergonomics.
- How to read it
- Read for the control design and the seamless activation idea; one pass is enough unless you are building a similar animator-facing limb-curve control.
Rigging
- Rumba Rig: A Procedural Rigging Framework with Direct Graph-Based Control DigiPro Industrial 0 cites
Procedural rigging framework using a direct graph-based control model enabling flexible, maintainable character rig construction.
Abstract
We present Rumba Rig, a new rigging framework that enables riggers to work directly on the final rig graph in a modular and non-linear way, without the need for an auto-rig script or abstraction layer. By reducing complexity, the framework makes rigging more accessible and easier to maintain. We demonstrate through practical examples that Rumba Rig is able to express complex rig constructions and delivers professional-quality rigs.
Related Maya 2020 | Proximity Wrap Deformer · Learning an Inverse Rig Mapping for Character Animation · Stable and Efficient Differential IK · FaceLab: Scalable Facial Performance Capture for Visual Effects
How to read this
- Category
- Production rigging framework: direct graph-based control
- Contributions
-
- Rumba Rig, a framework letting riggers work directly on the final rig graph in a modular, non-linear way
- Removal of the auto-rig script and abstraction layer, reducing complexity and easing maintenance
- Practical examples showing it expresses complex rig constructions at professional quality
- Context
- Relates to dependency-graph character-animation systems such as LibEE (Watt et al. 2012), emphasizing direct, maintainable graph authoring over scripted abstraction layers. Builds on: LibEE: A Multithreaded Dependency Graph for Character Animation
- Correctness
- A production-tooling contribution validated by demonstrated rig constructions rather than quantitative evaluation; benefits in accessibility and maintainability are argued qualitatively, so judge by the examples and your own pipeline fit.
- Clarity
- Workflow-oriented and accessible; a first pass conveys the direct-graph philosophy and trade-offs.
- How to read it
- Read for the design rationale of dropping the abstraction layer and how complex rigs are still expressed; one pass suffices unless you are evaluating it against your own rig-graph tooling.
Rigging
-
Stabilizes cloth and hair collisions in impossible upstream animation configurations without modifying the original animation, integrated into Weta's Loki framework.
Abstract
We present a suite of techniques from our in-house simulation framework, Loki, addressing the pervasive challenge of collision instabilities in character effects, particularly in cases where nonphysical pinching prevents collision resolution. We introduce a proximity-tolerant mode for contact projection that trades collision residual for stability, a compliant kinematic mechanism for on-demand gap expansion, and contact-aware strain limiting to prevent penetrations while enforcing target edge lengths. Additionally, we showcase our tools for collider management, including hierarchical collision exclusion, one-sided collision handling, and paintable collision thickness maps. These techniques collectively demonstrate a robust and intuitive workflow for combining physically based collisions with challenging production animations.
Related Implicit Multibody Penalty-based Distributed Contact · Clean Cloth Inputs: Removing Character Self-Intersections with Volume Simulation · Fast Corotated FEM using Operator Splitting · Hair and Fur in an Evolving Pipeline
how to read this
- Category
- Method: production collision-stability techniques for character FX
- Contributions
-
- A proximity-tolerant contact-projection mode trading collision residual for stability under nonphysical pinching
- A compliant kinematic mechanism for on-demand gap expansion plus contact-aware strain limiting that prevents penetration while enforcing target edge lengths
- Collider-management tools: hierarchical collision exclusion, one-sided collision handling, and paintable thickness maps
- Context
- A suite within Weta's in-house Loki multiphysics framework (2022), addressing collision instability when upstream animation presents impossible, pinching configurations. Builds on: Loki: A Unified Multiphysics Simulation Framework for Production
- Correctness
- These are pragmatic stabilization techniques aimed at robustness on impossible inputs; the proximity-tolerant mode explicitly trades physical accuracy (collision residual) for stability, so expect controlled non-physicality rather than exact contact resolution.
- Clarity
- Reads as a techniques digest from a production framework; accessible per-technique, with a first pass conveying each tool's intent.
- How to read it
- Read technique-by-technique for the contact-projection and strain-limiting ideas; a second pass on the proximity-tolerant projection is worth it if you face pinching artifacts in cloth or hair sims.
CFX
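To make the strain-limiting idea above concrete, here is a minimal sketch of edge-length limiting by position projection. It is not Loki's implementation: the function name, the `max_stretch` parameter, and the choice to model "contact awareness" by simply holding in-contact vertices fixed are all assumptions made for illustration.

```python
import math

def limit_edge_strain(positions, edges, rest_lengths, in_contact,
                      max_stretch=1.05, iterations=20):
    """Clamp each edge to at most max_stretch * its rest length.

    positions: list of [x, y, z]; edges: list of (i, j) index pairs.
    in_contact: per-vertex flags. Flagged vertices are never moved, a
    hypothetical stand-in for keeping corrections from pushing a vertex
    back into a collider.
    """
    pos = [list(p) for p in positions]
    for _ in range(iterations):
        for (i, j), rest in zip(edges, rest_lengths):
            d = [pos[j][k] - pos[i][k] for k in range(3)]
            length = math.sqrt(sum(c * c for c in d))
            limit = max_stretch * rest
            if length <= limit or length == 0.0:
                continue
            wi = 0.0 if in_contact[i] else 1.0
            wj = 0.0 if in_contact[j] else 1.0
            if wi + wj == 0.0:
                continue  # both endpoints held by contact: accept the residual
            excess = (length - limit) / length
            for k in range(3):
                corr = excess * d[k]
                pos[i][k] += corr * wi / (wi + wj)  # move i toward j
                pos[j][k] -= corr * wj / (wi + wj)  # move j toward i
    return pos

# One over-stretched edge whose first vertex is resting on a collider.
limited = limit_edge_strain(
    positions=[[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]],
    edges=[(0, 1)],
    rest_lengths=[1.0],
    in_contact=[True, False],
)
```

In this toy case only the free vertex moves, so the edge ends at its stretch limit without disturbing the vertex in contact. When both endpoints are in contact the edge is left over-stretched, which mirrors the paper's stated preference for stability over exact resolution.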
-
Untold Studios CFX Supervisor and Head of VFX explain Houdini solutions for simulating octopus tentacles, skin, and hair in Disney's 2024 holiday short, covering challenging creature CFX.
abstract
Aman Akram (CTO) and Andrea Lacedelli (CFX Supervisor) of Untold Studios present the Houdini CFX workflows for Disney's short The Boy and the Octopus, covering tentacle, skin, and hair simulation under a tight six-month, non-destructive procedural pipeline. The octopus is built as a non-hollow tetrahedral mesh simulated with tet stretch constraints, driven from the inside by a stripped-down tissue-inspired core using metaballs and magnets to push volume, with procedural groups exposed as artist sliders for stiffness and a sucker-attraction system gluing tentacles to the boy's head. They detail a thin double-sided clear-coat wetness layer created via ZBrush displacement, ray-casting and VDB operations and stabilized with a pinned cloth simulation around the eye socket, plus five long-hair grooms simulated with Vellum hair constraints, glues, and touch constraints, and a Mickey hat solved as a cloth sim with stitch and shape-match constraints. Rendering is done in Arnold with grooms, feathers, and hair deformed at render time in C++ to keep the data footprint low and time-to-first-pixel fast on their cloud-based pipeline.
Related Simulating the Perfect Groom for a Bovine Biker | Untold Studios | FMX HIVE 2023 · Creating a Photorealistic Hyena · Creatures in Houdini | Ahmed Gharraph | FMX 2019 · Feathers: From Model to Groom to Render | nineteentwenty | Character FX & Crowds Production Talks
how to read this
- Category
- Production talk: creature CFX breakdown
- Contributions
-
- Demonstrates Houdini CFX for an octopus: a non-hollow tetrahedral mesh with tet stretch constraints driven by a stripped-down tissue-inspired core using metaballs and magnets, plus artist sliders and a sucker-attraction glue system
- Shows a thin double-sided clear-coat wetness layer from ZBrush displacement, ray-casting and VDB ops, stabilized by a pinned cloth sim around the eye socket
- Covers five Vellum long-hair grooms with glue and touch constraints, a shape-matched cloth Mickey hat, and Arnold rendering with render-time C++ groom deformation for low data footprint
- Context
- Untold Studios' CFX workflow for Disney's short The Boy and the Octopus, applying Houdini Vellum and tetrahedral solvers in a non-destructive procedural pipeline.
- Correctness
- Studio practice, not peer-reviewed; results are production-proven and shaped by a tight six-month, cloud-pipeline schedule, so techniques are pragmatic choices rather than generalized methods.
- Clarity
- Accessible breakdown for Houdini-literate viewers; one viewing conveys the setups and artist-control philosophy.
- How to read it
- Watch for the tet-core-plus-magnets volume-driving approach and the render-time deformation trick; revisit specific segments (tentacle core, wetness layer, hair) if you build comparable creature CFX.
CFX
-
Overview of UE5.6 IK Retargeter enhancements and the roadmap for animation retargeting, covering the journey from IK Rig setup to fully automated skeleton mapping.
Retargeting / Rigging
2024
67-
Represents a neutral head and expression basis shapes as 3D Gaussians learned from monocular video, enabling real-time blendshape-style face animation with high-frequency detail.
abstract
We introduce 3D Gaussian blendshapes for modeling photorealistic head avatars. Taking a monocular video as input, we learn a base head model of neutral expression, along with a group of expression blendshapes, each of which corresponds to a basis expression in classical parametric face models. Both the neutral model and expression blendshapes are represented as 3D Gaussians, which contain a few properties to depict the avatar appearance. The avatar model of an arbitrary expression can be effectively generated by combining the neutral model and expression blendshapes through linear blending of Gaussians with the expression coefficients. High-fidelity head avatar animations can be synthesized in real time using Gaussian splatting. Compared to state-of-the-art methods, our Gaussian blendshape representation better captures high-frequency details exhibited in input video, and achieves superior rendering performance.
Related I M Avatar: Implicit Morphable Head Avatars from Videos · Reconstruction of Personalized 3D Face Rigs from Monocular Video · 3DGS-Avatar: Animatable Avatars via Deformable 3D Gaussian Splatting · Neural Head Avatars from Monocular RGB Videos
how to read this
- Category
- Method: photorealistic head avatar (Gaussian-splatting blendshapes)
- Contributions
-
- Represents a neutral head model and a group of expression blendshapes entirely as 3D Gaussians, learned from a monocular video.
- Generates arbitrary expressions by linearly blending neutral and blendshape Gaussians with expression coefficients, mirroring classical parametric face models.
- Real-time synthesis via Gaussian splatting, with reported gains in high-frequency detail and rendering performance over prior methods.
- Context
- Bridges classical blendshape control (cf. Lewis and Anjyo, 2010, Direct Manipulation Blendshapes) with rigged Gaussian-splatting avatars (cf. GaussianAvatars, Qian et al., 2024). Builds on: Direct Manipulation Blendshapes · GaussianAvatars: Photorealistic Head Avatars with Rigged 3D Gaussians
- Correctness
- Trained and demonstrated from monocular video with comparisons to state-of-the-art; quality depends on input-video coverage of expressions, and linear Gaussian blending is an approximation that may struggle with expressions far outside the learned basis.
- Clarity
- Accessible to readers familiar with blendshapes and 3DGS; a first pass conveys the mapping, a second pass is needed for the Gaussian-property blending details.
- How to read it
- Read for how classical blendshape semantics map onto Gaussians and the real-time blending; second pass on training and per-Gaussian parameters if you intend to build or animate such avatars.
Facial / ML Deformation
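The linear blending described in this entry can be sketched in a few lines. This is an illustrative reading, not the paper's code: it assumes each Gaussian is a flat attribute vector and that blendshapes combine in the classical delta form (neutral plus weighted offsets). The function name `blend_gaussians` is hypothetical.

```python
def blend_gaussians(neutral, blendshapes, coeffs):
    """Blend per-Gaussian attribute vectors with expression coefficients.

    neutral: list of attribute vectors, one per Gaussian.
    blendshapes: list of expression shapes, each laid out like `neutral`.
    coeffs: one expression coefficient per blendshape.
    Assumed delta form: out = neutral + sum_k coeffs[k] * (shape_k - neutral).
    """
    out = []
    for g, base in enumerate(neutral):
        blended = list(base)
        for shape, c in zip(blendshapes, coeffs):
            for a in range(len(base)):
                blended[a] += c * (shape[g][a] - base[a])
        out.append(blended)
    return out

# Two Gaussians with 3 attributes each (say, a position), one "smile" shape.
neutral = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
smile = [[0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
half_smile = blend_gaussians(neutral, [smile], [0.5])
```

A real system would need extra care for attributes that do not live in a linear space, such as renormalizing blended rotation quaternions; that step is omitted here.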
-
Learns non-rigid deformation of 3D Gaussians driven by a skeleton with isometric regularization, achieving real-time 50+ FPS rendering and training in 30 minutes from monocular video.
abstract
We introduce an approach that creates animatable human avatars from monocular videos using 3D Gaussian Splatting (3DGS). Existing methods based on neural radiance fields (NeRFs) achieve high-quality novel-view/novel-pose image synthesis but often require days of training, and are extremely slow at inference time. Recently, the community has explored fast grid structures for efficient training of clothed avatars. Albeit being extremely fast at training, these methods can barely achieve an interactive rendering frame rate with around 15 FPS. In this paper, we use 3D Gaussian Splatting and learn a non-rigid deformation network to reconstruct animatable clothed human avatars that can be trained within 30 minutes and rendered at real-time frame rates (50+ FPS). Given the explicit nature of our representation, we further introduce as-isometric-as-possible regularizations on both the Gaussian mean vectors and the covariance matrices, enhancing the generalization of our model on highly articulated unseen poses. Experimental results show that our method achieves comparable and even better performance compared to state-of-the-art approaches on animatable avatar creation from a monocular input, while being 400x and 250x faster in training and inference, respectively. Please see our project page at https://neuralbodies.github.io/3DGS-Avatar.
Related Animatable Neural Radiance Fields for Modeling Dynamic Human Bodies · PointAvatar: Deformable Point-Based Head Avatars from Videos · Neural Body: Implicit Neural Representations with Structured Latent Codes for Novel View Synthesis of Dynamic Humans · SNARF: Differentiable Forward Skinning for Animating Non-Rigid Neural Implicit Shapes
how to read this
- Category
- Method: animatable full-body avatar (deformable 3D Gaussian Splatting)
- Contributions
-
- Creates animatable clothed human avatars from monocular video using 3D Gaussian Splatting plus a learned non-rigid deformation network driven by a skeleton.
- As-isometric-as-possible regularizations on Gaussian means and covariances to improve generalization to unseen, highly articulated poses.
- Trains within about 30 minutes and renders at real-time rates (50+ FPS), faster than NeRF-based and grid-based predecessors.
- Context
- Positioned against NeRF-based and fast-grid clothed-avatar methods, and builds on forward-skinning ideas for neural shapes (cf. SNARF, Chen et al., 2021). Builds on: SNARF: Differentiable Forward Skinning for Animating Non-Rigid Neural Implicit Shapes
- Correctness
- Reconstructed from monocular video, with results reported as comparable to or better than state-of-the-art; the isometric regularizers are heuristics to aid pose generalization, and reconstruction quality is bounded by the single-view input and observed pose range.
- Clarity
- Accessible to readers tracking 3DGS avatars; a first pass conveys the deform-network-plus-3DGS design and the speed argument, a second pass for the regularization terms.
- How to read it
- Read the pipeline and the training/inference speed comparison first; do a second pass on the as-isometric-as-possible regularizers if pose generalization is your concern.
Skinning / ML Deformation
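As a rough illustration of an as-isometric-as-possible regularizer on Gaussian means, the sketch below penalizes changes in distance between each point and its nearest neighbors when going from the canonical pose to a deformed one. The exact loss, neighbor count, and weighting in the paper may differ; the function names and the brute-force neighbor search are assumptions for this example, and the covariance term is not covered.

```python
import math

def nearest_neighbors(points, k):
    """For each point, return indices of its k nearest other points (brute force)."""
    result = []
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: math.dist(p, points[j]))
        result.append(others[:k])
    return result

def isometric_loss(canonical, deformed, k=2):
    """Mean absolute change in neighbor distances between the two poses.

    Neighbors are chosen in the canonical pose, so the penalty is zero for
    any rigid motion and grows as local distances stretch or shrink.
    """
    neighbors = nearest_neighbors(canonical, k)
    total, count = 0.0, 0
    for i, nbrs in enumerate(neighbors):
        for j in nbrs:
            d_canon = math.dist(canonical[i], canonical[j])
            d_deform = math.dist(deformed[i], deformed[j])
            total += abs(d_canon - d_deform)
            count += 1
    return total / count if count else 0.0

canonical = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
translated = [(x + 1.0, y + 2.0, z + 3.0) for x, y, z in canonical]
stretched = [(2.0 * x, 2.0 * y, 2.0 * z) for x, y, z in canonical]
loss_rigid = isometric_loss(canonical, translated)
loss_stretch = isometric_loss(canonical, stretched)
```

The rigid translation leaves every neighbor distance unchanged, while the uniform stretch does not, which is the behavior a regularizer of this kind relies on to discourage implausible deformation in unseen poses.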
-
Anatomy-based pipeline translates mocap markers to FACS muscle activations with explicit passive-muscle modeling for diverse character faces.
abstract
3D facial motion retargeting has the advantage of capturing and recreating the nuances of human facial motions and speeding up the time-consuming 3D facial animation process. However, the facial motion retargeting pipeline is limited in reflecting the facial motion's semantic information (i.e., meaning and intensity), especially when applied to nonhuman characters. The retargeting quality heavily relies on the target face rig, which requires time-consuming preparation such as 3D scanning of human faces and modeling of blendshapes. In this paper, we propose a facial motion retargeting pipeline aiming to provide fast and semantically accurate retargeting results for diverse characters. The new framework comprises a target face parameterization module based on face anatomy and a compatible source motion interpretation module. From the quantitative and qualitative evaluations, we found that the proposed retargeting pipeline can naturally recreate the expressions performed by a motion capture subject in equivalent meanings and intensities; such semantic accuracy extends to the faces of nonhuman characters without labor-demanding preparations.
Related Facial Retargeting with Automatic Range of Motion Alignment · Semi-Supervised Video-Driven Facial Animation Transfer for Production · Local Anatomically-Constrained Facial Performance Retargeting · Performance-Driven Facial Animation
how to read this
- Category
- Method: facial motion retargeting pipeline (anatomy-based)
- Contributions
-
- An anatomy-based target-face parameterization module plus a compatible source-motion interpretation module that maps mocap markers to FACS-style muscle activations.
- Explicit modeling of passive muscle behavior to preserve semantic meaning and intensity of expressions.
- Aims for fast, semantically accurate retargeting to diverse and nonhuman characters without labor-heavy rig preparation (e.g. 3D scanning, blendshape modeling).
- Context
- Extends artist-friendly facial retargeting (cf. Seol et al., 2011) by grounding the target representation in face anatomy rather than per-character blendshapes. Builds on: Artist Friendly Facial Animation Retargeting
- Correctness
- Reported via quantitative and qualitative evaluation showing preserved meaning and intensity, including on nonhuman faces; 'semantic accuracy' rests on the anatomy/FACS parameterization being a faithful proxy, and results are evaluation-set dependent rather than a guarantee for arbitrary rigs.
- Clarity
- Accessible if you know FACS and retargeting; a first pass conveys the two-module structure, a second pass for the passive-muscle modeling and parameterization.
- How to read it
- Focus on the source-to-target mapping and how semantics (meaning, intensity) are preserved; second pass on the anatomy parameterization if you target nonhuman or appearance-agnostic rigs.
Facial / Retargeting
-
Grooming and moving feathers at DreamWorks had been a technically complicated and specialized process, and the feathered characters of Kung Fu Panda 4 and The Wild Robot prompted a re-imagining of the toolset to be more approachable.
abstract
Grooming and moving feathers at DreamWorks had been a technically complicated and specialized process, and the feathered characters of Kung Fu Panda 4 and The Wild Robot prompted a re-imagining of the toolset to be more approachable. The modernized system adds a card layout toolset, a new curves procedural, real-time viewport visualization, shot-specific feather controls, crowd LOD features and an automated pipeline, used on hundreds of shots across both films.
Related Hummingbird: DreamWorks Feather System · Mesh-Driven Generation and Animation of Groomed Feathers · Apteryx: Procedural Generation, Sculpting and Grooming of Feathers · Rendertime Procedural Feathers Through Blended Guide Meshes
how to read this
- Category
- Production paper / pipeline modernization (feather system)
- Contributions
-
- A modernized, more approachable DreamWorks feather toolset including a card layout toolset and a new curves procedural.
- Real-time viewport visualization, shot-specific feather controls and crowd LOD features.
- An automated pipeline deployed across hundreds of shots on Kung Fu Panda 4 and The Wild Robot.
- Context
- A re-imagining of DreamWorks' prior feather system (cf. Augello et al., 2019, Hummingbird) driven by the demands of new feathered characters. Builds on: Hummingbird: DreamWorks Feather System
- Correctness
- Studio practice, not peer-reviewed; the system is production-proven across two films at scale, but design choices are tuned to DreamWorks' pipeline and artist needs rather than offering a generalizable algorithm.
- Clarity
- Accessible and goal-oriented (approachability, scale); a single read conveys the toolset and motivations, no formal derivations to revisit.
- How to read it
- Read for the usability and scale motivations and which features address grooming vs shot-control vs crowds; useful as a reference for feather/groom tooling design rather than for reproducible methods.
CFX
- A Neural Network Model for Efficient Musculoskeletal-Driven Skin Deformation SIGGRAPH Academic 5 cites
Neural network approximates full musculoskeletal FEM skin deformation at interactive rates, trained on biomechanically accurate simulation data.
abstract
We present a comprehensive neural network to model the deformation of human soft tissues including muscle, tendon, fat and skin. Our approach provides kinematic and active correctives to linear blend skinning [Magnenat-Thalmann et al. 1989] that enhance the realism of soft tissue deformation at modest computational cost. Our network accounts for deformations induced by changes in the underlying skeletal joint state as well as the active contractile state of relevant muscles. Training is done to approximate quasistatic equilibria produced from physics-based simulation of hyperelastic soft tissues in close contact. We use a layered approach to equilibrium data generation where deformation of muscle is computed first, followed by an inner skin/fascia layer, and lastly a fat layer between the fascia and outer skin. We show that a simple network model which decouples the dependence on skeletal kinematics and muscle activation state can produce compelling behaviors with modest training data burden. Active contraction of muscles is estimated using inverse dynamics where muscle moment arms are accurately predicted using the neural network to model kinematic musculotendon geometry.
Related NeuroSkinning: Automatic Skin Binding for Production Characters with Deep Graph Networks · How to Build a Human: Practical Physics-Based Character Animation · Data-Driven Physics for Human Soft Tissue Animation · NiLBS: Neural Inverse Linear Blend Skinning
how to read this
- Category
- Method: neural musculoskeletal skin deformation
- Contributions
-
- A neural network that models deformation of muscle, tendon, fat and skin as kinematic and active correctives to linear blend skinning.
- A layered data-generation scheme (muscle, then inner skin/fascia, then fat) producing quasistatic equilibria from physics-based hyperelastic simulation for training.
- A network that decouples skeletal-kinematics from muscle-activation dependence, plus inverse-dynamics muscle-activation estimation using NN-predicted musculotendon moment arms, at interactive cost.
- Context
- Builds on muscle-actuated human simulation (cf. Lee et al., 2019, Scalable Muscle-Actuated Human Simulation and Control) and on linear blend skinning (Magnenat-Thalmann et al., 1989) as the base it corrects. Builds on: Scalable Muscle-Actuated Human Simulation and Control
- Correctness
- Trained to approximate quasistatic equilibria from physics-based simulation, so accuracy is bounded by the training data and the quasistatic assumption (no dynamics); the decoupled-network design is reported to work with modest data but is an approximation of the full FEM solve.
- Clarity
- Technical; a first pass conveys the corrective-to-LBS idea and the layered pipeline, a second/third pass is needed for the inverse-dynamics and moment-arm modeling.
- How to read it
- Read for the layered training-data strategy and the LBS-corrective framing first; do a second and likely third pass on the inverse-dynamics muscle-activation estimation if you want to reproduce the biomechanics.
Muscles / ML Deformation / Skinning
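The "corrective on top of linear blend skinning" framing in this entry can be sketched as follows, in 2D for brevity. The additive split into a kinematic term and an activation term is an assumed reading of "decoupled", and the two corrective functions are hypothetical toy stand-ins for the trained networks, not anything from the paper.

```python
import math

def lbs_2d(vertex, weights, bones):
    """Linear blend skinning: weighted sum of per-bone rigid transforms.

    bones: list of (angle_radians, tx, ty) rigid transforms.
    """
    x, y = vertex
    out_x = out_y = 0.0
    for w, (angle, tx, ty) in zip(weights, bones):
        c, s = math.cos(angle), math.sin(angle)
        out_x += w * (c * x - s * y + tx)
        out_y += w * (s * x + c * y + ty)
    return (out_x, out_y)

def deform(vertex, weights, bones, activation,
           kinematic_corrective, active_corrective):
    """LBS result plus two independent correctives (assumed additive)."""
    bx, by = lbs_2d(vertex, weights, bones)
    kx, ky = kinematic_corrective(bones)    # depends on joint state only
    ax, ay = active_corrective(activation)  # depends on muscle activation only
    return (bx + kx + ax, by + ky + ay)

# Toy stand-ins for the learned correctives (hypothetical).
def toy_kinematic(bones):
    return (0.0, 0.01 * abs(bones[1][0]))   # small bulge as the joint bends

def toy_active(activation):
    return (0.0, 0.05 * activation)         # extra bulge when the muscle fires

bones = [(0.0, 0.0, 0.0), (math.pi / 2, 0.0, 0.0)]
relaxed = deform((1.0, 0.0), (0.5, 0.5), bones, 0.0, toy_kinematic, toy_active)
flexed = deform((1.0, 0.0), (0.5, 0.5), bones, 1.0, toy_kinematic, toy_active)
```

Keeping the two correctives as separate functions of separate inputs is what lets the same pose produce different skin shapes for relaxed and flexed muscles, which plain LBS cannot express.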
-
This work proposes a practical surface-based appearance model for pennaceous feathers, representing far-field appearance with a BSDF that implicitly captures light scattering from the main biological structures of the shaft, barbs and barbules.
abstract
This work proposes a practical surface-based appearance model for pennaceous feathers, representing far-field appearance with a BSDF that implicitly captures light scattering from the main biological structures of the shaft, barbs and barbules. It introduces a flexible BSDF for non-iridescent feathers, an elliptical BCSDF accounting for elliptical fiber cross-sections and a non-centered scattering medulla, and an analytic masking term. The model can be applied to any surface and avoids explicit geometric modeling of the internal microstructures required by prior approaches.
Related Rendering Iridescent Rock Dove Neck Feathers · Microstructure-based Appearance Rendering for Feathers · Modeling and Rendering of Realistic Feathers · Biological Modeling of Feathers by Morphogenesis Simulation
how to read this
- Category
- Method: a surface-based appearance model (BSDF) for rendering feathers
- Contributions
-
- A practical far-field BSDF for non-iridescent pennaceous feathers that implicitly captures scattering from shaft, barbs and barbules
- An elliptical BCSDF accounting for elliptical fiber cross-sections and a non-centered scattering medulla, plus an analytic masking term
- A formulation applicable to any surface, avoiding explicit geometric modeling of internal microstructures
- Context
- Sits in the line of fiber and feather appearance modeling, building on earlier explicit-geometry feather work such as Chen et al.'s 'Modeling and Rendering of Realistic Feathers'. Builds on: Modeling and Rendering of Realistic Feathers
- Correctness
- The model trades explicit microstructure geometry for an implicit surface BSDF, so it targets far-field appearance; a reader should keep in mind it focuses on non-iridescent feathers and inherits the usual assumptions of BSDF/BCSDF scattering rather than full wave-optics simulation.
- Clarity
- Reasonably accessible at a high level; a first pass conveys the surface-based idea, while a second pass is needed for the BCSDF and masking-term formulation.
- How to read it
- Focus on why a surface BSDF replaces explicit barb/barbule geometry and what the elliptical cross-section and medulla terms model; do a second pass on the scattering math if you intend to implement it.
CFX
-
Implicit face model constrained by anatomical priors, improving 3D face fitting generalization across diverse identities and expressions.
abstract
Coordinate based implicit neural representations have gained rapid popularity in recent years as they have been successfully used in image, geometry and scene modeling tasks. In this work, we present a novel use case for such implicit representations in the context of learning anatomically constrained face models. Actor specific anatomically constrained face models are the state of the art in both facial performance capture and performance retargeting. Despite their practical success, these anatomical models are slow to evaluate and often require extensive data capture to be built. We propose the anatomical implicit face model; an ensemble of implicit neural networks that jointly learn to model the facial anatomy and the skin surface with high-fidelity, and can readily be used as a drop in replacement to conventional blendshape models. Given an arbitrary set of skin surface meshes of an actor and only a neutral shape with estimated skull and jaw bones, our method can recover a dense anatomical substructure which constrains every point on the facial surface. We demonstrate the usefulness of our approach in several tasks ranging from shape fitting, shape editing, and performance retargeting.
Related Animatomy: An Animator-Centric, Anatomically Inspired System for 3D Facial Modeling, Animation and Transfer · Learning an Animatable Detailed 3D Face Model from In-The-Wild Images · Neural Head Avatars from Monocular RGB Videos · PointAvatar: Deformable Point-Based Head Avatars from Videos
how to read this
- Category
- Method: an anatomically constrained implicit neural face model
- Contributions
-
- An ensemble of implicit neural networks that jointly model facial anatomy and skin surface as a drop-in replacement for blendshape models
- Recovery of a dense anatomical substructure constraining every surface point from skin meshes plus a neutral shape with estimated skull and jaw
- Demonstrated use across shape fitting, shape editing and performance retargeting
- Context
- Brings coordinate-based implicit neural representations into actor-specific anatomically constrained face modeling, building on prior anatomical local deformation work such as Wu et al.'s monocular face capture model. Builds on: An Anatomically Constrained Local Deformation Model for Monocular Face Capture
- Correctness
- Validated on tasks like fitting, editing and retargeting; note it still requires a neutral shape with estimated skull and jaw bones as input, and the anatomical substructure is learned/recovered rather than measured, so reconstructed internals are an approximation.
- Clarity
- Moderately accessible if you know implicit neural fields and blendshapes; a first pass gives the motivation, a second pass for the network ensemble and constraint formulation.
- How to read it
- Focus on how the implicit model replaces classical anatomical blendshapes and what inputs it needs; a second pass pays off for the training setup and how anatomical constraints are enforced.
Facial
-
Fallen Leaf indie team details Fort Solis character pipeline using runtime deformation, artist-friendly facial rigging, UE5 motion matching locomotion, and PS5 60fps optimisation on limited resources.
Rigging / Facial / Motion Synthesis
-
This work builds appearance models for biological iridescence using bird feathers as a case study, covering several distinct types of feathers with different structural coloration mechanisms in their barbules.
abstract
This work builds appearance models for biological iridescence using bird feathers as a case study, covering several distinct types of feathers with different structural coloration mechanisms in their barbules. It introduces procedural geometric models of feather barbules, a fast wave-optics simulator for computing barbule scattering, and a pipeline that produces a feather BRDF. The resulting models reproduce the angle-dependent color shifts of real iridescent plumage and won the SIGGRAPH Asia 2024 Best Paper Award.
Related Rendering Iridescent Rock Dove Neck Feathers · A Practical Extension to Microfacet Theory for the Modeling of Varying Iridescence · Microstructure-based Appearance Rendering for Feathers · A Surface-based Appearance Model for Pennaceous Feathers
how to read this
- Category
- Method: appearance modeling of biological iridescence (feather BRDFs)
- Contributions
-
- Procedural geometric models of feather barbules covering several distinct structural-coloration mechanisms
- A fast wave-optics simulator for computing barbule scattering
- A pipeline producing a feather BRDF that reproduces angle-dependent color shifts of real iridescent plumage
- Context
- Combines structural-color and thin-film iridescence rendering with feather-specific modeling, building on Belcour and Barla's microfacet iridescence extension and Huang et al.'s rock dove neck feather rendering. Builds on: A Practical Extension to Microfacet Theory for the Modeling of Varying Iridescence · Rendering Iridescent Rock Dove Neck Feathers
- Correctness
- Grounded in wave-optics simulation of barbule nanostructures and validated against real iridescent plumage color shifts (SIGGRAPH Asia 2024 Best Paper); keep in mind it is a forward appearance model driven by procedural nanostructure assumptions rather than measured per-specimen geometry.
- Clarity
- Topic is specialized but the paper is well-regarded; a first pass conveys the pipeline, a second pass for the wave-optics and procedural-geometry details.
- How to read it
- Focus on the chain from nanostructure geometry to wave-optics simulation to BRDF; a second pass is worthwhile for the simulator and how diverse coloration mechanisms are parameterized.
CFX
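For intuition about why structural color shifts with viewing angle, here is a much simpler stand-in than the paper's wave-optics simulator: the textbook Airy formula for a single dielectric thin film, s-polarized light. The film thickness and refractive index below are arbitrary illustrative values, not measurements from the paper, and real barbule nanostructures are far more complex than one layer.

```python
import cmath
import math

def thin_film_reflectance(wavelength_nm, thickness_nm, n_film,
                          n_outside=1.0, n_substrate=1.0, theta_deg=0.0):
    """Reflectance of one dielectric layer for s-polarized light (Airy formula)."""
    theta1 = math.radians(theta_deg)
    sin1, cos1 = math.sin(theta1), math.cos(theta1)
    # Snell's law gives the cosine of the propagation angle in each medium.
    cos2 = cmath.sqrt(1 - (n_outside * sin1 / n_film) ** 2)
    cos3 = cmath.sqrt(1 - (n_outside * sin1 / n_substrate) ** 2)
    # Fresnel amplitude coefficients at the two interfaces.
    r12 = (n_outside * cos1 - n_film * cos2) / (n_outside * cos1 + n_film * cos2)
    r23 = (n_film * cos2 - n_substrate * cos3) / (n_film * cos2 + n_substrate * cos3)
    # Phase accumulated crossing the film once.
    beta = 2 * math.pi * n_film * thickness_nm * cos2 / wavelength_nm
    phase = cmath.exp(2j * beta)
    r = (r12 + r23 * phase) / (1 + r12 * r23 * phase)
    return abs(r) ** 2

def peak_wavelength(thickness_nm, n_film, theta_deg):
    """Visible wavelength (380-780 nm, 1 nm steps) with the highest reflectance."""
    return max(range(380, 781),
               key=lambda wl: thin_film_reflectance(wl, thickness_nm, n_film,
                                                    theta_deg=theta_deg))

# Illustrative values only: a 300 nm film with index 1.56, surrounded by air.
peak_normal = peak_wavelength(300.0, 1.56, theta_deg=0.0)
peak_oblique = peak_wavelength(300.0, 1.56, theta_deg=60.0)
```

Comparing the two peaks shows the reflected color moving toward shorter wavelengths as the angle of incidence grows, the same qualitative effect as the angle-dependent color shifts the paper reproduces with a far more detailed model.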
-
Apteryx is a Weta FX feather toolset that provides a complete workflow of procedural generation, hand sculpting and grooming for feathered creatures and costumes.
abstract
Apteryx is a Weta FX feather toolset that provides a complete workflow of procedural generation, hand sculpting and grooming for feathered creatures and costumes. Developed during Kingdom of the Planet of the Apes to build a village of digital eagles, it produces simulation-ready models with realistic dynamics under motion and wind.
Related Mesh-Driven Generation and Animation of Groomed Feathers · Hummingbird: DreamWorks Feather System · A Modernization of the DreamWorks Feather System · Feathers for Mystical Creatures: Pegasus
how to read this
- Category
- Production talk / breakdown: a studio feather toolset
- Contributions
-
- Demonstrates Apteryx, a complete procedural generation, hand sculpting and grooming workflow for feathered creatures and costumes
- Produces simulation-ready feather models with realistic dynamics under motion and wind
- Shows the toolset applied to building a village of digital eagles on Kingdom of the Planet of the Apes
- Context
- Extends procedural grooming pipeline practice (e.g. Choi et al.'s 'Build Your Own Procedural Grooming Pipeline') from hair/fur toward feathers. Builds on: Build Your Own Procedural Grooming Pipeline
- Correctness
- Studio practice, not peer-reviewed; results are production-proven on a specific film, so the workflow reflects that production's needs rather than a generalized or benchmarked method.
- Clarity
- Accessible as a workflow overview; one pass conveys the pipeline and intent, with limited formal detail to revisit.
- How to read it
- Read once for the procedural-to-sculpt-to-groom workflow and how feathers are made simulation-ready; useful as reference for pipeline design rather than for reproducible algorithms.
CFX
-
Tools and workflows for grooming, simulating, and stylizing Asha's complex tightly-braided locks in Wish, covering braid simulation and art-direction techniques.
abstract
For Walt Disney Animation Studios’ 100th anniversary feature, the filmmakers wanted to honor the studio’s legacy with a stylized look that draws from the rich artistic heritage of Disney’s earliest films. In addition to the challenges of a highly art-directed stylized look, Asha’s hairstyle comprises a full head of long, thin, tightly-braided locks which was far more complex than our previous braid grooms. In order to art direct Asha’s stylized hair performance in "Wish", several advancements were made in tools and workflows across the character departments for grooming, simulation, and stylization techniques.
Related Scriptable Character FX Solution · Choreography of Hair and Cloth in Disney's Moana 2 · Hierarchical Controls for Art-Directed Hair at Disney · The Art and Technology of Hair Simulation in Disney's Moana
how to read this
- Category
- Production talk / breakdown: art-directed braided hair workflow
- Contributions
-
- Demonstrates advancements in grooming, simulation and stylization for a full head of long, thin, tightly-braided locks on Disney's Wish
- Shows braid simulation and art-direction techniques to hit a highly stylized look honoring early Disney films
- Covers cross-department tool and workflow changes needed for the stylized hair performance
- Context
- Builds on Disney's art-directed hair lineage, notably Kaur et al.'s 'Hierarchical Controls for Art-Directed Hair', extending it to more complex braided grooms. Builds on: Hierarchical Controls for Art-Directed Hair at Disney
- Correctness
- Studio practice, not peer-reviewed; results are production-proven on a single feature, so the techniques are tuned to that film's stylized braids rather than presented as a general method.
- Clarity
- Accessible as a production case study; one pass conveys the challenges and solutions, with little heavy formalism to revisit.
- How to read it
- Read once focusing on how braid complexity is groomed, simulated and kept art-directable; treat as a practical reference for stylized hair pipelines rather than for algorithms.
CFX
-
Remedy Entertainment describes integrating OpenUSD into the Northlight engine for Alan Wake 2, covering character asset modularization and a live-edit pipeline from USD to runtime format.
Rigging
- CLoSD: Closing the Loop between Simulation and Diffusion for Multi-Task Character Control arXiv Academic 102 cites
Autoregressive diffusion planner runs in closed-loop with an RL tracking controller enabling text-guided multi-task physics character animation.
abstract
Motion diffusion models and Reinforcement Learning (RL) based control for physics-based simulations have complementary strengths for human motion generation. The former is capable of generating a wide variety of motions, adhering to intuitive control such as text, while the latter offers physically plausible motion and direct interaction with the environment. In this work, we present a method that combines their respective strengths. CLoSD is a text-driven RL physics-based controller, guided by diffusion generation for various tasks. Our key insight is that motion diffusion can serve as an on-the-fly universal planner for a robust RL controller. To this end, CLoSD maintains a closed-loop interaction between two modules -- a Diffusion Planner (DiP), and a tracking controller. DiP is a fast-responding autoregressive diffusion model, controlled by textual prompts and target locations, and the controller is a simple and robust motion imitator that continuously receives motion plans from DiP and provides feedback from the environment. CLoSD is capable of seamlessly performing a sequence of different tasks, including navigation to a goal location, striking an object with a hand or foot as specified in a text prompt, sitting down, and getting up. https://guytevet.github.io/CLoSD-page/
Related Perpetual Humanoid Control for Real-time Simulated Avatars · SuperPADL: Scaling Language-Directed Physics-Based Control with Progressive Supervised Distillation · Generating Diverse and Natural 3D Human Motions from Text · TEMOS: Generating Diverse Human Motions from Textual Descriptions
how to read this
- Category
- Method: text-driven physics-based character control coupling diffusion and RL
- Contributions
- CLoSD, a text-driven RL physics-based controller guided on-the-fly by motion diffusion
- A closed-loop interaction between a fast autoregressive Diffusion Planner (DiP) and a robust RL tracking/imitation controller
- Seamless execution of a sequence of different tasks: navigating to a goal location, striking an object with a hand or foot, sitting down, and getting up
- Context
- Unifies motion diffusion models and RL physics control, pairing text-to-motion diffusion (Tevet et al.'s Human Motion Diffusion Model) with a tracking-imitation controller. Builds on: Human Motion Diffusion Model
- Correctness
- Key insight is that diffusion can act as an on-the-fly universal planner for a robust imitator; demonstrated on simulated physics tasks, so plausibility comes from simulation and the controller's tracking fidelity rather than real-world deployment, and the abstract reports no quantitative results or limitations.
- Clarity
- Accessible if familiar with diffusion and RL control; a first pass conveys the closed-loop architecture, a second pass for DiP's autoregression and the controller's training.
- How to read it
- Focus on the planner-controller loop and why closed-loop coupling beats either module alone; a second pass pays off for the autoregressive diffusion conditioning and imitation reward design.
Motion Synthesis
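The planner-controller loop described in this entry can be sketched with two stand-ins. This is a toy 1D illustration, not the authors' code: `plan` and `track` are hypothetical placeholders for DiP and the RL tracking controller, and the step limit and tracking gain are arbitrary assumptions.

```python
def plan(state, target, horizon=4, max_step=0.5):
    """Hypothetical stand-in for the diffusion planner: a short list of waypoints."""
    waypoints = []
    pos = state
    for _ in range(horizon):
        delta = target - pos
        # Clamp each planned step to max_step in either direction.
        delta = max(-max_step, min(max_step, delta))
        pos += delta
        waypoints.append(pos)
    return waypoints


def track(state, waypoints, gain=0.8):
    """Hypothetical stand-in for the tracking controller: follows the first waypoint imperfectly."""
    return state + gain * (waypoints[0] - state)


def closed_loop(start, target, iterations=50):
    """Alternate planning and tracking, feeding the simulated state back to the planner."""
    state = start
    for _ in range(iterations):
        waypoints = plan(state, target)  # re-plan from the current simulated state
        state = track(state, waypoints)  # advance the simulation by one control step
    return state


final_state = closed_loop(0.0, 5.0)
```

Because the plan is regenerated from the simulated state on every iteration, tracking error does not accumulate against a stale plan, which is the point of closing the loop.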
-
Proximal-algorithm method to bake large facial blendshape sets into a compact linear blend skinning representation for real-time use.
abstract
We present a new method to bake classical facial animation blendshapes into a fast linear blend skinning representation. Previous work explored skinning decomposition methods that approximate general animated meshes using a dense set of bone transformations; these optimizers typically alternate between optimizing for the bone transformations and the skinning weights. We depart from this alternating scheme and propose a new approach based on proximal algorithms, which effectively means adding a projection step to the popular Adam optimizer. This approach is very flexible and allows us to quickly experiment with various additional constraints and/or loss functions. Specifically, we depart from the classical skinning paradigms and restrict the transformation coefficients to contain only about 10% non-zeros, while achieving similar accuracy and visual quality as the state-of-the-art. The sparse storage enables our method to deliver significant savings in terms of both memory and run-time speed. We include a compact implementation of our new skinning decomposition method in PyTorch, which is easy to experiment with and modify to related problems.
Related Fast and Efficient Skinning of Animated Meshes · Robust and Accurate Skeletal Rigging from Mesh Sequences · Efficient Dynamic Skinning with Low-Rank Helper Bone Controllers · NeuroSkinning: Automatic Skin Binding for Production Characters with Deep Graph Networks
how to read this
- Category
- Method: a skinning-decomposition algorithm for compressing facial blendshapes
- Contributions
- A method to bake classical facial blendshapes into a fast linear blend skinning representation
- A proximal-algorithm approach (a projection step added to Adam) replacing the usual alternating optimization, allowing flexible constraints and losses
- Sparse transformation coefficients (only about 10% non-zero) yielding memory and run-time savings at comparable accuracy, with a compact PyTorch implementation
- Context
- Sits in the skinning-decomposition and pose-space deformation lineage (building on Lewis et al.'s Pose Space Deformation), targeting real-time facial animation. Builds on: Pose Space Deformation: A Unified Approach to Shape Interpolation and Skeleton-Driven Deformation
- Correctness
- Claims similar accuracy and visual quality to state-of-the-art while enforcing sparsity; keep in mind it approximates blendshapes with LBS, so reproduction quality depends on the target rig and the sparsity level chosen, and gains are reported relative to prior decomposition methods.
- Clarity
- Accessible to readers familiar with skinning and optimization; a first pass conveys the proximal/projection idea, a second pass for the loss formulation and constraints.
- How to read it
- Focus on how the proximal projection enforces sparsity and why that helps memory and speed; a second pass plus the PyTorch code pays off if implementing or extending it.
Facial / Skinning
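The "gradient step, then projection" pattern from this entry can be illustrated on a toy problem. The sketch below is an assumption-laden stand-in, not the paper's implementation: it uses plain gradient descent instead of Adam, a small least-squares objective instead of a skinning decomposition, and a hypothetical `project_top_k` that keeps the k largest-magnitude coefficients.

```python
def project_top_k(w, k):
    """Projection step: keep the k largest-magnitude entries of w, zero the rest."""
    order = sorted(range(len(w)), key=lambda i: abs(w[i]), reverse=True)
    keep = set(order[:k])
    return [w[i] if i in keep else 0.0 for i in range(len(w))]


def matvec(A, x):
    """Dense matrix-vector product on nested lists."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def fit_sparse(A, b, k, lr=0.1, steps=100):
    """Minimise 0.5 * ||A w - b||^2 subject to w having at most k non-zeros."""
    n = len(A[0])
    w = [0.0] * n
    for _ in range(steps):
        residual = [p - t for p, t in zip(matvec(A, w), b)]
        # Gradient of the least-squares loss: A^T (A w - b).
        grad = [sum(A[r][c] * residual[r] for r in range(len(A))) for c in range(n)]
        w = [wi - lr * gi for wi, gi in zip(w, grad)]  # unconstrained gradient step
        w = project_top_k(w, k)                        # project back onto the sparse set
    return w


# Toy data: two coefficients matter, two are nearly negligible.
A = [[2.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 3.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]
b = [4.0, 0.1, -6.0, 0.05]
w_sparse = fit_sparse(A, b, k=2)
```

Swapping the constraint only means swapping the projection function, which is the flexibility the abstract attributes to the proximal formulation.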
-
Autodesk Animation Product Manager Lance Thornton and Sr. Principal Research Scientist Evan Atherton explore how AI and ML techniques can accelerate rigging and animation production workflows in Maya.
abstract
Autodesk's Lance Thornton and Evan Atherton present two AI approaches to character work in Maya. Thornton demonstrates the machine learning deformer, which trains locally on automatically generated or animation-driven poses to approximate a complex deformer or chain (such as a hero muscle system) with a single constant-time node, showing roughly 4x faster playback and interactive manipulation on a Digital Fish rig, support for bodies and faces, paintable weights and falloffs, CPU or GPU evaluation, switching back to the full rig for polish, and a mesh-compare heat map. Atherton then shows a neural motion control research prototype built in Bifrost, where a small neural network trained on under an hour of motion capture predicts each next pose from the current pose and a keyframed trajectory, producing art-directable walk, run, jump, and sit behaviors for quadrupeds and bipeds. The motion is passed through Human IK to retarget behavior across rigs of differing proportions and bake to standard controllers, with animation layers for final tweaks and Bifrost particle instancing for crowds.
Related Mobilizing Mocap, Motion Blending, and Mayhem: Rig Interoperability for Crowd Simulation on Incredibles 2 · One Model to Rig Them All: Diverse Skeleton Rigging with UniRig · Motion Retargeting for Crowd Simulation · MoRig: Motion-Aware Rigging of Character Meshes from Point Clouds
how to read this
- Category
- Production talk: AI/ML for rigging and animation in Maya
- Contributions
- Demonstrates a machine learning deformer that trains locally to approximate a complex deformer or chain with a single constant-time node, with faster playback, paintable weights/falloffs, CPU or GPU evaluation, and a mesh-compare heat map
- Shows a neural motion control prototype in Bifrost where a small network trained on under an hour of mocap predicts the next pose from the current pose and a keyframed trajectory
- Covers art-directable walk/run/jump/sit for quadrupeds and bipeds, with Human IK retargeting across rigs and animation layers for final polish
- Context
- Brings learned-deformer and learned-motion ideas into a commercial DCC, in the spirit of fast deformation approximation work such as Bailey et al.'s 'Fast and Deep Deformation Approximations'. Builds on: Fast and Deep Deformation Approximations
- Correctness
- Studio/vendor practice, not peer-reviewed; results are demo-proven on specific rigs (e.g. a Digital Fish rig) and described as approximations of the full rig, with the full rig retained for polish, so quality and the cited speedups are illustrative rather than benchmarked.
- Clarity
- Very accessible; a single viewing conveys both tools and their intended workflow role.
- How to read it
- Watch once to understand where learned deformers and neural motion fit in a Maya pipeline and their tradeoffs (approximation vs full-rig polish); no deep formal pass needed, treat as a capabilities demo.
ML Deformation / Rigging
- Developing a Curve Rigging Toolset: A Case Study in Adapting to Production Changes DigiPro Industrial 1 cites
Case study on iteratively developing a curve rigging toolset responding to shifting production requirements across multiple film projects.
abstract
We present an overview of Animal Logic’s curve rigging toolset and its development process, serving as a case study to discuss challenges specific to doing software development for animated feature film production. We will show how R&D projects at Animal Logic lean on agile software practices to enable ambitious development projects, with flexible plans that adapt to the reality of working with creative stakeholders. We will highlight the importance of production engagement, reflect on our technical decisions made over a year of active development while reacting to drastic production schedule changes, and share lessons learned along the way.
Related USD in Production · Group Based Rigging of Realistically Feathered Wings · ChopRig System · Rubber Hose Revival: The DreamWorks Curvy-Limb Rig
how to read this
- Category
- Production case study: developing a curve rigging toolset
- Contributions
- An overview of Animal Logic's curve rigging toolset and its development process
- A case study in applying agile software practices to feature-film R&D with flexible, adapting plans
- Reflections on technical decisions and lessons learned across a year of development amid drastic schedule changes
- Context
- Frames a production-software process around curve-based rigging, relating to articulation approaches such as de Goes et al.'s 'Character Articulation through Profile Curves'. Builds on: Character Articulation through Profile Curves
- Correctness
- Studio practice and process retrospective, not a benchmarked method; insights are specific to one studio's projects and workflow, so they are illustrative rather than generalizable claims.
- Clarity
- Accessible and narrative; one pass conveys the process and lessons, with little formal technique to revisit.
- How to read it
- Read once for the process and project-management lessons more than for rigging algorithms; most valuable to those managing or building production R&D rather than implementers.
Rigging
-
Advanced MetaHuman workflows covering MetaHuman Creator, Mesh to MetaHuman, and MetaHuman Animator best practices including Live Link Face calibration and groom application.
Facial / Rigging / Skinning
-
Shared finite-scalar-quantisation codebook maps retargeted human skeleton motion to realistic synchronised quadruped animation.
abstract
Many VR animal embodiment systems suffer from poor animation fidelity, typically animating the animal avatars using inverse kinematics. We address this issue, presenting a novel deep-learning method, centred around a shared codebook, for mapping human motion to quadruped motion. Rather than trying to directly bridge the gap from human motion to quadruped motion, a task which has proven difficult, we first use a rule-based retargeter, relying on inverse and forward kinematics, to retarget human motions to an intermediate motion domain in which the motions share the same skeleton as the quadruped. We then use finite scalar quantization to construct a shared latent space, or codebook, between this intermediate domain and the quadruped motion domain. We do this by first pre-defining a finite number of discrete latent codes and then teaching these codes, using unsupervised deep-learning, to represent semantically similar motions in the two domains. We incorporate our real-time human-to-quadruped motion mapping into a VR quadruped embodiment system. The output quadruped animations are natural and realistic, while also preserving the semantics of users’ actions. Moreover, there is a strong synchrony between the input human motions and retargeted quadruped motions, an important factor for inducing a strong sense of VR embodiment.
Related Automated Extraction and Parameterization of Motions in Large Data Sets · Normalized Euclidean Distance Matrices for Human Motion Retargeting · Robust Marker Trajectory Repair for MOCAP Using Kinematic Reference · Physically Based Motion Transformation
how to read this
- Category
- Method: deep-learning human-to-quadruped motion mapping for VR embodiment
- Contributions
- A shared codebook (via finite scalar quantization) linking the intermediate domain, in which human motion is retargeted onto the quadruped skeleton, to the quadruped motion domain
- A two-stage approach: rule-based IK/FK retargeting of human motion to an intermediate domain sharing the quadruped skeleton, then learned mapping to quadruped motion
- Real-time integration into a VR quadruped embodiment system producing natural, semantics-preserving animations
- Context
- Advances neural quadruped animation and motion retargeting, building on the authors' prior 'How to Train Your Dog: Neural Enhancement of Quadruped Animations'. Builds on: How to Train Your Dog: Neural Enhancement of Quadruped Animations
- Correctness
- Outputs are reported as natural and realistic while preserving action semantics; keep in mind the pipeline relies on a rule-based intermediate retargeter and an unsupervised shared codebook, so fidelity depends on that intermediate domain and on the captured quadruped data, and the abstract reports no quantitative results.
- Clarity
- Accessible if familiar with motion retargeting and VQ-style latent spaces; a first pass conveys the two-stage idea, a second pass for the finite scalar quantization and codebook training.
- How to read it
- Focus on why an intermediate shared-skeleton domain plus a learned shared codebook beats direct human-to-quadruped mapping; a second pass pays off for the quantization and unsupervised training details.
Motion Synthesis / Retargeting
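Finite scalar quantization, the mechanism behind the shared codebook in this entry, can be sketched in a few lines. This is a generic illustration under stated assumptions, not the paper's model: the function names and the level counts are hypothetical, and only odd level counts are handled so that the grid is symmetric around zero.

```python
import math


def fsq_quantize(z, levels):
    """Quantise each latent dimension to one of levels[i] integer values.

    Assumes every entry of levels is odd, so codes lie in [-half, half].
    """
    codes = []
    for value, n_levels in zip(z, levels):
        half = (n_levels - 1) // 2
        bounded = math.tanh(value) * half  # squash into [-half, half]
        codes.append(round(bounded))       # snap to the nearest integer level
    return codes


def code_index(codes, levels):
    """Map per-dimension codes to a single index in the implicit codebook."""
    index, stride = 0, 1
    for code, n_levels in zip(codes, levels):
        half = (n_levels - 1) // 2
        index += (code + half) * stride    # shift each code to 0..n_levels-1
        stride *= n_levels
    return index


levels = [3, 5, 5]                         # implicit codebook of 3 * 5 * 5 = 75 codes
codes = fsq_quantize([0.0, 10.0, -10.0], levels)
index = code_index(codes, levels)
```

The codebook is never stored: it is the fixed grid of level combinations, which is what "pre-defining a finite number of discrete latent codes" amounts to. In a trained network the rounding is typically paired with a straight-through gradient estimator.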
-
End-to-end pipeline estimates shell-level cloth elasticity parameters from yarn stretch tests via extended homogenization matching.
abstract
Virtual garment simulation has become increasingly important with applications in garment design and virtual try-on. However, reproducing garments faithfully remains a cumbersome process. We propose an end-to-end forward pipeline for estimating parameters of shell material models corresponding to real fabrics with minimal input. In contrast to prior work that relies on complex and often expensive capture systems, our method determines yarn model parameters from Young’s moduli determined during standard yarn stretch tests. We use an extended homogenization method to match yarn-level and shell-level hyperelastic energies with respect to a range of surface deformations represented by the first and second fundamental forms, including anisotropic bending. We optimize the parameters of a shell material model involving uncoupled bending and membrane energies. This allows the simulated shell model to exhibit deformation modes motivated by yarn-level physics in real fabrics. Finally, we validate our results with quantitative and visual comparisons against real world fabrics through stretch tests and drape experiments. Using the homogenized parameters, the shell models are capable of capturing the characteristics of underlying yarn patterns and exhibiting distinct behaviors for different yarn materials.
Related Homogenized Yarn-Level Cloth · Sag-Free Initialization for Strand-Based Hybrid Hair Simulation · Interactive Hair Simulation on the GPU Using ADMM · Simulating Cloth Using Bilinear Elements
how to read this
- Category
- Method: parameter estimation for cloth simulation
- Contributions
- An end-to-end forward pipeline that estimates shell-level cloth material parameters from minimal input (standard yarn stretch tests, Young's moduli).
- An extended homogenization that matches yarn-level and shell-level hyperelastic energies across surface deformations described by the first and second fundamental forms, including anisotropic bending.
- Optimizes a shell model with uncoupled bending and membrane energies so simulated cloth shows deformation modes motivated by real yarn-level physics.
- Context
- Builds on homogenized yarn-level cloth (Sperl et al., 'Homogenized Yarn-Level Cloth'), extending that homogenization idea to derive shell parameters from inexpensive yarn tests rather than full capture rigs. Builds on: Homogenized Yarn-Level Cloth
- Correctness
- Aimed at avoiding expensive capture systems by relying on standard yarn stretch tests; results are validated quantitatively and visually against real fabrics via stretch and drape experiments, so a reader should keep in mind it targets fabrics describable by yarn-level physics and the homogenization assumptions used.
- Clarity
- Reasonably accessible at the conceptual level; a first pass conveys the pipeline, but a second pass is needed for the homogenization and energy-matching formulation.
- How to read it
- First pass for the pipeline and where the inputs come from; do a focused second pass on the extended homogenization (fundamental forms, anisotropic bending, uncoupled energies) if you intend to reimplement or assess fidelity.
CFX
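The energy-matching idea in this entry can be reduced to a one-parameter toy to show its shape. The sketch below is an assumption, not the paper's formulation: it fits a single stiffness `k` in a hypothetical model E = 0.5 * k * strain^2 to sampled fine-scale energies by least squares, whereas the paper matches membrane and bending energies over deformations described by the fundamental forms.

```python
def fit_stiffness(strains, energies):
    """Least-squares fit of k in E = 0.5 * k * strain**2 to sampled energies."""
    basis = [0.5 * s * s for s in strains]
    numerator = sum(e * phi for e, phi in zip(energies, basis))
    denominator = sum(phi * phi for phi in basis)
    return numerator / denominator


# Hypothetical samples: energies generated by a fine-scale model with k = 3.
sample_strains = [0.01, 0.02, 0.05]
sample_energies = [0.5 * 3.0 * s * s for s in sample_strains]
k_fit = fit_stiffness(sample_strains, sample_energies)
```

With more parameters the same step becomes a small linear or nonlinear least-squares problem over a range of sampled deformations.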
- talk Exploring the Power of Control Rig for Animation in Unreal Engine 5.4 | Unreal Fest Gold Coast 2024 Unreal Epic
Deep dive into Control Rig's animation editing and procedural rigging capabilities in UE5.4, covering real-time authoring workflows for character control rigs.
Rigging / Motion Synthesis
- Factorized Motion Diffusion for Precise and Character-Agnostic Motion Inbetweening MIG Disney Research 12 cites
Diffusion model for motion inbetweening that factorizes motion into character-agnostic and character-specific components for precise controllability.
abstract
Animation is a challenging and time-consuming process where animators must manipulate hundreds of controls over space and time to create compelling motions. Recent advances in motion diffusion models have shown impressive results for general motion generation and hold the potential to reduce the number of controls manipulated by animators to achieve high quality results. However, these models are limited by their inability to match sparse constraints precisely, preventing frame-level joint control required by artists. Additionally, recent models are trained for specific characters, preventing reuse, and are incompatible with characters that have only small datasets available. To tackle these shortcomings, we propose a novel factorization of motion between a character-agnostic Bézier Motion Model (BMM), which can be trained on a large motion dataset, followed by a character-specific posing model, trainable on a much smaller pose dataset, that enables reuse across many characters. BMM provides accuracy for meeting sparse joint-level constraints by working in a reduced space of Bézier curves that better aligns the condition signal with the prediction space of our model. Additionally, the Bézier curves offer animators an intuitive interface compatible with existing authoring software.
Related SKEL-Betweener: a Neural Motion Rig for Interactive Motion Authoring · Dog Code: Human to Quadruped Embodiment Using Shared Codebooks · Improv: A System for Scripting Interactive Actors in Virtual Worlds · Character Motion Synthesis by Topology Coordinates
how to read this
- Category
- Method: motion diffusion model for inbetweening
- Contributions
- A factorization of motion into a character-agnostic Bezier Motion Model (BMM), trainable on a large dataset, and a character-specific posing model trainable on a much smaller pose dataset.
- Working in a reduced space of Bezier curves to better align sparse joint-level constraints with the model's prediction space, improving precision at meeting sparse constraints.
- Reuse across many characters, including those with only small datasets available.
- Context
- Sits in the line of learned motion inbetweening and motion diffusion, building on robust motion in-betweening (Harvey et al., 'Robust Motion In-Betweening') while addressing diffusion models' difficulty matching sparse constraints precisely. Builds on: Robust Motion In-Betweening
- Correctness
- Motivated by the artist need for frame-level joint control that general diffusion models miss; the factorization assumes a useful split between character-agnostic curves and per-character posing, and a reader should note that quality on novel characters depends on the small per-character pose data available.
- Clarity
- Accessible framing of the problem; a first pass conveys the factorization idea, a second pass clarifies the Bezier reduced space and the diffusion conditioning.
- How to read it
- First pass for the agnostic-vs-specific factorization and why Bezier curves help constraint matching; second pass for the model conditioning and posing stage if applying it to your own character rigs.
Motion Synthesis
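One reason a Bézier parameterization suits keyframe constraints is that a segment passes exactly through its first and last control points. The sketch below shows this for a single animation channel; it assumes cubic segments and hypothetical function names, and does not reproduce the paper's actual curve parameterization.

```python
def cubic_bezier(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier segment at t in [0, 1] for one scalar channel."""
    s = 1.0 - t
    return s**3 * p0 + 3 * s**2 * t * p1 + 3 * s * t**2 * p2 + t**3 * p3


def sample_segment(p0, p1, p2, p3, n_frames):
    """Sample the segment at n_frames evenly spaced times, endpoints included."""
    return [cubic_bezier(p0, p1, p2, p3, i / (n_frames - 1)) for i in range(n_frames)]


# Hypothetical joint angle in degrees: keyframes at 10 and 30, inner controls shape the arc.
frames = sample_segment(10.0, 25.0, 40.0, 30.0, n_frames=5)
```

A model that predicts control points can satisfy a keyframe by fixing the matching endpoint, with no need to hit it through per-frame regression.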
-
Real-time method for synthesizing temporally consistent dynamic facial wrinkles driven by expression parameters for character faces.
abstract
The paper presents a method to animate dynamic skin wrinkles under facial expressions by modifying how wrinkles are constructed in displacement maps based on local stretch and compression. Wrinkles perpendicular to the stress axis become wider when stretched and deeper when compressed. A fast approximation divides wrinkles into orientation bins with pre-computed displacement maps that are blended in a shader, achieving 50x speedup compared to the per-frame approach.
Related Cloth and Skin Deformation with a Triangle Mesh Based Convolutional Neural Network · A Neural Network Model for Efficient Musculoskeletal-Driven Skin Deformation · The Digital Emily Project: Achieving a Photorealistic Digital Actor · A Pixel-Based Framework for Data-Driven Clothing
how to read this
- Category
- Method: real-time facial wrinkle synthesis
- Contributions
- A method that animates dynamic skin wrinkles under expression by modifying displacement-map wrinkles based on local stretch and compression (wider when stretched, deeper when compressed).
- A fast approximation that bins wrinkles by orientation with pre-computed displacement maps blended in a shader.
- A reported ~50x speedup compared to the per-frame approach while keeping temporally consistent dynamic wrinkles.
- Context
- Extends deformation-driven displacement-map facial detailing (Ma et al., 'Facial Performance Synthesis using Deformation-Driven Polynomial Displacement Maps') toward a real-time, stretch/compression-aware wrinkle formulation. Builds on: Facial Performance Synthesis using Deformation-Driven Polynomial Displacement Maps
- Correctness
- Rests on the assumption that a wrinkle's response depends on its orientation relative to the local stress axis and can be approximated by orientation bins blended in a shader; the speedup comes from approximating the per-frame method, so a reader should weigh fidelity against the binning/blending shortcut.
- Clarity
- Quite accessible; a single first pass conveys the core stretch/compression rule and the binning trick, with a second pass only for the shader blending details.
- How to read it
- First pass for the stretch-compression wrinkle rule and the orientation-bin approximation; a brief second pass on the shader blending is enough if you plan to integrate it into a real-time face pipeline.
Facial / ML Deformation
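The orientation-bin approximation can be pictured as a per-bin scale applied to pre-computed displacement samples. The sketch below is an assumption, since the entry does not give the blending rule: it uses a hypothetical sin-squared falloff so that wrinkles perpendicular to the strain axis respond fully and parallel ones not at all, and it models only depth (compression deepens, stretch flattens), not widening.

```python
import math


def bin_depth_scales(strain_axis_angle, strain, n_bins=4):
    """Per-bin depth scale; strain < 0 is compression, strain > 0 is stretch."""
    scales = []
    for b in range(n_bins):
        wrinkle_angle = math.pi * b / n_bins  # bin orientation in [0, pi)
        # 1.0 when the wrinkle is perpendicular to the strain axis, 0.0 when parallel.
        perpendicular = math.sin(wrinkle_angle - strain_axis_angle) ** 2
        scales.append(1.0 - strain * perpendicular)
    return scales


def blend_displacement(bin_samples, scales):
    """Combine one texel's pre-computed per-bin displacements with their scales."""
    return sum(d * s for d, s in zip(bin_samples, scales))


scales = bin_depth_scales(strain_axis_angle=0.0, strain=-0.5)
height = blend_displacement([0.0, 0.0, 0.2, 0.0], scales)
```

In a shader the same weighted sum would run per texel over the bin textures, which is where the saving over rebuilding the displacement map every frame comes from.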
-
Retargets interaction motions between two characters while preserving skinned mesh geometry and contact semantics.
abstract
Interactive motion between multiple characters is widely utilized in games and movies. However, the method for generating interactive motions considering the character's diverse mesh shape has yet to be studied. We propose a Spatio Cooperative Transformer (SCT) to retarget the interacting motions of two characters having arbitrary mesh connectivity. SCT predicts the residual of root position and joint rotations considering the shape difference between the source and target of interacting characters. In addition, we introduce an anchor loss function for SCT to maintain the geometric distance between the interacting characters when they are retargeted. We also propose a motion augmentation method with deformation-based adaptation to prepare a source-target paired dataset with an identical mesh connectivity for training. In experiments, our method achieved higher accuracy for semantic preservation and produced fewer artifacts of inter-penetration between the interacting characters for unseen characters and motions than the baselines. Moreover, we conducted a user evaluation using characters with various shapes, spanning low-to-high interaction levels to prove better semantic preservation of our method compared to previous studies.
Related Contact-Aware Retargeting of Skinned Motion · Aura Mesh: Motion Retargeting to Preserve the Spatial Relationships between Skinned Characters · Normalized Euclidean Distance Matrices for Human Motion Retargeting · Dog Code: Human to Quadruped Embodiment Using Shared Codebooks
how to read this
- Category
- Method: interaction-aware motion retargeting
- Contributions
- A Spatio Cooperative Transformer (SCT) that retargets the interacting motions of two characters with arbitrary mesh connectivity by predicting residual root position and joint rotations from shape differences.
- An anchor loss that maintains the geometric distance between interacting characters after retargeting, reducing inter-penetration.
- A motion augmentation with deformation-based adaptation to build a source-target paired dataset with identical mesh connectivity for training.
- Context
- Extends contact-aware skinned-motion retargeting (Villegas et al., 'Contact-Aware Retargeting of Skinned Motion') to the two-character interaction setting where both skinned shapes must be respected. Builds on: Contact-Aware Retargeting of Skinned Motion
- Correctness
- Reported to give higher semantic preservation and fewer inter-penetration artifacts than baselines on unseen characters and motions, plus a user study; the validation rests on its augmented paired dataset, so generalization beyond the trained interaction types is a reasonable caution.
- Clarity
- Accessible problem statement; a first pass conveys the SCT and anchor-loss idea, a second pass for the residual prediction and augmentation pipeline.
- How to read it
- First pass for the two-character contact-preservation goal and the anchor loss; second pass on the SCT architecture and dataset augmentation if you work on retargeting interacting characters.
Retargeting / Skinning
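A loss that preserves the distance between interacting characters can be sketched as a penalty on how much paired anchor distances change after retargeting. The form below is an assumption for illustration, not the paper's anchor loss: the pairing by index, the squared penalty, and the function name are all hypothetical.

```python
import math


def anchor_loss(src_a, src_b, tgt_a, tgt_b):
    """Mean squared change in anchor-pair distances after retargeting.

    Each argument is a list of (x, y, z) anchor points; index i on
    character A is paired with index i on character B.
    """
    total = 0.0
    for pa, pb, qa, qb in zip(src_a, src_b, tgt_a, tgt_b):
        d_src = math.dist(pa, pb)  # distance in the source interaction
        d_tgt = math.dist(qa, qb)  # distance after retargeting
        total += (d_tgt - d_src) ** 2
    return total / len(src_a)


# One anchor pair: a hand 1 unit from a shoulder before, 3 units after.
loss = anchor_loss([(0, 0, 0)], [(1, 0, 0)], [(0, 0, 0)], [(3, 0, 0)])
```

Driving this term toward zero keeps contacts such as a handshake or a hug from drifting apart or interpenetrating when body proportions change.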
-
Prior-free multi-view hair capture using a neural implicit hair volume that encodes high-resolution 3D orientation and occupancy; best paper award at SIGGRAPH Asia 2024.
abstract
GroomCap is a multi-view hair capture method that reconstructs high-fidelity strand-level hair geometry without relying on external data priors. It introduces a neural implicit representation for the hair volume that encodes high-resolution 3D orientation and occupancy from input views, trained with a volumetric 3D orientation rendering algorithm and 2D orientation distribution supervision to avoid loss of structural information from orientation blending. Initial strands are traced within the volume and then refined through a Gaussian-based optimization using a chained Gaussian representation with direct photometric supervision from images. The pipeline produces dense, scalp-rooted strand geometries across diverse hairstyles using the same parameters, supporting re-rendering, physics-based animation, and editing.
Related Simulation-Ready Hair Capture · Neural Haircut: Prior-Guided Strand-Based Hair Reconstruction · Structure-Aware Hair Capture · GroomGen: A High-Quality Generative Hair Model Using Hierarchical Latent Representations
how to read this
- Category
- Capture system: multi-view strand-level hair
- Contributions
- A prior-free multi-view hair capture method reconstructing high-fidelity strand-level geometry without external data priors.
- A neural implicit hair-volume representation encoding high-resolution 3D orientation and occupancy, trained with a volumetric 3D orientation rendering algorithm and 2D orientation distribution supervision to avoid blending-induced loss of structure.
- Strands traced in the volume then refined by a chained-Gaussian optimization with direct photometric supervision, yielding dense scalp-rooted strands across diverse hairstyles with the same parameters.
- Context
- Advances strand-accurate multi-view hair capture (Nam et al., 'Strand-Accurate Multi-View Hair Capture') and relates to generative hair modeling (Zhou et al., 'GroomGen'), here pursuing fidelity without learned priors; best paper at SIGGRAPH Asia 2024. Builds on: Strand-Accurate Multi-View Hair Capture · GroomGen: A High-Quality Generative Hair Model Using Hierarchical Latent Representations
- Correctness
- Demonstrated across diverse hairstyles using one parameter set and supports re-rendering, physics animation, and editing; being prior-free it leans on multi-view input quality and the orientation-supervision scheme, which a reader should keep in mind for sparse or occluded captures.
- Clarity
- Dense but well-motivated; a first pass conveys the implicit-volume-then-trace-then-refine flow, a second pass for the orientation rendering and chained-Gaussian refinement.
- How to read it
- First pass for the three-step pipeline (implicit volume, traced strands, Gaussian refinement) and why prior-free matters; second pass on the orientation supervision if reconstructing your own grooms.
CFX
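Tracing initial strands through an orientation-and-occupancy volume can be sketched as stepping along the local direction until the strand leaves the hair region. This is a generic illustration under assumptions, not GroomCap's tracer: `orientation` and `occupied` are hypothetical callables standing in for queries to the neural implicit volume, and the fixed step size is arbitrary.

```python
def trace_strand(root, orientation, occupied, step=0.1, max_points=100):
    """Grow a polyline from root by following a 3D orientation field.

    orientation(p) returns a unit direction at point p; occupied(p)
    returns True while p is inside the hair volume.
    """
    strand = [root]
    prev_dir = None
    while len(strand) < max_points:
        p = strand[-1]
        d = orientation(p)
        # Orientations are sign-ambiguous, so stay consistent with the previous step.
        if prev_dir is not None and sum(a * b for a, b in zip(d, prev_dir)) < 0:
            d = tuple(-c for c in d)
        nxt = tuple(c + step * dc for c, dc in zip(p, d))
        if not occupied(nxt):
            break  # left the hair volume: the strand ends here
        strand.append(nxt)
        prev_dir = d
    return strand


# Toy volume: hair hangs straight down and fills the slab z >= -1.
strand = trace_strand(
    (0.0, 0.0, 0.0),
    orientation=lambda p: (0.0, 0.0, -1.0),
    occupied=lambda p: p[2] >= -1.0,
    step=0.25,
)
```

The traced polylines only initialize the geometry; per the abstract, the fidelity comes from the subsequent Gaussian-based refinement against the images.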
-
HIT is an implicit volumetric function that classifies body interior points as fat, lean tissue, or bone given the SMPL body shape, trained on full-body MRI scans.
abstract
HIT learns to infer the 3D location of three anatomic tissue types: subcutaneous adipose tissue, lean tissue (muscles and organs), and long bones, from an external body shape. The dataset consists of 260 female and 182 male full-body MRI scans. A learned volumetric deformation field corrects for soft-tissue changes between upright and supine MRI positions. Because HIT is parameterized by SMPL, internal structures deform plausibly when bodies are reposed or reshaped. Dataset and model are publicly available.
Related BOSS: Bones, Organs and Skin Shape Model · From Skin to Skeleton: Towards Biomechanically Accurate 3D Digital Humans · Hand Modeling and Simulation Using Stabilized Magnetic Resonance Imaging · SNARF: Differentiable Forward Skinning for Animating Non-Rigid Neural Implicit Shapes
how to read this
- Category
- Method + dataset: implicit internal-anatomy model
- Contributions
-
- HIT, an implicit volumetric function that classifies body-interior points as subcutaneous adipose tissue, lean tissue, or long bone given an external body shape.
- A learned volumetric deformation field correcting soft-tissue changes between upright and supine MRI positions.
- Parameterization by SMPL so internal structures deform plausibly when the body is reposed or reshaped; dataset (260 female, 182 male full-body MRI) and model released publicly.
- Context
- Continues the authors' line on inferring internal structure from the body surface (OSSO, SKEL) and is built on the SMPL body model (Loper et al., 'SMPL: A Skinned Multi-Person Linear Model'). Builds on: OSSO: Obtaining Skeletal Shape from Outside · From Skin to Skeleton: Towards Biomechanically Accurate 3D Digital Humans · SMPL: A Skinned Multi-Person Linear Model
- Correctness
- Trained and grounded on full-body MRI scans with a stated subject count, so a reader should keep in mind the demographic and posture range of that scan set and that tissue inference is a learned classification, not a per-subject medical measurement.
- Clarity
- Accessible given familiarity with SMPL and implicit functions; a first pass conveys the inputs, tissue classes, and SMPL coupling, with a second pass for the deformation field.
- How to read it
- First pass for what HIT infers and how SMPL parameterization drives reposing; second pass on the upright-to-supine deformation field and training data if you plan to use the released model.
Muscles / Skinning
-
Production experience implementing a neural network deformer for crowd characters, covering training, integration, and quality tradeoffs.
abstract
CG crowds have become increasingly popular this last decade in the VFX and animation industry: formerly reserved to only a few high end studios and blockbusters, they are now widely used in TV shows or commercials. Yet, there is still one major limitation: in order to be ingested properly in crowd software, studio rigs have to comply with specific prerequisites, especially in terms of deformations. Usually only skinning, blend shapes and geometry caches are supported preventing close-up shots with facial performances on crowd characters. We envisioned two approaches to tackle this: either reverse engineer the hundreds of deformer nodes available in the major DCCs/plugins and incorporate them in our crowd package, or surf the machine learning wave to compress the deformations of a rig using a neural network architecture. Considering we could not commit 5+ man/years of development into this problem, and that we were excited to dip our toes in the machine learning pool, we went for the latter. From our first tests to a minimum viable product, we went through hopes and disappointments: we hit multiple pitfalls, took false shortcuts and dead ends before reaching our destination. With this paper, we hope to provide a valuable feedback by sharing the lessons we learnt from this experience.
Related Accurate Face Rig Approximation with Deep Differential Subspace Reconstruction · Multi-Resolution Real-Time Deep Pose-Space Deformation · Dyna: A Model of Dynamic Human Shape in Motion · Subspace Neural Physics: Fast Data-Driven Interactive Simulation
how to read this
- Category
- Production talk / experience report: ML deformer for crowds
- Contributions
-
- Demonstrates a production journey implementing a neural-network deformer to compress a studio rig's deformations for CG crowd characters.
- Motivates ML over reverse-engineering hundreds of DCC deformer nodes, so crowd characters can support close-up facial performance beyond skinning, blend shapes, and geometry caches.
- Shares the path from first tests to a minimum viable product, including pitfalls, false shortcuts, and dead ends.
- Context
- Applies learned rig approximation in the spirit of Fast and Deep Deformation Approximations (Bailey et al., 'FDDA') to the specific constraints of crowd-software pipelines. Builds on: Fast and Deep Deformation Approximations
- Correctness
- Studio practice rather than peer-reviewed research; results are production-oriented and the value is in the reported tradeoffs and integration lessons, so treat quality and performance claims as context-specific to this studio's pipeline.
- Clarity
- Very accessible and narrative; a single first pass conveys the decisions and pitfalls, no formal formulation to decode.
- How to read it
- Read once for the practical decision rationale (ML vs node reimplementation) and the integration/quality tradeoffs; useful as a checklist of pitfalls rather than a method to reimplement.
ML Deformation / Skinning
- talk · Innovation Unleashed: High-Performance UE5 Mobile Rendering and Next-Gen Character Creation Pipeline Powered by Machine Learning · GDC · Industrial
Lightspeed Studios (Tencent) on a machine-learning-driven next-gen character creation pipeline for UE5 mobile, alongside high-performance mobile rendering.
ML Deformation / Rigging
-
Auto-regressive diffusion model generates physically plausible real-time character motion responding interactively to user control signals.
abstract
Real-time character control is an essential component for interactive experiences, with a broad range of applications, including physics simulations, video games, and virtual reality. The success of diffusion models for image synthesis has led to the use of these models for motion synthesis. However, the majority of these motion diffusion models are primarily designed for offline applications, where space-time models are used to synthesize an entire sequence of frames simultaneously with a pre-specified length. To enable real-time motion synthesis with a diffusion model that allows time-varying controls, we propose A-MDM (Auto-regressive Motion Diffusion Model). Our conditional diffusion model takes an initial pose as input, and auto-regressively generates successive motion frames conditioned on the previous frame. Despite its streamlined network architecture, which uses simple MLPs, our framework is capable of generating diverse, long-horizon, and high-fidelity motion sequences. Furthermore, we introduce a suite of techniques for incorporating interactive controls into A-MDM, such as task-oriented sampling, in-painting, and hierarchical reinforcement learning. These techniques enable a pre-trained A-MDM to be efficiently adapted for a variety of new downstream tasks.
Related Character Controllers Using Motion VAEs · SKEL-Betweener: a Neural Motion Rig for Interactive Motion Authoring · PDP: Physics-Based Character Animation via Diffusion Policy · CLoSD: Closing the Loop between Simulation and Diffusion for Multi-Task Character Control
how to read this
- Category
- Method: real-time motion diffusion for character control
- Contributions
-
- A-MDM, an auto-regressive conditional motion diffusion model that takes an initial pose and generates successive frames conditioned on the previous frame, enabling real-time, time-varying control.
- A streamlined architecture using simple MLPs that still produces diverse, long-horizon, high-fidelity motion.
- A suite of interactive-control techniques: task-oriented sampling, in-painting, and hierarchical reinforcement learning.
- Context
- Adapts motion diffusion (Tevet et al., 'Human Motion Diffusion Model') from offline whole-sequence synthesis to an auto-regressive, frame-by-frame formulation for interactive control. Builds on: Human Motion Diffusion Model
- Correctness
- Built on the assumption that frame-conditioned auto-regression preserves quality while gaining real-time controllability; long-horizon stability and control fidelity rest on the added sampling/in-painting/RL techniques, so a reader should watch for drift and the cost of the control machinery.
- Clarity
- Accessible if familiar with diffusion-based motion; a first pass conveys the auto-regressive shift and control toolkit, a second pass for the conditioning and RL details.
- How to read it
- First pass for the offline-to-auto-regressive reframing and the three control techniques; second pass on the sampling/in-painting/hierarchical-RL mechanics if building interactive control.
Motion Synthesis
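The auto-regressive idea in the A-MDM entry above can be sketched as a loop in which each new frame is drawn by iterative denoising conditioned on the previous frame. This is a minimal sketch, not the authors' implementation: it uses standard DDPM ancestral sampling as a generic stand-in, and the step count, noise schedule, and `toy_noise_predictor` (which replaces the learned MLP with a fixed-drift oracle) are all assumptions made for illustration.

```python
import math
import random

NUM_STEPS = 20  # hypothetical number of denoising steps per frame
# Hypothetical linear noise schedule (a common DDPM default, not taken from the paper).
BETAS = [1e-4 + (0.02 - 1e-4) * i / (NUM_STEPS - 1) for i in range(NUM_STEPS)]
ALPHAS = [1.0 - b for b in BETAS]
ALPHA_BARS = []
_running = 1.0
for _a in ALPHAS:
    _running *= _a
    ALPHA_BARS.append(_running)


def toy_noise_predictor(x_t, prev_pose, step):
    """Stand-in for the learned denoiser (hypothetical).

    It pretends the clean next pose is prev_pose plus a fixed drift of 0.1
    per channel and returns the noise consistent with that target.
    """
    target = [p + 0.1 for p in prev_pose]
    ab = ALPHA_BARS[step]
    return [(x - math.sqrt(ab) * g) / math.sqrt(1.0 - ab)
            for x, g in zip(x_t, target)]


def sample_next_pose(prev_pose, rng):
    """Draw one frame by DDPM ancestral sampling, conditioned on prev_pose."""
    x = [rng.gauss(0.0, 1.0) for _ in prev_pose]  # start from pure noise
    for step in reversed(range(NUM_STEPS)):
        eps = toy_noise_predictor(x, prev_pose, step)
        coef = BETAS[step] / math.sqrt(1.0 - ALPHA_BARS[step])
        mean = [(xi - coef * e) / math.sqrt(ALPHAS[step])
                for xi, e in zip(x, eps)]
        if step > 0:
            sigma = math.sqrt(BETAS[step])
            x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            x = mean  # no noise is added on the final step
    return x


def rollout(initial_pose, num_frames, seed=0):
    """Generate frames one at a time, each conditioned only on the last one."""
    rng = random.Random(seed)
    frames = [list(initial_pose)]
    for _ in range(num_frames):
        frames.append(sample_next_pose(frames[-1], rng))
    return frames


frames = rollout([0.0, 1.0], num_frames=5)
```

Because each frame depends only on the previous one, a control signal could in principle be changed between any two iterations of `rollout`, which is the property the entry contrasts with whole-sequence synthesis.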
- talk · Jelly by Name, Jelly by Nature: A Deep Dive into Deforming Characters in UE 5.3 | Unreal Fest 2024 · Unreal · Epic
Brown Bag Films details their transition to an in-engine animation pipeline in UE5.3, focusing on squash, stretch, and melting deformation requirements for the character Jelly.
ML Deformation / Skinning / CFX
-
Production case study implementing Houdini 20's APEX framework for quadruped rigging and applying the ML Deformer for high-quality real-time skinning deformation.
Rigging / ML Deformation / Skinning
-
Data-driven generalized physical face model that learns anatomy-consistent tissue mechanics from captured data for novel character synthesis.
abstract
Physically-based simulation is a powerful approach for 3D facial animation as the resulting deformations are governed by physical constraints, allowing to easily resolve self-collisions, respond to external forces and perform realistic anatomy edits. Today's methods are data-driven, where the actuations for finite elements are inferred from captured skin geometry. Unfortunately, these approaches have not been widely adopted due to the complexity of initializing the material space and learning the deformation model for each character separately, which often requires a skilled artist followed by lengthy network training. In this work, we aim to make physics-based facial animation more accessible by proposing a generalized physical face model that we learn from a large 3D face dataset. Once trained, our model can be quickly fit to any unseen identity and produce a ready-to-animate physical face model automatically. Fitting is as easy as providing a single 3D face scan, or even a single face image. After fitting, we offer intuitive animation controls, as well as the ability to retarget animations across characters. All the while, the resulting animations allow for physical effects like collision avoidance, gravity, paralysis, bone reshaping and more.
Related An Implicit Physical Face Model Driven by Expression and Style · Phace: Physics-based Face Modeling and Animation · BlendForces: A Dynamic Framework for Facial Animation · Shape Targeting: A Versatile Active Elasticity Constitutive Model
how to read this
- Category
- Method: data-driven generalized physical face model
- Contributions
-
- A generalized physical face model learned from a large 3D face dataset that, once trained, fits quickly to any unseen identity and produces a ready-to-animate physical face model automatically.
- Fitting from minimal input (a single 3D face scan, or even a single image), avoiding per-character material-space initialization and lengthy separate training.
- Intuitive animation controls plus cross-character retargeting, while retaining physics effects like self-collision resolution and response to external forces.
- Context
- Generalizes the authors' implicit physical face model (Yang et al., 'An Implicit Physical Face Model Driven by Expression and Style') from per-character setup toward a single model learned across many identities. Builds on: An Implicit Physical Face Model Driven by Expression and Style
- Correctness
- Targets the adoption barrier of physics-based faces (per-character material setup and training); fitting from a single scan or image is a strong convenience claim, so a reader should keep in mind dependence on the training dataset's coverage and the fidelity tradeoff of generalized versus per-character material spaces.
- Clarity
- Accessible motivation; a first pass conveys the generalize-then-fit idea, a second pass for the learned material space and physical simulation formulation.
- How to read it
- First pass for why per-character setup is the bottleneck and how fitting works; second pass on the learned physical/material model if evaluating it for a production face pipeline.
Facial / Muscles / ML Deformation
- talk · Machine Learning Summit: From Photo to Expression: Generating Photorealistic Facial Rigs · GDC · Industrial
EA presented FaceMixer, FaceOptim, and FaceBot: three ML tools that together generate blendshapes, automate expression scanning, and produce photorealistic face textures and meshes with minimal manual work.
Facial / ML Deformation
- MarkerNet: A Divide-and-Conquer Solution to Motion Capture Solving From Raw Markers · CASA · Academic · 4 cites
Decomposes full-body mocap solving into local-part sub-motions aggregated by a graph network, reducing the costly manual labeling step.
abstract
Marker‐based optical motion capture (MoCap) aims to localize 3D human motions from a sequence of input raw markers. It is widely used to produce physical movements for virtual characters in various games such as the role‐playing game, the fighting game, and the action‐adventure game. However, the conventional MoCap cleaning and solving process is extremely labor‐intensive, time‐consuming, and usually the most costly part of game animation production. Thus, there is a high demand for automated algorithms to replace costly manual operations and achieve accurate MoCap cleaning and solving in the game industry. In this article, we design a divide‐and‐conquer‐based MoCap solving network, dubbed MarkerNet, to estimate human skeleton motions from sequential raw markers effectively. In a nutshell, our key idea is to decompose the task of direct solving of global motion from all markers into first modeling sub‐motions of local parts from the corresponding marker subsets and then aggregating sub‐motions into a global one. In this manner, our model can effectively capture local motion patterns w.r.t. different marker subsets, thus producing more accurate results compared to the existing methods. Extensive experiments on both real and synthetic data verify the effectiveness of the proposed method.
Related Motion Retargetting based on Dilated Convolutions and Skeleton-Specific Loss Functions · Robust Marker Trajectory Repair for MOCAP Using Kinematic Reference · Motion Retargeting for Crowd Simulation · A Facial Motion Retargeting Pipeline for Appearance Agnostic 3D Characters
how to read this
- Category
- Method: a learned motion-capture solving network
- Contributions
-
- MarkerNet, a divide-and-conquer network that estimates skeleton motion from raw optical markers
- Decomposes global solving into local-part sub-motions from marker subsets, then aggregates them
- Aims to automate the labor-intensive MoCap cleaning and solving step in game production
- Context
- Targets automated optical MoCap solving for game animation, building on marker-cleanup lineage such as Perepichka et al.'s kinematic-reference trajectory repair. Builds on: Robust Marker Trajectory Repair for MOCAP Using Kinematic Reference
- Correctness
- The local-then-global decomposition assumes marker subsets map cleanly to body parts; accuracy gains are reported relative to existing solving methods, so a reader should check the evaluation protocol and how labeling effort is actually reduced.
- Clarity
- Reasonably accessible; a first pass conveys the divide-and-conquer idea, a second pass clarifies the graph aggregation and marker-subset design.
- How to read it
- Focus on how marker subsets are defined and how the graph network aggregates sub-motions; a second pass is worth it for the architecture and accuracy comparison.
Retargeting
- MaskedMimic: Unified Physics-Based Character Control Through Masked Motion Inpainting · SIGGRAPH Asia · Academic · 138 cites
Frames physics-based character control as motion inpainting so one model handles keyframes, objects, text, and sensor constraints.
abstract
Crafting a single, versatile physics-based controller that can breathe life into interactive characters across a wide spectrum of scenarios represents an exciting frontier in character animation. An ideal controller should support diverse control modalities, such as sparse target keyframes, text instructions, and scene information. While previous works have proposed physically simulated, scene-aware control models, these systems have predominantly focused on developing controllers that each specializes in a narrow set of tasks and control modalities. This work presents MaskedMimic, a novel approach that formulates physics-based character control as a general motion inpainting problem. Our key insight is to train a single unified model to synthesize motions from partial (masked) motion descriptions, such as masked keyframes, objects, text descriptions, or any combination thereof. This is achieved by leveraging motion tracking data and designing a scalable training method that can effectively utilize diverse motion descriptions to produce coherent animations. Through this process, our approach learns a physics-based controller that provides an intuitive control interface without requiring tedious reward engineering for all behaviors of interest. The resulting controller supports a wide range of control modalities and enables seamless transitions between disparate tasks.
Related SuperPADL: Scaling Language-Directed Physics-Based Control with Progressive Supervised Distillation · CLoSD: Closing the Loop between Simulation and Diffusion for Multi-Task Character Control · MoConVQ: Unified Physics-Based Motion Control via Scalable Discrete Representations · Generating Diverse and Natural 3D Human Motions from Text
how to read this
- Category
- Method: a unified physics-based character controller
- Contributions
-
- MaskedMimic, formulating physics-based control as motion inpainting from partial (masked) descriptions
- A single model handling masked keyframes, objects, text, and combinations as control modalities
- A scalable training method using motion-tracking data to produce coherent controllable animation
- Context
- Extends adversarial and latent-space physics-based control (e.g. Tessler et al.'s CALM) toward one general controller spanning many control modalities. Builds on: CALM: Conditional Adversarial Latent Models for Directable Virtual Characters
- Correctness
- The inpainting framing assumes diverse partial descriptions can be unified under masking; it is demonstrated on physically simulated control across modalities, so a reader should note which task mixes and constraint types are actually shown versus claimed generality.
- Clarity
- Accessible at a conceptual level; a first pass conveys the masking idea, a second pass for the masking scheme and training pipeline.
- How to read it
- Read the masking formulation and how modalities are encoded first; a second pass pays off for the training method and the breadth of demonstrated tasks.
Motion Synthesis
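The "partial (masked) motion description" in the MaskedMimic entry above can be pictured as a target motion with most entries hidden, plus a visibility mask that tells the model which entries are real constraints. The sketch below only builds such an input for the masked-keyframe case; the function name, the data layout, and the choice of zero as the fill value are assumptions for illustration and are not taken from the paper.

```python
def mask_keyframes(frames, keep_frames, keep_joints):
    """Hide every joint target except the chosen joints on the chosen frames.

    frames: list of frames, each a list of (x, y, z) joint positions.
    Returns (masked_values, visibility), where visibility is 1 for entries
    the controller is asked to match and 0 for entries left free.
    """
    masked, visibility = [], []
    for f, frame in enumerate(frames):
        row_values, row_visible = [], []
        for j, position in enumerate(frame):
            keep = f in keep_frames and j in keep_joints
            # Hidden entries get a placeholder value; the mask marks them as unknown.
            row_values.append(position if keep else (0.0, 0.0, 0.0))
            row_visible.append(1 if keep else 0)
        masked.append(row_values)
        visibility.append(row_visible)
    return masked, visibility


# Three frames, two joints each (hypothetical data).
motion = [
    [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    [(0.1, 0.0, 0.0), (0.1, 1.0, 0.0)],
    [(0.2, 0.0, 0.0), (0.2, 1.0, 0.0)],
]
# Constrain only joint 1 on the first and last frames (sparse keyframes).
masked, visibility = mask_keyframes(motion, keep_frames={0, 2}, keep_joints={1})
```

Different choices of `keep_frames` and `keep_joints` produce different control problems from the same data, which is one way to read the entry's claim that a single model can cover several modalities.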
- MoConVQ: Unified Physics-Based Motion Control via Scalable Discrete Representations · TOG · Academic · 82 cites
VQ-VAE with model-based RL learns scalable discrete motion codes enabling universal tracking, interactive control, and GPT-based motion generation.
abstract
In this work, we present MoConVQ, a novel unified framework for physics-based motion control leveraging scalable discrete representations. Building upon vector quantized variational autoencoders (VQ-VAE) and model-based reinforcement learning, our approach effectively learns motion embeddings from a large, unstructured dataset spanning tens of hours of motion examples. The resultant motion representation not only captures diverse motion skills but also offers a robust and intuitive interface for various applications. We demonstrate the versatility of MoConVQ through several applications: universal tracking control from various motion sources, interactive character control with latent motion representations using supervised learning, physics-based motion generation from natural language descriptions using the GPT framework, and, most interestingly, seamless integration with large language models (LLMs) with in-context learning to tackle complex and abstract tasks.
Related MaskedMimic: Unified Physics-Based Character Control Through Masked Motion Inpainting · TEMOS: Generating Diverse Human Motions from Textual Descriptions · PDP: Physics-Based Character Animation via Diffusion Policy · MuscleVAE: Model-Based Controllers of Muscle-Actuated Characters
how to read this
- Category
- Method: a discrete motion representation for physics-based control
- Contributions
-
- MoConVQ, a unified physics-based control framework using VQ-VAE motion embeddings learned via model-based RL
- Learns motion codes from a large unstructured dataset spanning tens of hours of examples
- Enables universal tracking, latent interactive control, GPT-based motion generation, and LLM integration
- Context
- Builds on model-based learning of physics-based controllers (e.g. Yao et al.'s ControlVAE) by replacing continuous latents with scalable discrete VQ-VAE codes. Builds on: ControlVAE: Model-Based Learning of Generative Controllers for Physics-Based Characters
- Correctness
- Assumes a discrete code space can capture diverse skills while staying controllable; versatility is shown across several applications, so a reader should gauge how robust each application is rather than treating breadth as uniform quality.
- Clarity
- Moderately technical; a first pass conveys the discrete-representation idea, a second pass for the VQ-VAE plus model-based RL formulation.
- How to read it
- Focus on the discrete representation and how it interfaces with tracking, RL, and the GPT/LLM layers; a second and possibly third pass help if you care about the training detail.
Motion Synthesis
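The discrete representation in the MoConVQ entry above rests on the vector-quantization step of a VQ-VAE: each continuous latent is replaced by its nearest entry in a learned codebook, so a motion becomes a sequence of integer codes. The sketch below shows only that lookup, with a tiny hand-written codebook; the codebook values and function names are assumptions, and the encoder, decoder, and training losses are left out.

```python
def quantize(latent, codebook):
    """Return (index, code) of the codebook entry nearest to the latent."""
    def squared_distance(i):
        return sum((a - b) ** 2 for a, b in zip(latent, codebook[i]))
    index = min(range(len(codebook)), key=squared_distance)
    return index, codebook[index]


def encode_sequence(latents, codebook):
    """Map a sequence of continuous latents to discrete code indices."""
    return [quantize(z, codebook)[0] for z in latents]


def decode_indices(indices, codebook):
    """Look the codes back up; a real decoder would consume these vectors."""
    return [codebook[i] for i in indices]


# Hypothetical three-entry codebook and three encoder outputs.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
latents = [[0.9, 0.1], [0.1, 0.8], [0.2, 0.1]]
tokens = encode_sequence(latents, codebook)
quantized = decode_indices(tokens, codebook)
```

Once motion is expressed as integer tokens like these, a sequence model trained on tokens can generate or edit it, which is how the entry's GPT-based generation fits on top of the representation.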
-
Measurement-derived three-parameter anisotropic elastic shell model for feathers, capturing the barb and barbule stiffness contrast that prior cloth-strip approximations miss.
abstract
Feathers behave in a highly anisotropic way governed by their hierarchical microstructure of barbs clamped onto a rachis and linked by tiny barbules, which prior cloth-strip approximations fail to capture. Using measurement protocols on real feather samples, the authors find a linear orientation-dependent strain-stress relationship and an extreme ratio of stiffnesses between the barb and barbule directions, leading to a three-parameter anisotropic elastic shell model. They overcome the resulting numerical locking and ill-conditioning by aligning the mesh with barb directions and replacing the stiffest modes with an inextensibility constraint, then add anisotropic bending and demonstrate full-feather and bird-scale scenarios.
Related Biological Modeling of Feathers by Morphogenesis Simulation · A Biologically-Parameterized Feather Model · Animating Puss in Boots' Feather in Shrek 2 · Procedurally Generating Biologically Driven Feathers
how to read this
- Category
- Method: an anisotropic elastic shell model for feathers
- Contributions
-
- A measurement-derived three-parameter anisotropic elastic shell model capturing feather barb/barbule stiffness
- Numerical treatment that aligns the mesh with barb directions and replaces the stiffest modes with an inextensibility constraint to avoid locking
- Anisotropic bending plus full-feather and bird-scale demonstrations
- Context
- Grounded in thin-shell elasticity for graphics (e.g. Grinspun et al.'s Discrete Shells), specializing it to the strongly anisotropic microstructure of feathers that cloth-strip approximations miss. Builds on: Discrete Shells
- Correctness
- The linear orientation-dependent strain-stress assumption comes from measurements on real samples; validity is tied to those samples and the stated stiffness regime, and the locking/ill-conditioning fixes are necessary precisely because the anisotropy is extreme.
- Clarity
- Clearly motivated physically; a first pass conveys the anisotropy story, a second pass for the shell formulation and the numerical conditioning fixes.
- How to read it
- Read the measurement findings and the three-parameter model first; a second pass is worth it for the mesh alignment, inextensibility constraint, and anisotropic bending.
CFX
-
Explores the toolset behind the Game Animation Sample project, detailing how Motion Matching's pose search schema drives responsive character locomotion in UE5.4.
Motion Synthesis / Retargeting
- talk · MotorNerve: A Character Animation System Using Machine Learning (Presented by Tencent Games) · GDC · Industrial
Tencent TiMi Studio combined Motion Matching with a VAE-based in-betweening system encoding leg movement to reduce foot skating, enabling high-quality low-cost locomotion and transition animation in games.
Motion Synthesis / ML Deformation
-
Deep network generates multi-resolution skeleton-driven soft-body shapes at hard-real-time speeds (a few milliseconds at most, preferably under 1 ms) for real-time game applications.
abstract
We present a hard-real-time multi-resolution mesh shape deformation technique for skeleton-driven soft-body characters. Producing mesh deformations at multiple levels of detail is very important in many applications in computer graphics. Our work targets applications where the multi-resolution shapes must be generated at fast speeds ("hard-real-time", e.g., a few milliseconds at most and preferably under 1 millisecond), as commonly needed in computer games, virtual reality and Metaverse applications. We assume that the character mesh is driven by a skeleton, and that high-quality character shapes are available in a set of training poses originating from a high-quality (slow) rig such as volumetric FEM simulation. Our method combines multi-resolution analysis, mesh partition of unity, and neural networks, to learn the pre-skinning shape deformations in an arbitrary character pose. Combined with linear blend skinning, this makes it possible to reconstruct the training shapes, as well as interpolate and extrapolate them. Crucially, we simultaneously achieve this at hard real-time rates and at multiple mesh resolution levels. Our technique makes it possible to trade deformation quality for memory and computation speed, to accommodate the strict requirements of modern real-time systems. Furthermore, we propose memory layout and code improvements to boost computation speeds.
Related NeuroSkinning: Automatic Skin Binding for Production Characters with Deep Graph Networks · Fast and Deep Deformation Approximations · Implementing a Machine Learning Deformer for CG Crowds: Our Journey · Data-Driven Physics for Human Soft Tissue Animation
how to read this
- Category
- Method: a real-time neural pose-space deformation technique
- Contributions
-
- A multi-resolution mesh deformation method generating skeleton-driven soft-body shapes at hard-real-time rates (a few milliseconds at most, preferably under 1 ms)
- Combines multi-resolution analysis, mesh partition of unity, and neural networks to learn pre-skinning deformations
- Paired with linear blend skinning to reconstruct, interpolate, and extrapolate training poses across resolution levels
- Context
- Extends pose-space deformation and learned deformation approximation (Lewis et al.'s PSD and Bailey et al.'s FDDA) to multiple resolutions at real-time speeds. Builds on: Fast and Deep Deformation Approximations · Pose Space Deformation: A Unified Approach to Shape Interpolation and Skeleton-Driven Deformation
- Correctness
- Assumes high-quality training shapes from a slow rig (e.g. volumetric FEM) are available and that learned deformations generalize across poses; the real-time claim is the central result, so a reader should watch extrapolation quality and dependence on training-pose coverage.
- Clarity
- Accessible to graphics readers; a first pass conveys the goal and pipeline, a second pass for the multi-resolution and partition-of-unity formulation.
- How to read it
- Focus on how multi-resolution analysis combines with the neural pre-skinning model; a second pass pays off for the timing/quality tradeoff and generalization limits.
ML Deformation / Skinning
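The phrase "pre-skinning shape deformations ... combined with linear blend skinning" in the entry above means a pose-dependent offset is added to each rest-pose vertex before the usual weighted blend of bone transforms. The 2D sketch below shows that ordering only. `toy_corrective` is a hypothetical stand-in for the learned network, and the multi-resolution and partition-of-unity parts of the method are not modelled.

```python
import math


def bone_transform(angle, pivot):
    """2x3 affine matrix that rotates by `angle` (radians) around `pivot`."""
    c, s = math.cos(angle), math.sin(angle)
    px, py = pivot
    # R (v - p) + p  ==  R v + (p - R p)
    return ((c, -s, px - (c * px - s * py)),
            (s, c, py - (s * px + c * py)))


def apply_transform(transform, v):
    """Apply a 2x3 affine matrix to a 2D point."""
    return (transform[0][0] * v[0] + transform[0][1] * v[1] + transform[0][2],
            transform[1][0] * v[0] + transform[1][1] * v[1] + transform[1][2])


def toy_corrective(angle):
    """Hypothetical stand-in for the learned offset: a bulge that grows with bend."""
    return (0.0, 0.05 * abs(angle))


def skin_vertex(rest, delta, weights, transforms):
    """Linear blend skinning applied to the corrected rest position."""
    corrected = (rest[0] + delta[0], rest[1] + delta[1])  # pre-skinning offset
    x = y = 0.0
    for w, transform in zip(weights, transforms):
        px, py = apply_transform(transform, corrected)
        x += w * px
        y += w * py
    return (x, y)


# Two bones sharing one vertex equally; the second bone bends by 90 degrees.
bend = math.pi / 2
bones = [bone_transform(0.0, (0.0, 0.0)), bone_transform(bend, (0.0, 0.0))]
plain = skin_vertex((1.0, 0.0), (0.0, 0.0), (0.5, 0.5), bones)
corrected = skin_vertex((1.0, 0.0), toy_corrective(bend), (0.5, 0.5), bones)
```

Applying the offset before skinning means the correction rotates with the bones, so the network only has to learn what plain skinning gets wrong in the rest frame.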
- Muscles in Time: Learning to Understand Human Motion by Simulating Muscle Activations · NeurIPS · Academic · 5 cites
MinT: large-scale synthetic dataset of muscle activation sequences enriched from motion capture via OpenSim simulation, covering 227 subjects and 402 muscle strands.
abstract
Exploring the intricate dynamics between muscular and skeletal structures is pivotal for understanding human motion. This domain presents substantial challenges, primarily attributed to the intensive resources required for acquiring ground truth muscle activation data, resulting in a scarcity of datasets. In this work, we address this issue by establishing Muscles in Time (MinT), a large-scale synthetic muscle activation dataset. For the creation of MinT, we enriched existing motion capture datasets by incorporating muscle activation simulations derived from biomechanical human body models using the OpenSim platform, a common approach in biomechanics and human motion research. Starting from simple pose sequences, our pipeline enables us to extract detailed information about the timing of muscle activations within the human musculoskeletal system. Muscles in Time contains over nine hours of simulation data covering 227 subjects and 402 simulated muscle strands. We demonstrate the utility of this dataset by presenting results on neural network-based muscle activation estimation from human pose sequences with two different sequence-to-sequence architectures. Data and code are provided under https://simplexsigil.github.io/mint.
Related AddBiomechanics: Automating model scaling, inverse kinematics, and inverse dynamics from human motion data through sequential optimization · From Skin to Skeleton: Towards Biomechanically Accurate 3D Digital Humans · TEMOS: Generating Diverse Human Motions from Textual Descriptions · MoConVQ: Unified Physics-Based Motion Control via Scalable Discrete Representations
how to read this
- Category
- Dataset: a synthetic muscle-activation dataset
- Contributions
-
- Muscles in Time (MinT), a large-scale synthetic muscle-activation dataset built by enriching motion-capture with OpenSim simulations
- Over nine hours of simulation data covering 227 subjects and 402 simulated muscle strands
- Baseline muscle-activation estimation from pose sequences using two seq-to-seq architectures, with data and code released
- Context
- Addresses the scarcity of ground-truth muscle data using biomechanical body models, relating to musculature-driven motion work such as Ryu et al.'s functionality-driven musculature retargeting. Builds on: Functionality-Driven Musculature Retargeting
- Correctness
- Activations are simulated via OpenSim rather than measured, so the data inherits the biomechanical model's assumptions; a reader should treat it as a synthetic benchmark whose realism depends on the underlying musculoskeletal model.
- Clarity
- Accessible as a dataset paper; a first pass conveys scope and pipeline, a second pass for the simulation setup and baseline architectures.
- How to read it
- Read the dataset construction (OpenSim enrichment, coverage) and the baseline task definition first; a deep methods pass is only needed if you plan to train on or extend it.
Muscles / Motion Synthesis
-
Applies high-order continuous B-spline surfaces from isogeometric analysis to cloth deformation, reducing DOFs while capturing nonlinear behavior.
abstract
Physically based cloth simulation with nonlinear behaviors is studied in this article by employing isogeometric analysis (IGA) for the surface deformation in 3D space. State-of-the-art simulation techniques, which primarily rely on the triangular mesh to calculate physical points on the cloth directly, require a large number of degrees of freedom. An effective method for cloth deformation that employs high-order continuous B-spline surfaces dependent on control points is proposed. This method offers fewer degrees of freedom and superior smoothness. The deformation gradient on the high-order IGA element is then represented by the gradient of the B-spline function. An iterative method for solving the nonlinear optimization arising from the implicit integration and a direct implicit-explicit method are derived on the basis of elastic force calculation to improve efficiency. The knots of the representation are effectively utilized in collision detection and response to reduce the computational burden. Experiments on nonlinear cloth simulation demonstrate the superiority of the proposed method in performance and efficiency, achieving accurate, efficient, and stable deformation.
Related Dynamic Deformables: Implementation and Production Practicalities · Clean Cloth Inputs: Removing Character Self-Intersections with Volume Simulation · Strain Based Dynamics · Projective Dynamics: Fusing Constraint Projections for Fast Simulation
how to read this
- Category
- Method: an isogeometric cloth simulation technique
- Contributions
- A nonlinear cloth simulation using isogeometric analysis with high-order continuous B-spline surfaces driven by control points
- Fewer degrees of freedom and smoother surfaces than triangular-mesh approaches, with deformation gradient from the B-spline function
- Iterative implicit and direct implicit/explicit solvers, plus knot-based collision detection and response
- Context
- Applies isogeometric analysis to cloth, relating to the broader physically based contact/deformation lineage such as Li et al.'s Incremental Potential Contact. Builds on: Incremental Potential Contact: Intersection- and Inversion-free Large-Deformation Dynamics
- Correctness
- Assumes B-spline control-point surfaces represent cloth deformation well enough to trade triangle DOFs for smoothness; efficiency and accuracy are shown on nonlinear cloth experiments, so a reader should check which scenarios and contact cases are covered.
- Clarity
- Technical; a first pass conveys the IGA-for-cloth idea and DOF benefit, a second pass for the B-spline deformation gradient and solver derivations.
- How to read it
- Focus on the B-spline representation and the DOF/smoothness argument; a second pass is worth it for the solvers and knot-based collision handling.
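The DOF argument in this entry rests on evaluating the cloth surface from a small grid of control points. A minimal sketch of that evaluation, assuming a clamped knot vector and a tensor-product B-spline surface: the names `basis`, `surface_point`, `KNOTS`, and `CTRL` are hypothetical, and this is the textbook Cox-de Boor recursion rather than the paper's implementation.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def basis(i: int, p: int, u: float, knots: Sequence[float]) -> float:
    """Cox-de Boor recursion: i-th B-spline basis function of degree p at u."""
    if p == 0:
        # Half-open spans; the last non-empty span is closed so u == knots[-1] works.
        at_end = u == knots[-1] and knots[i] < knots[i + 1] == knots[-1]
        return 1.0 if knots[i] <= u < knots[i + 1] or at_end else 0.0
    value = 0.0
    left_width = knots[i + p] - knots[i]
    if left_width > 0.0:  # skip terms over zero-width (repeated) knot spans
        value += (u - knots[i]) / left_width * basis(i, p - 1, u, knots)
    right_width = knots[i + p + 1] - knots[i + 1]
    if right_width > 0.0:
        value += (knots[i + p + 1] - u) / right_width * basis(i + 1, p - 1, u, knots)
    return value


def surface_point(ctrl: List[List[Point]], p: int, knots: Sequence[float],
                  u: float, v: float) -> Point:
    """Tensor-product B-spline surface: basis-weighted sum of control points."""
    x = y = z = 0.0
    for i, row in enumerate(ctrl):
        bu = basis(i, p, u, knots)
        for j, (cx, cy, cz) in enumerate(row):
            w = bu * basis(j, p, v, knots)
            x, y, z = x + w * cx, y + w * cy, z + w * cz
    return (x, y, z)


# Hypothetical 4x4 control grid (16 points, 48 DOFs) for a flat square patch.
DEGREE = 2
KNOTS = [0.0, 0.0, 0.0, 0.5, 1.0, 1.0, 1.0]  # clamped: len = 4 + DEGREE + 1
CTRL = [[(i / 3.0, j / 3.0, 0.0) for j in range(4)] for i in range(4)]

# Any number of cloth sample points can be evaluated from the same 16 controls.
samples = [surface_point(CTRL, DEGREE, KNOTS, a / 10.0, b / 10.0)
           for a in range(11) for b in range(11)]
```

Here 121 surface samples are driven by 16 control points, which is the sense in which the representation trades per-vertex unknowns for control-point unknowns. The deformation gradient, solvers, and knot-based collision handling described in the abstract are not covered by this sketch.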
CFX
-
Diffusion policy for physics-based animation trained with RL corrective actions, demonstrated on perturbation recovery, universal motion tracking, and text-to-motion synthesis.
abstract
Generating diverse and realistic human motion that can physically interact with an environment remains a challenging research area in character animation. Meanwhile, diffusion-based methods, as proposed by the robotics community, have demonstrated the ability to capture highly diverse and multi-modal skills. However, naively training a diffusion policy often results in unstable motions for high-frequency, under-actuated control tasks like bipedal locomotion due to rapidly accumulating compounding errors, pushing the agent away from optimal training trajectories. The key idea lies in using RL policies not just for providing optimal trajectories but for providing corrective actions in sub-optimal states, which gives the policy a chance to correct for errors caused by environmental stimuli, model errors, or numerical errors in simulation. Our method, Physics-Based Character Animation via Diffusion Policy (PDP), combines reinforcement learning (RL) and behavior cloning (BC) to create a robust diffusion policy for physics-based character animation. We demonstrate PDP on perturbation recovery, universal motion tracking, and physics-based text-to-motion synthesis.
Related MotionCLIP: Exposing Human Motion Generation to CLIP Space · CLoSD: Closing the Loop between Simulation and Diffusion for Multi-Task Character Control · SuperPADL: Scaling Language-Directed Physics-Based Control with Progressive Supervised Distillation · Generating Diverse and Natural 3D Human Motions from Text
how to read this
- Category
- Method: a diffusion policy for physics-based animation
- Contributions
- PDP, a diffusion policy for physics-based character animation combining reinforcement learning and behavior cloning
- Uses RL policies to provide corrective actions in sub-optimal states, countering compounding errors in under-actuated control
- Demonstrated on perturbation recovery, universal motion tracking, and physics-based text-to-motion synthesis
- Context
- Brings robotics-style diffusion policies into physics-based character control, relating to motion diffusion models such as Tevet et al.'s Human Motion Diffusion Model. Builds on: Human Motion Diffusion Model
- Correctness
- Rests on the idea that RL-provided corrective actions stabilize an otherwise unstable diffusion policy for high-frequency control; results are shown on the three named tasks, so a reader should note stability still depends on the RL correction design.
- Clarity
- Accessible if you know diffusion policies; a first pass conveys the RL-plus-BC insight, a second pass for the corrective-action training detail.
- How to read it
- Read why naive diffusion policies are unstable and how corrective actions fix it; a second pass pays off for the RL/BC combination and the per-task setups.
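The corrective-action idea can be illustrated with a toy data-collection loop: perturb the action that is executed so the rollout drifts off the ideal trajectory, but store the expert's clean action for each visited state. This is a minimal sketch under loud assumptions: `expert_policy`, `step`, and the 1D dynamics are hypothetical stand-ins for a trained RL policy and a physics simulator, and a least-squares fit replaces the diffusion policy. It is not PDP's actual sampling strategy.

```python
import random
from typing import Callable, List, Tuple


def expert_policy(x: float) -> float:
    """Stand-in for a trained RL policy (hypothetical): drive the state toward 0."""
    return -0.5 * x


def step(x: float, a: float) -> float:
    """Toy 1D dynamics standing in for the physics simulator."""
    return x + a


def collect_corrective_pairs(expert: Callable[[float], float], n_steps: int,
                             noise_std: float, seed: int = 0) -> List[Tuple[float, float]]:
    """Roll out with noisy actions, but label every visited state with the clean one."""
    rng = random.Random(seed)
    x = 1.0
    data = []
    for _ in range(n_steps):
        a_clean = expert(x)                    # label: what the expert would do here
        data.append((x, a_clean))
        a_noisy = a_clean + rng.gauss(0.0, noise_std)
        x = step(x, a_noisy)                   # executed action is perturbed
    return data


def fit_linear_student(data: List[Tuple[float, float]]) -> float:
    """Behavior-cloning stand-in: least-squares gain k for the model a = k * x."""
    num = sum(x * a for x, a in data)
    den = sum(x * x for x, _ in data)
    return num / den


pairs = collect_corrective_pairs(expert_policy, n_steps=50, noise_std=0.3)
student_gain = fit_linear_student(pairs)
```

Because the noise pushes the rollout into states the expert would not reach on its own, the cloned policy sees how to recover from them, which is the property the abstract attributes to RL-provided corrective actions.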
Motion Synthesis
-
PCA-based frequency-domain strand disentanglement separates global hair structure from local curl patterns for precise editing and reconstruction as a generic prior.
abstract
We present Perm, a learned parametric representation of human 3D hair designed to facilitate various hair-related applications. Unlike previous work that jointly models the global hair structure and local curl patterns, we propose to disentangle them using a PCA-based strand representation in the frequency domain, thereby allowing more precise editing and output control. Specifically, we leverage our strand representation to fit and decompose hair geometry textures into low- to high-frequency hair structures, termed guide textures and residual textures, respectively. These decomposed textures are later parameterized with different generative models, emulating common stages in the hair grooming process. We conduct extensive experiments to validate the architecture design of Perm, and finally deploy the trained model as a generic prior to solve task-agnostic problems, further showcasing its flexibility and superiority in tasks such as single-view hair reconstruction, hairstyle editing, and hair-conditioned image generation. More details can be found on our project page: https://cs.yale.edu/homes/che/projects/perm/.
Related GroomGen: A High-Quality Generative Hair Model Using Hierarchical Latent Representations · HAAR: Text-Conditioned Generative Model of 3D Strand-Based Human Hairstyles · HairNet: Single-View Hair Reconstruction Using Convolutional Neural Networks · Motion Guided Deep Dynamic 3D Garments
how to read this
- Category
- Method: a parametric representation for 3D hair
- Contributions
- Perm, a learned parametric hair representation that disentangles global structure from local curl via a PCA-based frequency-domain strand model
- Decomposes hair geometry textures into low-frequency guide textures and high-frequency residual textures, parameterized by separate generative models
- Deployed as a generic prior for single-view reconstruction, hairstyle editing, and hair-conditioned image generation
- Context
- Builds on generative hair modeling with hierarchical latents (e.g. Zhou et al.'s GroomGen), proposing frequency-domain disentanglement for more precise control. Builds on: GroomGen: A High-Quality Generative Hair Model Using Hierarchical Latent Representations
- Correctness
- Assumes PCA in the frequency domain cleanly separates structure from curl and that this prior transfers across tasks; validity rests on the architecture ablations and the demonstrated applications, so a reader should gauge per-task quality rather than assume uniform superiority.
- Clarity
- Fairly accessible; a first pass conveys the disentanglement idea, a second pass for the strand frequency representation and the staged generative models.
- How to read it
- Focus on the strand representation and the guide/residual texture split; a second pass is worth it for the generative-model stages and how the prior is applied per task.
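The guide/residual split can be pictured as a low-pass and high-pass decomposition of a strand. The sketch below is only that frequency split for one coordinate of one strand, treated as a periodic 1D signal (real strands are not periodic). Perm's PCA-based strand representation and its texture fitting are not reproduced, and `split_strand` and `n_low` are hypothetical names.

```python
import cmath
import math
from typing import List, Tuple


def dft(xs: List[float]) -> List[complex]:
    """Naive discrete Fourier transform (O(n^2), fine for a short strand)."""
    n = len(xs)
    return [sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(xs))
            for k in range(n)]


def idft(cs: List[complex]) -> List[float]:
    """Inverse DFT, returning the real part of each sample."""
    n = len(cs)
    return [sum(c * cmath.exp(2j * math.pi * k * t / n) for k, c in enumerate(cs)).real / n
            for t in range(n)]


def split_strand(xs: List[float], n_low: int) -> Tuple[List[float], List[float]]:
    """Split a signal into low-frequency (guide) and high-frequency (residual) parts."""
    cs = dft(xs)
    n = len(cs)
    # Bin k and bin n - k carry the same frequency for a real signal.
    low = [c if min(k, n - k) <= n_low else 0 for k, c in enumerate(cs)]
    high = [c - l for c, l in zip(cs, low)]
    return idft(low), idft(high)


# Hypothetical strand coordinate: one broad bend plus a small, fast curl.
N = 32
strand = [math.cos(2 * math.pi * t / N) + 0.1 * math.cos(2 * math.pi * 8 * t / N)
          for t in range(N)]
guide, residual = split_strand(strand, n_low=2)
```

Editing `guide` alone changes the overall bend while leaving the curl untouched, which is the kind of control the disentanglement is meant to give.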
CFX / ML Deformation
-
Hierarchical control with a muscle layer and trajectory layer reconstructs physically plausible character motion from monocular video.
abstract
We propose a novel method that combines human pose estimation and physical simulation of character animation. Our approach allows characters to learn from the actor's skills captured in videos and subsequently reconstruct the motions with high fidelity in a physically simulated environment. First, we model the character based on the human musculoskeletal system and build a complete dynamics model of the proposed system using the Lagrange equations of motion. Next, we employ the pose estimation method to process the input video and generate human reference motion. Finally, we design a hierarchical control framework consisting of a trajectory tracking layer and a muscle control layer. The trajectory tracking layer aims to minimize the difference between the reference motion pose and the actual output pose, while the muscle control layer aims to minimize the difference between the target torque and the actual output muscle force. The two layers interact by passing parameters through a proportional-derivative (PD) controller until the desired learning objective is achieved. A series of experiments demonstrates that our proposed method can learn to produce high-quality motions with high similarity to the source videos across different complexity levels and remains stable in the presence of muscle contracture weakness perturbations.
Related Physics-Based Character Controllers Using Conditional VAEs · DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills · Generative GaitNet · Scalable Muscle-Actuated Human Simulation and Control
how to read this
- Category
- Method: physics-based motion reconstruction from video
- Contributions
- A method combining pose estimation with physics simulation to reconstruct character motion from monocular video
- A musculoskeletal character with full dynamics built from the Lagrange equations of motion
- A hierarchical controller with a trajectory-tracking layer and a muscle-control layer coupled via a PD controller
- Context
- Joins video-based pose estimation with muscle-actuated simulation, relating to scalable muscle-actuated control such as Lee et al.'s scalable muscle-actuated human simulation. Builds on: Scalable Muscle-Actuated Human Simulation and Control
- Correctness
- Quality is bounded by the monocular pose-estimation reference and the fidelity of the musculoskeletal dynamics model; results are reported on complex experiments, so a reader should note dependence on the input pose estimate and the two-layer convergence behavior.
- Clarity
- Moderately technical; a first pass conveys the pipeline and two-layer control, a second pass for the Lagrangian dynamics and the layer interaction.
- How to read it
- Read the hierarchical control framework and how the trajectory and muscle layers exchange parameters; a second pass pays off for the dynamics derivation and reconstruction fidelity.
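The coupling between the two layers goes through a PD controller, which turns a pose error into a target torque. A minimal sketch of that one step, assuming a single joint with unit inertia and hand-picked gains: `pd_torque`, `track`, and every numeric value are hypothetical, and the muscle layer that would realize the torque with muscle forces is not modeled.

```python
def pd_torque(q_ref: float, dq_ref: float, q: float, dq: float,
              kp: float, kd: float) -> float:
    """PD control law: torque from position error and velocity error."""
    return kp * (q_ref - q) + kd * (dq_ref - dq)


def track(q_ref: float, steps: int, dt: float = 0.01,
          kp: float = 100.0, kd: float = 20.0, inertia: float = 1.0) -> float:
    """Drive one joint from rest toward a constant reference angle."""
    q = 0.0
    dq = 0.0
    for _ in range(steps):
        tau = pd_torque(q_ref, 0.0, q, dq, kp, kd)
        dq += tau / inertia * dt   # semi-implicit Euler: update velocity first
        q += dq * dt               # then integrate position with the new velocity
    return q


# kd = 2 * sqrt(kp * inertia) gives critical damping for this toy joint.
final_angle = track(q_ref=1.0, steps=500)
```

In the paper's framework the reference comes from the video pose estimate and the torque is a target for the muscle-control layer. Here the torque is applied directly so the tracking behavior can be seen in isolation.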
Muscles / Motion Synthesis
-
Overview of rig solutions for Inside Out 2, including curvenet cloth controls giving Animation direct garment control in individual shots.
abstract
The characters team on Pixar’s Inside Out 2 shares some of the technical & design challenges on our character rigs and presents the techniques used to solve them.
Related Shaping the Elements: Curvenet Animation Controls in Pixar's Elemental · It's a UVN Face Rig, Charlie Brown: Facial Techniques for Peanuts · Premo: Powerful Character Rigging, Fast Animation · Creating an Actor-Specific Facial Rig from Performance Capture
how to read this
- Category
- Production talk / character-rig breakdown
- Contributions
- Shares technical and design challenges on the Inside Out 2 character rigs
- Presents curvenet cloth controls that give animators direct, per-shot garment control
- Context
- A studio rigging breakdown that extends Pixar's articulation work, building on Character Articulation through Profile Curves (de Goes 2022). Builds on: Character Articulation through Profile Curves
- Correctness
- Studio practice rather than peer-reviewed research; the techniques are production-proven on one specific film, so generality to other rigs or pipelines is not claimed.
- Clarity
- Accessible to riggers and TDs; a single first pass conveys the solutions, with figures doing most of the explaining.
- How to read it
- One pass is enough; focus on the curvenet cloth-control idea and how it hands per-shot authority to animation, and note which problems motivated each rig solution.
Rigging
- ProbTalk3D: Non-Deterministic Emotion Controllable Speech-Driven 3D Facial Animation Synthesis Using VQ-VAE MIG Academic 19 cites
Two-stage VQ-VAE learns motion priors; probabilistic codebook sampling produces diverse non-deterministic emotion-controllable facial animations.
abstract
Audio-driven 3D facial animation synthesis has been an active field of research with attention from both academia and industry. While there are promising results in this area, recent approaches largely focus on lip-sync and identity control, neglecting the role of emotions and emotion control in the generative process. That is mainly due to the lack of emotionally rich facial animation data and algorithms that can synthesize speech animations with emotional expressions at the same time. In addition, the majority of the models are deterministic, meaning that given the same audio input, they produce the same output motion. We argue that emotions and non-determinism are crucial to generate diverse and emotionally-rich facial animations. In this paper, we propose ProbTalk3D, a non-deterministic neural network approach for emotion controllable speech-driven 3D facial animation synthesis using a two-stage VQ-VAE model and an emotionally rich facial animation dataset, 3DMEAD. We provide an extensive comparative analysis of our model against the recent 3D facial animation synthesis approaches, by evaluating the results objectively, qualitatively, and with a perceptual user study. We highlight several objective metrics that are more suitable for evaluating stochastic outputs and use both in-the-wild and ground truth data for subjective evaluation.
Related FaceDiffuser: Speech-Driven 3D Facial Animation Synthesis Using Diffusion · FaceFormer: Speech-Driven 3D Facial Animation with Transformers · Capture, Learning, and Synthesis of 3D Speaking Styles · CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior
how to read this
- Category
- Method: speech-driven 3D facial animation synthesis (generative)
- Contributions
- A non-deterministic, emotion-controllable approach to audio-driven 3D facial animation
- A two-stage VQ-VAE that learns motion priors and samples the codebook for diverse outputs
- Comparative analysis (objective, qualitative, and a perceptual user study) against recent methods, using the emotionally rich 3DMEAD dataset
- Context
- Addresses the lip-sync and identity focus of prior transformer-based work such as FaceFormer (Fan 2022) by adding emotion control and non-determinism. Builds on: FaceFormer: Speech-Driven 3D Facial Animation with Transformers
- Correctness
- Validated against recent synthesis methods on the 3DMEAD emotional dataset via objective, qualitative, and perceptual evaluations; results are conditioned on that dataset's emotion coverage, so generalization beyond its expressions and identities should be read with care.
- Clarity
- Reasonably accessible if you know VQ-VAE; a first pass gives the idea, a second pass is needed for the two-stage formulation and sampling scheme.
- How to read it
- First pass for the motivation (emotion plus diversity) and the two-stage architecture; do a second pass on the codebook sampling and the user-study setup if you care about how diversity is measured.
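The contrast between deterministic and non-deterministic codebook use can be sketched in a few lines. Standard VQ-VAE quantization picks the nearest code. One generic way to get diverse outputs is to sample a code from a softmax over negative distances. The second function is an assumed illustration, not ProbTalk3D's sampling scheme, and `CODEBOOK`, `sample_code`, and `temperature` are hypothetical.

```python
import math
import random
from typing import List, Sequence


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_code(z: Sequence[float], codebook: List[List[float]]) -> int:
    """Deterministic VQ step: index of the closest codebook entry."""
    return min(range(len(codebook)), key=lambda k: sq_dist(z, codebook[k]))


def sample_code(z: Sequence[float], codebook: List[List[float]],
                temperature: float, rng: random.Random) -> int:
    """Stochastic alternative: sample an index from softmax(-distance / temperature)."""
    logits = [-sq_dist(z, c) / temperature for c in codebook]
    top = max(logits)                      # subtract the max for numerical stability
    weights = [math.exp(l - top) for l in logits]
    return rng.choices(range(len(codebook)), weights=weights, k=1)[0]


# Hypothetical 2D codebook and encoder output.
CODEBOOK = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
z = [0.9, 0.1]
rng = random.Random(0)
draws = [sample_code(z, CODEBOOK, temperature=1.0, rng=rng) for _ in range(200)]
```

As the temperature approaches zero the sampler collapses onto `nearest_code`, so the same audio yields the same motion. Higher temperatures spread probability over neighboring codes and produce varied outputs.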
Facial / Motion Synthesis
-
Progressive simulation framework for cloth and thin shells dynamically balancing accuracy and speed during animation production workflows.
abstract
We propose Progressive Dynamics, a coarse-to-fine, level-of-detail simulation method for the physics-based animation of complex frictionally contacting thin shell and cloth dynamics. Progressive Dynamics provides tight-matching consistency and progressive improvement across levels, with comparable quality and realism to high-fidelity, IPC-based shell simulations [Li et al. 2021] at finest resolutions. Together these features enable an efficient animation-design pipeline with predictive coarse-resolution previews providing rapid design iterations for a final, to-be-generated, high-resolution animation. In contrast, previously, to design such scenes with comparable dynamics would require prohibitively slow design iterations via repeated direct simulations on high-resolution meshes. We evaluate and demonstrate Progressive Dynamics's features over a wide range of challenging stress-tests, benchmarks, and animation design tasks. Here Progressive Dynamics efficiently computes consistent previews at costs comparable to coarsest-level direct simulations. Its matching progressive refinements across levels then generate rich, high-resolution animations with high-speed dynamics, impacts, and the complex detailing of the dynamic wrinkling, folding, and sliding of frictionally contacting thin shells and fabrics.
Related Multi-Resolution Isotropic Strain Limiting · Codimensional Incremental Potential Contact · An Implicit Frictional Contact Solver for Adaptive Cloth Simulation · Continuum-based Strain Limiting
how to read this
- Category
- Method: level-of-detail physics simulation for cloth and thin shells
- Contributions
- Progressive Dynamics, a coarse-to-fine simulation method for frictionally contacting thin shells and cloth
- Tight-matching consistency and progressive refinement across levels, enabling fast coarse previews that refine to high-resolution animation
- Evaluation across stress-tests, benchmarks, and animation-design tasks
- Context
- A physics-based cloth and shell method in the discrete-shell and IPC-contact lineage, building on Discrete Shells (Grinspun 2003) and referencing IPC-based shell simulation (Li et al. 2021). Builds on: Discrete Shells
- Correctness
- Claims quality comparable to high-fidelity IPC-based simulation at the finest level while giving consistent coarse previews; consistency across levels is the central assumption, and a reader should keep in mind that preview cost is comparable to coarsest-level direct simulation rather than free.
- Clarity
- Dense; a first pass conveys the coarse-to-fine design intent, but the consistency and refinement guarantees need a careful second or third pass.
- How to read it
- First pass for the design-iteration motivation and the preview-to-final workflow; reserve a deeper pass for how cross-level consistency is enforced and how it relates to IPC contact.
CFX
-
Data-driven framework generating diverse kinematic in-between motions with explicit user control over duration, path, and style at runtime.
abstract
In this work, we present a data-driven framework for generating diverse in-betweening motions for kinematic characters. Our approach injects dynamic conditions and explicit motion controls into the procedure of motion transitions. Notably, this integration enables a finer-grained spatial-temporal control by allowing users to impart additional conditions, such as duration, path, style, etc., into the in-betweening process. We demonstrate that our in-betweening approach can synthesize both locomotion and unstructured motions, enabling rich, versatile, and high-quality animation generation.
Related Taming Diffusion Probabilistic Models for Character Control · WalkTheDog: Cross-Morphology Motion Alignment via Phase Manifolds · Neural Animation Layering for Synthesizing Martial Arts Movements · Learning Robust and Scalable Motion Matching with Lipschitz Continuity and Sparse Mixture of Experts
how to read this
- Category
- Method: data-driven motion in-betweening with user control
- Contributions
- A data-driven framework that generates diverse in-between motions for kinematic characters
- Injection of dynamic conditions and explicit controls (duration, path, style) for finer spatial-temporal control
- Synthesis of both locomotion and unstructured motion
- Context
- A controllable in-betweening method in the learned-transition lineage, building on Robust Motion In-Betweening (Harvey 2020). Builds on: Robust Motion In-Betweening
- Correctness
- Demonstrated on locomotion and unstructured motion with runtime controls; the abstract reports qualitative diversity and control rather than detailed quantitative comparison, so judge breadth of validation from the full paper.
- Clarity
- Short and approachable; a first pass conveys the controllability story well.
- How to read it
- First pass for what controls are exposed (duration, path, style) and how they are injected; a second pass only if you need the conditioning mechanism for real-time use.
Motion Synthesis
-
Affordable, fast acquisition pipeline that jointly recovers shape, spatially varying reflectance, and transmission of thin translucent objects such as leaves and paper via a two-phase optimization.
abstract
This paper tackles joint reconstruction of shape and appearance for thin translucent objects such as leaves and paper from real photographs, where light transmission complicates capture. It presents an affordable and fast acquisition pipeline that simultaneously recovers spatially varying reflectance and transmission through a two-phase optimization. The method reproduces the layered, translucent look of thin natural materials including feathers under varied lighting.
Related A Practical Extension to Microfacet Theory for the Modeling of Varying Iridescence · Microstructure-based Appearance Rendering for Feathers · Single-Shot High-Quality Facial Geometry and Skin Appearance Capture · A Biologically-Parameterized Feather Model
how to read this
- Category
- Method: shape-and-appearance reconstruction (inverse rendering) for thin translucent objects
- Contributions
- An affordable, fast acquisition pipeline for joint shape and appearance of thin translucent objects from real photos
- A two-phase optimization that recovers spatially varying reflectance and transmission together
- Reproduction of the layered translucent look of thin natural materials such as leaves, paper, and feathers under varied lighting
- Context
- An inverse-rendering capture method for thin translucent materials, related generally to spatially varying BRDF/BTDF acquisition and differentiable-rendering reconstruction (no specific prior works listed).
- Correctness
- Demonstrated on real photographs of thin natural materials where transmission complicates capture; results target thin objects specifically, so applicability to thick or strongly multiple-scattering media should not be assumed.
- Clarity
- Moderately technical; a first pass conveys the capture setup and goal, a second pass is needed for the two-phase optimization details.
- How to read it
- First pass for the acquisition setup and what reflectance-plus-transmission jointly buys you; deeper pass on the two-phase optimization if you intend to reproduce the pipeline.
CFX
- Refined Inverse Rigging: A Balanced Approach to High-fidelity Blendshape Animation SIGGRAPH Asia Academic 3 cites
Inverse rig solver combines l1 sparsity and temporal roughness penalties to produce high-fidelity, smooth blendshape weight sequences.
abstract
In this paper, we present an advanced approach to solving the inverse rig problem in blendshape animation, using high-quality corrective blendshapes. Our algorithm focuses on three key areas: ensuring high data fidelity in reconstructed meshes, achieving greater sparsity in weight distributions, and facilitating smoother frame-to-frame transitions. While the incorporation of corrective terms is a known practice, our method differentiates itself by employing a unique combination of l1 norm regularization for sparsity and a temporal smoothness constraint through roughness penalty, focusing on the sum of second differences in consecutive frame weights. A significant innovation in our approach is the temporal decoupling of blendshapes, which permits simultaneous optimization across entire animation sequences. This feature sets our work apart from existing methods and contributes to a more efficient and effective solution. Our algorithm exhibits a marked improvement in maintaining data fidelity and ensuring smooth frame transitions when compared to prior approaches that either lack smoothness regularization or rely solely on linear blendshape models. In addition to superior mesh resemblance and smoothness, our method offers practical benefits, including reduced computational complexity and execution time, achieved through a novel parallelization strategy using clustering methods.
Related Reusable Facial Rigging and Animation: Create Once, Use Many · Realtime Performance-Based Facial Animation · Example-Based Facial Rigging · DreamWorks Animation Facial Motion and Deformation System
how to read this
- Category
- Method: inverse-rig solver for blendshape animation
- Contributions
- An inverse-rig solver using high-quality corrective blendshapes that targets data fidelity, weight sparsity, and smooth frame-to-frame transitions
- A combination of l1 regularization for sparsity with a temporal roughness penalty (sum of second differences across frames)
- Temporal decoupling of blendshapes allowing simultaneous optimization over an entire sequence
- Context
- An optimization-based inverse-rig method extending the inverse rig-mapping line of work, building on Learning an Inverse Rig Mapping for Character Animation (Holden 2015). Builds on: Learning an Inverse Rig Mapping for Character Animation
- Correctness
- Reports improved fidelity and smoother transitions versus methods lacking smoothness regularization or using linear-only blendshapes; gains depend on having high-quality corrective blendshapes, which a reader should treat as a precondition.
- Clarity
- Mathematically framed; a first pass conveys the three objectives, but the regularization terms and decoupling need a second pass.
- How to read it
- First pass for the three goals (fidelity, sparsity, smoothness) and the l1-plus-roughness recipe; second pass on the temporal-decoupling formulation if you implement the solver.
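The l1-plus-roughness recipe can be written down as a single objective over a whole weight sequence. The sketch below makes two assumptions beyond the abstract: the fidelity term uses a plain linear blendshape model (the paper's corrective terms are omitted), and the roughness penalty squares the second differences. All function names, `alpha`, and `beta` are hypothetical, and no solver is shown.

```python
from typing import List

Matrix = List[List[float]]


def reconstruct(weights: List[float], deltas: Matrix) -> List[float]:
    """Linear blendshape offsets: sum over k of w_k * delta_k (correctives omitted)."""
    n_coords = len(deltas[0])
    return [sum(w * d[c] for w, d in zip(weights, deltas)) for c in range(n_coords)]


def fidelity(W: Matrix, targets: Matrix, deltas: Matrix) -> float:
    """Squared mesh error summed over all frames and coordinates."""
    return sum((r - t) ** 2
               for w_t, target in zip(W, targets)
               for r, t in zip(reconstruct(w_t, deltas), target))


def l1_norm(W: Matrix) -> float:
    """Sparsity term: sum of absolute weights over all frames."""
    return sum(abs(w) for frame in W for w in frame)


def roughness(W: Matrix) -> float:
    """Temporal term: squared second differences of each weight across frames."""
    return sum((W[t - 1][k] - 2.0 * W[t][k] + W[t + 1][k]) ** 2
               for t in range(1, len(W) - 1)
               for k in range(len(W[t])))


def objective(W: Matrix, targets: Matrix, deltas: Matrix,
              alpha: float, beta: float) -> float:
    """Fidelity + alpha * sparsity + beta * roughness for a full sequence."""
    return fidelity(W, targets, deltas) + alpha * l1_norm(W) + beta * roughness(W)


# Hypothetical data: 2 blendshapes, 2 mesh coordinates, 3 frames.
DELTAS = [[1.0, 0.0], [0.0, 1.0]]
smooth = [[0.0, 0.0], [0.5, 0.0], [1.0, 0.0]]   # linear ramp: zero roughness
jitter = [[0.0, 0.0], [1.0, 0.0], [0.0, 0.0]]   # spike in the middle frame
targets = [reconstruct(w, DELTAS) for w in smooth]
```

A linear ramp costs nothing under the roughness term while a one-frame spike is penalized, so `beta` trades frame-to-frame smoothness against fidelity and `alpha` trades it against sparsity.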
Facial / Rigging
-
Relative-feature modifications to GAIL's discriminator observation enable agile physics-based character control from a single motion cycle.
abstract
We present an approach for training "agile" character control policies, able to produce a wide variety of motor skills from a single reference motion cycle. Our technique builds on generative adversarial imitation learning (GAIL), with the key novelty being a modification to the observation map that improves agility and robustness. Namely, to support more agile behavior, we adjust the value measurements of the training discriminator through relative features, hence the name ReGAIL. Our state observations include both task relevant relative velocities and poses, as well as relative goal deviation information. In addition, to increase robustness of the resulting gaits, servo gains and damping values are included as part of the policy action to let the controller learn how to best combine tension and relaxation during motion. From a policy informed by a single reference motion, our resulting agent is able to maneuver as needed, at runtime, from walking forward to walking backward or sideways, turning and stepping nimbly. We demonstrate our approach for a humanoid and a quadruped, on both flat and sloped terrains, as well as provide ablation studies to validate the design choices of our framework.
Related AMP: Adversarial Motion Priors for Stylized Physics-Based Character Control · Near-Optimal Character Animation with Continuous Control · UniCon: Universal Neural Controller for Physics-Based Character Motion · Physics-Based Motion Retargeting from Sparse Inputs
how to read this
- Category
- Method: physics-based character control via imitation learning
- Contributions
- ReGAIL, training agile control policies from a single reference motion cycle
- A relative-feature modification to the GAIL discriminator's observation map for agility and robustness
- Including servo gains and damping in the policy action so the controller learns to combine tension and relaxation
- Context
- A physics-based control method in the adversarial-imitation lineage, building on AMP: Adversarial Motion Priors (Peng 2021) and GAIL. Builds on: AMP: Adversarial Motion Priors for Stylized Physics-Based Character Control
- Correctness
- Demonstrated on a humanoid and a quadruped across flat and sloped terrain with ablation studies; results come from a single reference cycle per character, so the variety of emergent skills is bounded by what that cycle and the simulation support.
- Clarity
- Accessible if familiar with RL/GAIL; a first pass conveys the relative-feature idea, a second pass for the observation and action design.
- How to read it
- First pass for the single-reference premise and the relative-feature trick; second pass on the discriminator observation map and the gain/damping action if you train policies yourself.
Motion Synthesis
-
Autodesk University class on integrating the Maya ML Deformer into production pipelines, covering pose generation, training workflows, parameter fine-tuning, and use cases for interactive deformation approximation.
ML Deformation / Rigging
- SKEL-Betweener: a Neural Motion Rig for Interactive Motion Authoring SIGGRAPH Asia Disney Research 12 cites
, , , ,
Neural motion inbetweening that reasons about skeleton structure for intuitive, physics-plausible gap filling between sparse keyframes.
abstract
Authoring 3D motions is a laborious process that requires manipulating and coordinating many control handles over time. Neural motion representations learned from large motion datasets have recently shown impressive capabilities in many motion completion tasks. However, current methods are not designed for interactive motion authoring workflows. The reasons being their requirement of a dense context of full poses, which takes considerable time to author, as well as their lack of joint-level controls for refinement. In this paper, we introduce a Neural Motion Rig called SKEL-Betweener, tailored to interactive motion authoring. SKEL-Betweener is able to generate long motion sequences from two poses only, and enables intermediate motion authoring via neural motion curves---intuitive joint-level controls for positions and orientations. Through user evaluations, we demonstrate the effectiveness of our Neural Motion Rig for efficiently creating and editing motions.
Related Pose and Skeleton-aware Neural IK for Pose and Motion Editing · Interactive Character Control with Auto-Regressive Motion Diffusion Models · Factorized Motion Diffusion for Precise and Character-Agnostic Motion Inbetweening · TEMOS: Generating Diverse Human Motions from Textual Descriptions
how to read this
- Category
- Method: neural motion rig for interactive in-betweening
- Contributions
- SKEL-Betweener, a neural motion rig tailored to interactive motion authoring
- Generation of long sequences from just two poses, removing the dense full-pose context other methods require
- Neural motion curves giving intuitive joint-level controls over positions and orientations for refinement
- Context
- An interactive-authoring take on learned in-betweening, building on Robust Motion In-Betweening (Harvey 2020) while adding skeleton-aware joint-level control. Builds on: Robust Motion In-Betweening
- Correctness
- Effectiveness shown through user evaluations of creating and editing motion; validation emphasizes authoring usability rather than large-scale quantitative motion benchmarks, so read the user study to judge the claims.
- Clarity
- Accessible and workflow-oriented; a first pass conveys the two-pose, joint-curve idea clearly.
- How to read it
- First pass for the authoring workflow (two poses plus neural motion curves) and the user-study findings; deeper pass only if you need the neural-rig representation internals.
Motion Synthesis / Rigging
- Sony Imageworks Animation Layout Workflow with Unreal Engine and OpenUSD SIGGRAPH Industrial 1 cites
Sony Pictures Imageworks rebuilds rough layout for feature films using Unreal Engine with OpenUSD export to other DCCs, reducing per-sequence export time from eight hours to five minutes.
abstract
This is an overview of the new rough layout pipeline at Sony Pictures Imageworks. In a notable departure from the legacy pipeline, sequence and shot-based work now begin in Unreal Engine. By exporting USD data out of Unreal Engine to share with other DCCs, we were able to reinvent the early stages of feature film production.
Related A.C.M.E. Multilimb System · Building Scalable and Evolutive USD Pipelines on Distributed Architecture at Ubisoft · Demystifying OpenUSD: End-to-End Data Pipelines and Workflows · A Deep Dive into Universal Scene Description and Hydra
how to read this
- Category
- Production talk / pipeline breakdown
- Contributions
- Demonstrates a rebuilt rough-layout pipeline that begins sequence and shot work in Unreal Engine
- Exports USD data out of Unreal Engine to share with other DCCs, reinventing early feature-film stages
- Reports per-sequence export time dropping from eight hours to five minutes
- Context
- A studio pipeline talk grounded in OpenUSD as the interchange layer, building on Universal Scene Description: Open Source Release (Pixar 2016). Builds on: Universal Scene Description: Open Source Release
- Correctness
- Studio practice, not peer-reviewed; the workflow and the eight-hours-to-five-minutes figure are production-proven at one studio and tied to its specific Unreal and USD setup.
- Clarity
- Accessible to pipeline and layout staff; a single pass conveys the architecture and the payoff.
- How to read it
- One pass; focus on where Unreal sits in the layout stage and how USD export bridges to other DCCs, and note the reported speedup as a motivation rather than a benchmark.
Rigging
-
Self-supervised personalized face capture reconstructing a relightable avatar from multiple unconstrained videos, with real-time tracking of unseen footage.
abstract
Feedforward monocular face capture methods seek to reconstruct posed faces from a single image of a person. Current state of the art approaches have the ability to regress parametric 3D face models in real-time across a wide range of identities, lighting conditions and poses by leveraging large image datasets of human faces. These methods however suffer from clear limitations in that the underlying parametric face model only provides a coarse estimation of the face shape, thereby limiting their practical applicability in tasks that require precise 3D reconstruction (aging, face swapping, digital make-up,...). In this paper, we propose a method for high-precision 3D face capture taking advantage of a collection of unconstrained videos of a subject as prior information. Our proposal builds on a two stage approach. We start with the reconstruction of a detailed 3D face avatar of the person, capturing both precise geometry and appearance from a collection of videos. We then use the encoder from a pre-trained monocular face reconstruction method, substituting its decoder with our personalized model, and proceed with transfer learning on the video collection. Using our pre-estimated image formation model, we obtain a more precise self-supervision objective, enabling improved expression and pose alignment.
Related EMOCA: Emotion Driven Monocular Face Capture and Animation · I M Avatar: Implicit Morphable Head Avatars from Videos · FLARE: Fast Learning of Animatable and Relightable Mesh Avatars · Learning a Model of Facial Shape and Expression from 4D Scans
how to read this
- Category
- Method: personalized monocular face capture and avatar reconstruction
- Contributions
- SPARK, a self-supervised method for high-precision personalized 3D face capture from a collection of unconstrained videos
- A two-stage approach: reconstruct a detailed relightable avatar (geometry and appearance), then transfer-learn a personalized decoder onto a pretrained monocular encoder
- Real-time tracking of unseen footage with the personalized model
- Context
- A personalization layer over feedforward monocular face reconstruction, building on Learning an Animatable Detailed 3D Face Model from In-The-Wild Images (DECA, Feng 2021). Builds on: Learning an Animatable Detailed 3D Face Model from In-The-Wild Images
- Correctness
- Aims to overcome the coarse shape of generic parametric models by exploiting per-subject video priors; precision therefore depends on having a suitable collection of videos per subject, a requirement to keep in mind versus single-image methods.
- Clarity
- Fairly technical; a first pass conveys the two-stage personalize-then-track idea, and a second pass covers the encoder swap and transfer-learning details.
- How to read it
- First pass for the avatar-then-tracker structure and why personalization beats a generic parametric model; second pass on the encoder substitution and transfer learning if you plan to reproduce it.
Facial
-
Official Autodesk video introducing the Maya ML Deformer, which uses machine learning to approximate complex character deformations for interactive playback speeds during animation, blocking, and crowd work.
abstract
Rigging supervisor Todd Whittep demonstrates Maya 2025.2's ML Deformer on a layered walrus-man rig, where segmented low-res geometry is skin clustered with corrective shapes and drives higher-resolution meshes via proximity wraps and blendshapes. He applies the ML Deformer, sets up the control collector by adding the translating and rotating joints that drive the skin cluster, and trains the system against an existing target mesh using a keyed range-of-motion rather than auto-generated random poses, exporting training data with offset delta mode and smoothing iterations. He then walks through the training UI parameters (batch size, epochs, validation ratio, learning rate, hidden layers, neurons per layer, dropout, and principal shapes accuracy), iterating across several solves to reduce crosstalk artifacts where one body part wrongly influences another. The talk shows how pose count drives deformation quality, how component tagging can split different body regions across separate ML solves, and how the trained deformer approximates costly muscle or cloth simulation results for major interactive playback speedups.
Related MetaHuman Framework and Machine Learning for Next-Gen Character Deformation · Delta Mush: Smoothing Deformations While Preserving Detail · Maya 2022: New Features for Rigging · FaceBaker: Baking Character Facial Rigs with Machine Learning
how to read this
- Category
- Production talk / tool walkthrough (ML-based deformation approximation)
- Contributions
- Demonstrates Maya 2025.2's ML Deformer learning a layered character rig's deformation for interactive playback
- Shows a practical training pipeline: control collector setup, keyed range-of-motion training data, offset delta mode with smoothing
- Walks through training UI parameters and component tagging to split body regions across separate solves and reduce crosstalk artifacts
- Context
- A vendor-tool realization of fast learned deformation approximation, in the lineage of Bailey et al.'s Fast and Deep Deformation Approximations, packaged inside Maya to replace costly muscle or cloth simulation at runtime. Builds on: Fast and Deep Deformation Approximations
- Correctness
- Studio/tool practice rather than peer-reviewed; quality is shown to depend on pose count and careful region splitting, and the demo itself notes crosstalk artifacts that require iterating across several solves, so results are approximations tied to the training range of motion.
- Clarity
- Very accessible as a screen-recorded demo; one pass conveys the workflow, with no formulation to chase.
- How to read it
- Watch for the data-prep and parameter choices (keyed ROM vs random poses, offset delta mode, epochs/neurons, component tagging); treat it as a how-to and revisit specific UI steps when you actually set up a solve, rather than for theory.
ML Deformation / Skinning
- SuperPADL: Scaling Language-Directed Physics-Based Control with Progressive Supervised Distillation SIGGRAPH Academic 22 cites
Progressive distillation from thousands of RL expert policies into a single real-time controller covering over 5,000 language-specified skills.
abstract
Physically-simulated models for human motion can generate high-quality responsive character animations, often in real-time. Natural language serves as a flexible interface for controlling these models, allowing expert and non-expert users to quickly create and edit their animations. Many recent physics-based animation methods, including those that use text interfaces, train control policies using reinforcement learning (RL). However, scaling these methods beyond several hundred motions has remained challenging. Meanwhile, kinematic animation models are able to successfully learn from thousands of diverse motions by leveraging supervised learning methods. Inspired by these successes, in this work we introduce SuperPADL, a scalable framework for physics-based text-to-motion that leverages both RL and supervised learning to train controllers on thousands of diverse motion clips. SuperPADL is trained in stages using progressive distillation, starting with a large number of specialized experts using RL. These experts are then iteratively distilled into larger, more robust policies using a combination of reinforcement learning and supervised learning. Our final SuperPADL controller is trained on a dataset containing over 5000 skills and runs in real time on a consumer GPU.
Related PADL: Language-Directed Physics-Based Character Control · CLoSD: Closing the Loop between Simulation and Diffusion for Multi-Task Character Control · MaskedMimic: Unified Physics-Based Character Control Through Masked Motion Inpainting · CALM: Conditional Adversarial Latent Models for Directable Virtual Characters
how to read this
- Category
- Method: scalable language-directed physics-based character control
- Contributions
- Introduces SuperPADL, a framework combining RL and supervised learning to scale physics-based text-to-motion to thousands of clips
- Uses progressive distillation, training many RL experts then iteratively distilling them into larger, more robust policies
- Yields a single controller trained on over 5,000 language-specified skills that runs in real time on a consumer GPU
- Context
- Extends language-directed physics-based control from Juravsky et al.'s PADL, borrowing the scaling intuition of supervised kinematic models to push RL controllers past the usual few-hundred-motion ceiling. Builds on: PADL: Language-Directed Physics-Based Character Control
- Correctness
- Centers on the assumption that distillation preserves expert skill while improving robustness and coverage; validated on a dataset of over 5,000 skills, but the abstract is truncated, so read the paper for the metrics and any motions where distillation degrades fidelity.
- Clarity
- Reasonably accessible at a high level; a first pass conveys the staged-distillation idea, a second pass is needed for the RL plus supervised loss details.
- How to read it
- Focus on the staged pipeline (expert RL -> iterative distillation) and why supervised signal is mixed in; do a second pass on the distillation objective and evaluation if you care about reproducing the scaling.
Motion Synthesis
- SwinGar: Spectrum-Inspired Neural Dynamic Deformation for Free-Swinging Garments TVCG Academic 10 cites
Frequency-domain supervision enables a unified neural model to generate dynamic deformations for garments with arbitrary topology and looseness.
abstract
Our work presents a novel spectrum-inspired learning-based approach for generating clothing deformations with dynamic effects and personalized details. Existing methods in the field of clothing animation are limited to either static behavior or specific network models for individual garments, which hinders their applicability in real-world scenarios where diverse animated garments are required. Our proposed method overcomes these limitations by providing a unified framework that predicts dynamic behavior for different garments with arbitrary topology and looseness, resulting in versatile and realistic deformations. First, we observe that the problem of bias towards low frequency always hampers supervised learning and leads to overly smooth deformations. To address this issue, we introduce a frequency-control strategy from a spectral perspective that enhances the generation of high-frequency details of the deformation. In addition, to make the network highly generalizable and able to learn various clothing deformations effectively, we propose a spectral descriptor to achieve a generalized description of the global shape information. Building on the above strategies, we develop a dynamic clothing deformation estimator that integrates graph attention mechanisms with long short-term memory.
Related PBNS: Physically Based Neural Simulation for Unsupervised Garment Pose Space Deformation · SNUG: Self-Supervised Neural Dynamic Garments · A Pixel-Based Framework for Data-Driven Clothing · Motion Guided Deep Dynamic 3D Garments
how to read this
- Category
- Method: learning-based dynamic garment deformation
- Contributions
- A unified neural framework predicting dynamic deformation for garments with arbitrary topology and looseness
- A frequency-control strategy from a spectral perspective to counter low-frequency bias and recover high-frequency detail
- A spectral descriptor giving a generalized global shape description for cross-garment generalization
- Context
- Builds on learning-based clothing animation such as Santesteban et al.'s virtual try-on work, generalizing beyond per-garment networks and static behavior toward one model for diverse free-swinging garments. Builds on: Learning-Based Animation of Clothing for Virtual Try-On
- Correctness
- Rests on the observation that supervised learning is biased toward low frequencies and produces oversmoothed results; the spectral strategy targets that, but as a learned approximation it is validated on the authors' garment set and one should check how it holds for unseen topologies and extreme looseness.
- Clarity
- Moderately technical due to the spectral framing; a first pass conveys the motivation, a second pass is needed for the frequency-control and descriptor formulation.
- How to read it
- Focus on why low-frequency bias matters and how the spectral descriptor enables topology-agnostic generalization; a second pass on the frequency-domain supervision pays off if you work on neural cloth.
CFX / ML Deformation
- TailorMe: Self-Supervised Learning of an Anatomically Constrained Volumetric Human Shape Model Eurographics Academic 2 cites code ↗
Self-supervised volumetric body shape model fitting an anatomical template (bones and soft tissue) to surface scans, enabling localized shape manipulation and fast inference.
abstract
Human shape spaces have been extensively studied, as they are a core element of human shape and pose inference tasks. Classic methods for creating a human shape model register a surface template mesh to a database of 3D scans and use dimensionality reduction techniques, such as Principal Component Analysis, to learn a compact representation. While these shape models enable global shape modifications by correlating anthropometric measurements with the learned subspace, they only provide limited localized shape control. We instead register a volumetric anatomical template, consisting of skeleton bones and soft tissue, to the surface scans of the CAESAR database. We further enlarge our training data to the full Cartesian product of all skeletons and all soft tissues using physically plausible volumetric deformation transfer. This data is then used to learn an anatomically constrained volumetric human shape model in a self‐supervised fashion. The resulting TailorMe model enables shape sampling, localized shape manipulation, and fast inference from given surface scans.
Related Data-driven Modeling of Skin and Muscle Deformation · Efficient and Robust Skin Slide Simulation · Capturing and Animating Skin Deformation in Human Motion · OSSO: Obtaining Skeletal Shape from Outside
how to read this
- Category
- Method / model: anatomically constrained volumetric human shape model
- Contributions
- Registers a volumetric anatomical template of skeleton bones and soft tissue to surface scans of the CAESAR database
- Enlarges training data via physically plausible volumetric deformation transfer across the Cartesian product of skeletons and soft tissues
- Learns the TailorMe model self-supervised, enabling shape sampling, localized shape manipulation, and fast inference from scans
- Context
- Combines anatomical-template ideas from Dicko et al.'s Anatomy Transfer and Kadlecek et al.'s personalized anatomical models with the surface shape-space tradition of Loper et al.'s SMPL, swapping global PCA control for an anatomically grounded volumetric one. Builds on: Anatomy Transfer · Reconstructing Personalized Anatomical Models for Physics-based Body Animation · SMPL: A Skinned Multi-Person Linear Model
- Correctness
- Assumes the synthesized skeleton-by-soft-tissue combinations are physically plausible enough to train on; demonstrated on CAESAR-derived data with localized control and fast scan fitting, but augmented data realism and anatomical accuracy versus real subjects are the caveats to keep in mind.
- Clarity
- Clearly motivated against PCA shape models; a first pass conveys the anatomical-template idea, a second pass is needed for the registration and deformation-transfer details.
- How to read it
- Focus on the volumetric template and the deformation-transfer data augmentation that make localized control possible; a second pass on registration and the self-supervised objective is worthwhile if building or fitting body models.
Muscles / Skinning
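The data-enlargement step in the TailorMe entry above (the full Cartesian product of all skeletons and all soft tissues) can be pictured with a small sketch. The identifiers are hypothetical, and `combine` is only a placeholder for the volumetric deformation transfer, which is not implemented here.

```python
from itertools import product

# Hypothetical identifiers standing in for components registered from scans.
skeletons = ["skel_A", "skel_B", "skel_C"]
soft_tissues = ["tissue_A", "tissue_B", "tissue_C"]


def combine(skeleton, tissue):
    """Placeholder for volumetric deformation transfer (not implemented here)."""
    return (skeleton, tissue)


# Every skeleton is paired with every soft-tissue layer, so N scans
# yield N * N candidate training bodies.
augmented = [combine(s, t) for s, t in product(skeletons, soft_tissues)]
print(len(skeletons), "scans ->", len(augmented), "training bodies")
```

The quadratic growth is what makes the augmentation attractive, and also why the entry flags the plausibility of the synthesized pairs as the main caveat.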
-
Conditional autoregressive motion diffusion model enabling real-time diverse character animation from high-level user control with a single unified model.
abstract
We present a novel character control framework that effectively utilizes motion diffusion probabilistic models to generate high-quality and diverse character animations, responding in real-time to a variety of dynamic user-supplied control signals. At the heart of our method lies a transformer-based Conditional Autoregressive Motion Diffusion Model (CAMDM), which takes as input the character’s historical motion and can generate a range of diverse potential future motions conditioned on high-level, coarse user control. To meet the demands for diversity, controllability, and computational efficiency required by a real-time controller, we incorporate several key algorithmic designs. These include separate condition tokenization, classifier-free guidance on past motion, and heuristic future trajectory extension, all designed to address the challenges associated with taming motion diffusion probabilistic models for character control. As a result, our work represents the first model that enables real-time generation of high-quality, diverse character animations based on user interactive control, supporting animating the character in multiple styles with a single unified model. We evaluate our method on a diverse set of locomotion skills, demonstrating the merits of our method over existing character controllers.
Related Human Motion Diffusion Model · Learned Motion Matching · Robust Motion In-Betweening · Real-Time Diverse Motion In-Betweening with Space-Time Control
how to read this
- Category
- Method: real-time diffusion-based interactive character control
- Contributions
- A transformer-based Conditional Autoregressive Motion Diffusion Model (CAMDM) generating diverse futures from historical motion and coarse user control
- Algorithmic designs for real-time use: separate condition tokenization, classifier-free guidance on past motion, and heuristic future trajectory extension
- A single unified model enabling real-time, diverse, multi-style animation from interactive control
- Context
- Adapts motion diffusion models, in the lineage of Tevet et al.'s Human Motion Diffusion Model, from offline text-to-motion generation into an autoregressive real-time interactive controller. Builds on: Human Motion Diffusion Model
- Correctness
- Hinges on the claim that the listed designs make diffusion fast and controllable enough for real-time interaction; presented as the first such real-time diverse controller and evaluated on a set of locomotion skills, so consult the paper for evaluation datasets, latency, and diversity-versus-control tradeoffs.
- Clarity
- Accessible framing with named components; a first pass conveys the controller design, a second pass is needed for the autoregressive diffusion formulation and guidance scheme.
- How to read it
- Focus on the three real-time design choices (condition tokenization, CFG on past motion, trajectory extension) and how autoregression bridges diffusion and interactivity; a second pass on the model and timing is worth it for runtime work.
Motion Synthesis
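For readers new to the guidance term named in the CAMDM entry above, here is a minimal sketch of generic classifier-free guidance. It assumes the usual formulation (blend a conditional and an unconditional prediction with a scale factor); how the paper applies it to past motion specifically is not shown, and the vectors below are made-up stand-ins for denoiser outputs.

```python
def classifier_free_guidance(pred_cond, pred_uncond, scale):
    """Blend conditional and unconditional predictions, element by element.

    scale = 0 gives the unconditional output, scale = 1 the conditional one,
    and scale > 1 extrapolates past the conditional prediction.
    """
    return [u + scale * (c - u) for c, u in zip(pred_cond, pred_uncond)]


# Hypothetical denoiser outputs for one small pose vector.
with_past = [1.0, 2.0, 3.0]      # conditioned on past motion
without_past = [0.0, 2.0, 4.0]   # past-motion condition dropped

guided = classifier_free_guidance(with_past, without_past, scale=2.0)
print(guided)
```

The scale acts as a dial between following the condition closely and keeping the variety of the unconditional model, which is the diversity-versus-control tradeoff the entry mentions.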
-
Grin Machine and Tumblehead animators present a behind-the-scenes look at character rigging using Houdini 20.5's APEX system for a short film, demonstrating KineFX-based production workflows.
abstract
Grin Machine and Tumblehead animators present the behind-the-scenes of their stylized short film Turbulence, built almost entirely in Houdini. Magnus covers rigging stylized cartoon characters with the new APEX system and KineFX, recreating his Maya face and eye rigs as reusable HDAs, building a blendshape system, and adding procedural CHOPs-based overlap and jiggle so animators can work on stepped poses and dial in overshoot and delay. Chris details a sweat-generator HDA using scatter, a point-deform-in-solver trick and POP nets for dripping droplets, plus new Copernicus COPs for procedural wet maps and tile-pattern seat textures read live into Karma XPU shaders via op: paths. The talk also shows Vellum CFX run inside Solaris to fix face intersections, the new wrinkle and sculpting SOPs, and procedural VDB cloud shaders, all assembled through a USD pipeline with Solaris look-dev HDAs and a shot builder.
Related Designing Feathers Using Houdini at FOLKS | Amelie Goursat | Paris HIVE 2023 · CFX Cloth and Hair at Unit Image · Automation of Creature FX in a Small Studio Pipeline · USD and Scene Interoperability: Demystifying the State of the Art
how to read this
- Category
- Production talk / film breakdown (Houdini rigging and effects)
- Contributions
- Shows stylized cartoon rigging in Houdini's APEX and KineFX: reusable face/eye rigs as HDAs, a blendshape system, and CHOPs-based overlap and jiggle
- Demonstrates an effects pipeline: a sweat-generator HDA, POP-net droplets, Copernicus COPs wet maps and tile textures read live into Karma XPU via op: paths
- Walks through Vellum CFX inside Solaris for face intersections, new wrinkle and sculpting SOPs, and VDB cloud shaders assembled in a USD pipeline
- Context
- A production-workflow talk extending the KineFX-for-games procedural-rigging direction (Kruel's Houdini HIVE session) into a full stylized short-film pipeline on APEX, Solaris, and USD. Builds on: KineFX for Games | Luiz Kruel | Houdini 18.5 HIVE
- Correctness
- Studio practice, not peer-reviewed; the techniques are production-proven on the short film Turbulence and are tied to specific Houdini tools, so transferability depends on that software stack rather than general validation.
- Clarity
- Accessible as a tool-driven breakdown; one viewing conveys the approach, with rewatching useful only to capture specific node setups.
- How to read it
- Watch for the reusable-HDA and procedural-secondary-motion patterns (CHOPs overlap/jiggle, sweat/droplet networks, live op: textures); revisit individual segments when replicating a setup in Houdini, not for theory.
Rigging
-
PULSE learns a universal motion imitator from a large unstructured mocap dataset, then distills it into a comprehensive physics-based motion representation.
abstract
We present a universal motion representation that encompasses a comprehensive range of motor skills for physics-based humanoid control. Due to the high dimensionality of humanoids and the inherent difficulties in reinforcement learning, prior methods have focused on learning skill embeddings for a narrow range of movement styles (e.g. locomotion, game characters) from specialized motion datasets. This limited scope hampers their applicability in complex tasks. We close this gap by significantly increasing the coverage of our motion representation space. To achieve this, we first learn a motion imitator that can imitate all of human motion from a large, unstructured motion dataset. We then create our motion representation by distilling skills directly from the imitator. This is achieved by using an encoder-decoder structure with a variational information bottleneck. Additionally, we jointly learn a prior conditioned on proprioception (humanoid's own pose and velocities) to improve model expressiveness and sampling efficiency for downstream tasks. By sampling from the prior, we can generate long, stable, and diverse human motions. Using this latent space for hierarchical RL, we show that our policies solve tasks using human-like behavior. We demonstrate the effectiveness of our motion representation by solving generative tasks (e.g.
Related Perpetual Humanoid Control for Real-time Simulated Avatars · CLoSD: Closing the Loop between Simulation and Diffusion for Multi-Task Character Control · CALM: Conditional Adversarial Latent Models for Directable Virtual Characters · SAME: Skeleton-Agnostic Motion Embedding for Character Animation
how to read this
- Category
- Method / representation: universal motion latent space for physics-based control
- Contributions
- Learns a motion imitator that can reproduce all of human motion from a large, unstructured mocap dataset
- Distills skills from the imitator into a universal motion representation via an encoder-decoder with a variational information bottleneck
- Jointly learns a proprioception-conditioned prior enabling long, stable, diverse motion generation and hierarchical RL on downstream tasks
- Context
- Builds directly on Luo et al.'s Perpetual Humanoid Control, broadening prior narrow, style-specific skill embeddings into a comprehensive universal representation (PULSE). Builds on: Perpetual Humanoid Control for Real-time Simulated Avatars
- Correctness
- Assumes a strong full-coverage imitator can be distilled into a reusable latent without losing skill breadth; demonstrated for downstream hierarchical RL control, though the abstract is truncated, so check the paper for which tasks, coverage limits, and failure modes apply.
- Clarity
- Conceptually dense (VAE bottleneck, distillation, hierarchical RL); a first pass conveys the imitate-then-distill idea, later passes are needed for the latent-space and prior details.
- How to read it
- Focus on the two-stage imitate-then-distill design and the role of the variational bottleneck and proprioceptive prior; a second pass on the representation pays off if reusing it as a downstream control prior.
Motion Synthesis / Retargeting
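The PULSE entry above names a variational information bottleneck and a jointly learned prior without giving equations. The sketch below shows the textbook pieces such a design usually involves, assuming diagonal Gaussians: a reparameterised latent sample and a KL penalty between the encoder distribution and the prior. All names and numbers are hypothetical, and the paper's actual losses may differ.

```python
import math
import random


def sample_latent(mean, std, rng):
    """Reparameterised sample z = mean + std * noise, one value per dimension."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mean, std)]


def kl_diag_gaussians(mean_q, std_q, mean_p, std_p):
    """KL(q || p) between diagonal Gaussians, summed over dimensions."""
    total = 0.0
    for mq, sq, mp, sp in zip(mean_q, std_q, mean_p, std_p):
        total += math.log(sp / sq) + (sq**2 + (mq - mp) ** 2) / (2.0 * sp**2) - 0.5
    return total


# Hypothetical encoder output (sees the imitation target) and prior output
# (sees only proprioception), each a mean and std per latent dimension.
enc_mean, enc_std = [0.5, -0.2], [0.3, 0.4]
prior_mean, prior_std = [0.0, 0.0], [1.0, 1.0]

z = sample_latent(enc_mean, enc_std, random.Random(0))
penalty = kl_diag_gaussians(enc_mean, enc_std, prior_mean, prior_std)
print(z, penalty)
```

The penalty is zero only when the two distributions match, so minimising it pulls the encoder toward what the prior can predict from the character's own state.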
-
In-depth look at how LEGO Fortnite characters were rigged and animated in UE5.4, and how Motion Matching was used to reinvent Fortnite Battle Royale locomotion.
Motion Synthesis / Rigging / Retargeting
-
Vector-quantized periodic autoencoder learns a shared phase manifold across different morphologies for unsupervised motion alignment and transfer.
abstract
We present a new approach for understanding the periodicity structure and semantics of motion datasets, independently of the morphology and skeletal structure of characters. Unlike existing methods using an overly sparse high-dimensional latent, we propose a phase manifold consisting of multiple closed curves, each corresponding to a latent amplitude. With our proposed vector quantized periodic autoencoder, we learn a shared phase manifold for multiple characters, such as a human and a dog, without any supervision. This is achieved by exploiting the discrete structure and a shallow network as bottlenecks, such that semantically similar motions are clustered into the same curve of the manifold, and the motions within the same component are aligned temporally by the phase variable. In combination with an improved motion matching framework, we demonstrate the manifold’s capability of timing and semantics alignment in several applications, including motion retrieval, transfer and stylization. Code and pre-trained models for this paper are available at peizhuoli.github.io/walkthedog.
Related Real-Time Diverse Motion In-Betweening with Space-Time Control · SAME: Skeleton-Agnostic Motion Embedding for Character Animation · Dog Code: Human to Quadruped Embodiment Using Shared Codebooks · Automated Extraction and Parameterization of Motions in Large Data Sets
how to read this
- Category
- Method: unsupervised cross-morphology motion alignment via phase manifolds
- Contributions
- Proposes a phase manifold of multiple closed curves (one per latent amplitude) to capture motion periodicity and semantics independent of skeleton
- A vector-quantized periodic autoencoder that learns a shared phase manifold across morphologies (e.g. human and dog) without supervision
- Combines the manifold with an improved motion matching framework for timing/semantics-aligned retrieval, transfer, and stylization
- Context
- Synthesizes the phase-manifold idea of Starke et al.'s DeepPhase with the cross-character retargeting goals of Aberman et al.'s Skeleton-Aware Networks, replacing a sparse high-dimensional latent with a quantized periodic one shared across morphologies. Builds on: Skeleton-Aware Networks for Deep Motion Retargeting · DeepPhase: Periodic Autoencoders for Learning Motion Phase Manifolds
- Correctness
- Relies on the assumption that a discrete bottleneck and shallow network force semantically similar motions onto the same curve; demonstrated unsupervised on human and dog data across retrieval, transfer, and stylization, with code released, though alignment quality for highly dissimilar morphologies is the open question.
- Clarity
- The idea is elegantly motivated but technically involved; a first pass conveys the shared-manifold concept, and a second pass is needed for the VQ periodic autoencoder.
- How to read it
- Focus on the phase-manifold structure and why vector quantization plus a shallow bottleneck drives alignment; a second pass on the autoencoder and motion matching is worthwhile, and code is available to inspect.
Retargeting / Motion Synthesis
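To make the "multiple closed curves, one per latent amplitude" idea concrete, here is a minimal sketch. It assumes the simplest possible setting: a scalar amplitude snapped to a small codebook and a circle as the closed curve. The codebook values, function names, and sample inputs are all hypothetical and are not taken from the paper or its released code.

```python
import math

# Hypothetical codebook: one scalar amplitude (radius) per closed curve.
CODEBOOK = [0.5, 1.0, 2.0]


def quantize(amplitude, codebook=CODEBOOK):
    """Snap a raw amplitude to the nearest codebook entry; return (index, value)."""
    index = min(range(len(codebook)), key=lambda i: abs(codebook[i] - amplitude))
    return index, codebook[index]


def manifold_point(amplitude, phase, codebook=CODEBOOK):
    """Place a sample on its curve: radius from the codebook, angle from the phase."""
    index, radius = quantize(amplitude, codebook)
    angle = 2.0 * math.pi * (phase % 1.0)
    return index, (radius * math.sin(angle), radius * math.cos(angle))


def phase_distance(p, q):
    """Circular distance between two phases in [0, 1); the result lies in [0, 0.5]."""
    d = abs(p - q) % 1.0
    return min(d, 1.0 - d)


# Two made-up encoder outputs (amplitude, phase) from different characters.
human_walk = (0.93, 0.10)
dog_trot = (1.08, 0.90)
human_curve, _ = manifold_point(*human_walk)
dog_curve, _ = manifold_point(*dog_trot)
same_curve = human_curve == dog_curve  # both snap to the radius-1.0 curve
offset = phase_distance(human_walk[1], dog_trot[1])  # timing gap on that curve
```

In this toy setting, landing on the same curve stands in for semantic similarity, and the circular phase distance stands in for temporal alignment within that curve.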
-
Curvenet-based wig rig that shares and reuses hair grooms across Emotion characters of different shapes using surface and volumetric deformations.
abstract
In Pixar’s feature animation Inside Out 2 (2024), emotion characters are identified with their corresponding human characters by exhibiting similar wigs. To achieve this look, we developed a custom rig that assists the sharing and reuse of hair grooms between characters of different shapes, feature proportions, and mesh connectivities. Our approach starts by adopting curvenets as a lightweight representation of scalp surfaces that eases the registration from human to emotion models by detaching the groom setup from the underlying mesh discretization. We then implemented a mix of surface-based and volumetric deformations that warp hair shells and guide curves onto the new character’s scalp defined by the refit curvenet. Finally, we incorporated a shaping tool for editing the wig layout controlled by additional curvenets that profile each hair shell.
Related Sketch to Pose in Pixar's Presto Animation System · Space Rangers with Cornrows: Methods for Modeling Braids and Curls in Pixar's Groom Pipeline · Shaping the Elements: Curvenet Animation Controls in Pixar's Elemental · Stable and Efficient Differential IK
how to read this
- Category
- Production method / talk (Pixar hair-groom reuse rig)
- Contributions
- A custom rig that shares and reuses hair grooms across characters of different shapes, proportions, and mesh connectivities
- Adopts curvenets as a lightweight scalp representation to detach groom setup from mesh discretization and ease human-to-emotion registration
- Mixes surface-based and volumetric deformations to warp hair shells and guide curves onto the refit scalp, plus a curvenet-driven shaping tool for layout
- Context
- A production technique from Inside Out 2 extending the style-guide hair-grooming direction of Montell et al.'s Hair Emoting in Turning Red toward cross-character groom sharing. Builds on: Hair Emoting with Style Guides in Turning Red
- Correctness
- Studio practice, production-proven on the film rather than benchmarked; the curvenet abstraction assumes scalp grooms can be cleanly detached from mesh connectivity, so applicability outside similar stylized character setups is the caveat.
- Clarity
- Accessible and concrete; a first pass conveys the curvenet-plus-deformation pipeline, with details kept practical.
- How to read it
- Focus on the curvenet representation and the surface-plus-volumetric warp that enables groom transfer; a single careful pass suffices unless you are implementing groom refitting, then revisit the deformation specifics.
CFX / Rigging
-
Framestore Animation Supervisor details the performance capture pipeline, facial sculpting, and scale-bridging animation process to transpose Hugh Grant's performance onto the Oompa Loompa in Wonka.
Retargeting / Facial
2023
78 entries
-
Edge-convolution network extracts a skeleton and rigid weights, smooths those weights by diffusion, then deforms the source mesh into the reference pose.
abstract
For 3D mesh pose transfer, the target model is obtained by transferring the pose of the reference mesh to the source mesh, where the shape and pose of the source are usually different from those of the reference. In this paper, pose transfer is considered as a deformation process of the source mesh, and we propose a 3D mesh pose transfer method based on skeletal deformation. First, we design a neural network based on the edge convolution operator to extract the skeleton of the 3D mesh and bind the rigid weights; then, we calculate the bone transformations between the two skeletons with different poses and use the diffusion equation to smooth the rigid weights; finally, the source mesh is deformed according to the bone transformations and the smooth weights to get the target mesh. Experimental results on different datasets show that the pose of the reference mesh can be effectively transferred to the source one while maintaining the shape and high-quality geometric details of the source mesh by using our method.
Related Delta Mush: Smoothing Deformations While Preserving Detail · Implicit Skinning: Real-Time Skin Deformation with Contact Modeling · Skinning: Real-time Shape Deformation · Expressive Body Capture: 3D Hands, Face, and Body from a Single Image
how to read this
- Category
- Method: a skeleton-based mesh pose-transfer algorithm
- Contributions
- An edge-convolution network that extracts a 3D mesh skeleton and binds rigid skinning weights
- A diffusion-equation step that smooths the rigid weights before deformation
- Pose transfer cast as a deformation driven by computed bone transformations plus smoothed weights, preserving source shape and detail
- Context
- Relates to classic deformation-driven transfer such as Sumner's Deformation Transfer for Triangle Meshes, recasting the problem through learned skeleton extraction and skinning rather than direct correspondence. Builds on: Deformation Transfer for Triangle Meshes
- Correctness
- Demonstrated on different datasets showing pose is transferred while source shape and geometric detail are kept; the skeletal/rigid-weight assumption may limit fidelity for highly non-rigid or topology-mismatched cases, which a reader should keep in mind.
- Clarity
- Reasonably accessible; a first pass gives the three-stage pipeline, a second pass is needed for the edge-convolution and diffusion details.
- How to read it
- Trace the pipeline stage by stage (skeleton extraction, weight smoothing, deformation); a second pass on the diffusion smoothing pays off if weight quality matters to you.
Skinning / Retargeting
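The last two stages of the pipeline above (smooth the rigid weights, then deform with bone transforms) can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the diffusion is a plain explicit averaging step over a vertex adjacency list, the skinning is 2D linear blend skinning, and every name and parameter value is hypothetical.

```python
import math


def diffuse_weights(weights, neighbors, step=0.25, iterations=10):
    """Explicit diffusion of per-vertex bone weights over a mesh graph.

    weights[v][b] is the influence of bone b on vertex v;
    neighbors[v] lists the vertices adjacent to v (each needs at least one).
    """
    current = [row[:] for row in weights]
    for _ in range(iterations):
        updated = []
        for v, row in enumerate(current):
            adjacent = neighbors[v]
            new_row = []
            for b, w in enumerate(row):
                mean = sum(current[n][b] for n in adjacent) / len(adjacent)
                new_row.append(w + step * (mean - w))  # move toward neighbor mean
            updated.append(new_row)
        current = updated
    # Renormalize so each vertex's weights sum to one despite rounding.
    return [[w / sum(row) for w in row] for row in current]


def skin_vertex(position, bone_transforms, vertex_weights):
    """2D linear blend skinning; each bone transform is (angle, tx, ty)."""
    x, y = position
    out_x = out_y = 0.0
    for (angle, tx, ty), w in zip(bone_transforms, vertex_weights):
        c, s = math.cos(angle), math.sin(angle)
        out_x += w * (c * x - s * y + tx)
        out_y += w * (s * x + c * y + ty)
    return out_x, out_y


# Six vertices in a chain: the first three bound rigidly to bone 0, the rest to bone 1.
rigid = [[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 3
chain = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
smooth = diffuse_weights(rigid, chain)
# Bone 0 stays put and bone 1 translates by (2, 0); vertex 2 moves part of the way.
moved = skin_vertex((1.0, 0.0), [(0.0, 0.0, 0.0), (0.0, 2.0, 0.0)], smooth[2])
```

With purely rigid weights, vertex 2 would not move at all; after diffusion it picks up some influence from bone 1, which is what removes the hard seam at the joint.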
- ACE: Adversarial Correspondence Embedding for Cross Morphology Motion Retargeting from Human to Nonhuman Characters SIGGRAPH Asia Academic 30 cites
Adversarial correspondence learning retargets human motion to nonhuman characters with different body structure and proportions.
abstract
Motion retargeting is a promising approach for generating natural and compelling animations for nonhuman characters. However, it is challenging to translate human movements into semantically equivalent motions for target characters with different morphologies due to the ambiguous nature of the problem. This work presents a novel learning-based motion retargeting framework, Adversarial Correspondence Embedding (ACE), to retarget human motions onto target characters with different body dimensions and structures. Our framework is designed to produce natural and feasible character motions by leveraging generative-adversarial networks (GANs) while preserving high-level motion semantics by introducing an additional feature loss. In addition, we pretrain a character motion prior that can be controlled in a latent embedding space and seek to establish a compact correspondence. We demonstrate that the proposed framework can produce retargeted motions for three different characters, a quadrupedal robot with a manipulator, a crab character, and a wheeled manipulator. We further validate the design choices of our framework by conducting baseline comparisons and a user study. We also showcase sim-to-real transfer of the retargeted motions by transferring them to a real Spot robot.
Related Skeleton-Aware Networks for Deep Motion Retargeting · WalkTheDog: Cross-Morphology Motion Alignment via Phase Manifolds · Normalized Euclidean Distance Matrices for Human Motion Retargeting · Dog Code: Human to Quadruped Embodiment Using Shared Codebooks
how to read this
- Category
- Method: a cross-morphology motion retargeting framework
- Contributions
- Adversarial Correspondence Embedding (ACE) retargeting human motion onto characters with different body structure and proportions
- A GAN-based generator with an added feature loss to preserve high-level motion semantics
- A pretrained, latent-controllable character motion prior used to establish compact correspondence
- Context
- Builds on deep retargeting work such as Aberman's Skeleton-Aware Networks, extending beyond skeleton matching to non-human morphologies via adversarial correspondence. Builds on: Skeleton-Aware Networks for Deep Motion Retargeting
- Correctness
- Validated on three target characters (a quadrupedal robot with manipulator, a crab, and a wheeled manipulator) plus baseline comparisons and a user study; the cross-morphology mapping is inherently ambiguous, so results depend on the chosen characters and perceptual judging.
- Clarity
- Moderately accessible; a first pass conveys the adversarial-plus-semantic-loss idea, a second pass is needed for the embedding and prior training.
- How to read it
- Focus on how correspondence is learned adversarially and what the feature loss preserves; a second pass on the motion prior is worth it if you target a new character morphology.
Retargeting
- AddBiomechanics: Automating model scaling, inverse kinematics, and inverse dynamics from human motion data through sequential optimization PLOS ONE Academic 48 cites
Open-source cloud service that automates OpenSim skeleton model scaling, marker registration, inverse kinematics, and inverse dynamics from mocap data via bilevel optimization.
abstract
Creating large-scale public datasets of human motion biomechanics could unlock data-driven breakthroughs in our understanding of human motion, neuromuscular diseases, and assistive devices. However, the manual effort currently required to process motion capture data and quantify the kinematics and dynamics of movement is costly and limits the collection and sharing of large-scale biomechanical datasets. We present a method, called AddBiomechanics, to automate and standardize the quantification of human movement dynamics from motion capture data. We use linear methods followed by a non-convex bilevel optimization to scale the body segments of a musculoskeletal model, register the locations of optical markers placed on an experimental subject to the markers on a musculoskeletal model, and compute body segment kinematics given trajectories of experimental markers during a motion. We then apply a linear method followed by another non-convex optimization to find body segment masses and fine tune kinematics to minimize residual forces given corresponding trajectories of ground reaction forces.
Related Muscles in Time: Learning to Understand Human Motion by Simulating Muscle Activations · BioPose: Biomechanically-accurate 3D Pose Estimation from Monocular Videos · From Skin to Skeleton: Towards Biomechanically Accurate 3D Digital Humans · Automated Extraction and Parameterization of Motions in Large Data Sets
how to read this
- Category
- Tool / pipeline: automated biomechanics processing service
- Contributions
- An automated, standardized pipeline for model scaling, inverse kinematics, and inverse dynamics from motion capture data
- A linear-then-nonconvex bilevel optimization that scales musculoskeletal body segments and registers experimental markers to model markers
- A second linear-then-optimization stage that recovers segment masses and tunes kinematics to minimize residual forces given ground reaction forces
- Context
- Relates to biomechanically accurate digital-human modeling such as Keller's SKEL, packaging OpenSim-style scaling, IK, and ID into an automated cloud service to enable large-scale dataset sharing. Builds on: From Skin to Skeleton: Towards Biomechanically Accurate 3D Digital Humans
- Correctness
- Built on standard musculoskeletal modeling with non-convex bilevel optimization; results depend on marker and ground-reaction-force data quality, and residual-force minimization is an approximation, so dynamics estimates should be treated as model-dependent.
- Clarity
- Accessible at the goal level for a first pass; the bilevel optimization formulation needs a second, careful pass.
- How to read it
- First pass for the pipeline stages and what each optimization solves; do a second pass on the bilevel optimization and residual handling if you plan to process your own mocap.
Muscles / Retargeting
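The phrase "minimize residual forces given ground reaction forces" is easier to follow with a toy case. The sketch below reduces the whole body to a single point mass at the center of mass, so that the best-fitting mass has a closed-form least-squares solution. This is a simplifying assumption made here for illustration; AddBiomechanics itself works with per-segment masses and a non-convex optimization, and the function names and y-up gravity convention are hypothetical.

```python
GRAVITY = (0.0, -9.81, 0.0)  # m/s^2, y-up; an assumed convention


def residual_force(mass, com_accel, grf):
    """Force Newton's second law cannot explain: m * (a - g) - F_grf, per axis."""
    return tuple(mass * (a - g) - f for a, g, f in zip(com_accel, GRAVITY, grf))


def fit_mass(com_accels, grfs):
    """Least-squares mass that minimizes the summed squared residual over all frames."""
    numerator = denominator = 0.0
    for accel, grf in zip(com_accels, grfs):
        for a, g, f in zip(accel, GRAVITY, grf):
            numerator += f * (a - g)
            denominator += (a - g) ** 2
    return numerator / denominator


# Synthetic frames generated from a 70 kg body, so the fit should recover about 70.
accels = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.5, -2.0, 0.0)]
forces = [tuple(70.0 * (a - g) for a, g in zip(acc, GRAVITY)) for acc in accels]
mass = fit_mass(accels, forces)
leftover = [residual_force(mass, acc, f) for acc, f in zip(accels, forces)]
```

On real data the residual never reaches zero, which is why the entry above advises treating the recovered dynamics as model-dependent.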
- An Implicit Physical Face Model Driven by Expression and Style SIGGRAPH Asia Disney Research 6 cites
Implicit neural face model coupling expression controls with physics simulation, enabling expression-driven and style-varied realistic facial deformation.
abstract
3D facial animation is often produced by manipulating facial deformation models (or rigs) that are traditionally parameterized by expression controls. A key component that is usually overlooked is expression “style”, as in, how a particular expression is performed. Although it is common to define a semantic basis of expressions that characters can perform, most characters perform each expression in their own style. To date, style is usually entangled with the expression, and it is not possible to transfer the style of one character to another when considering facial animation. We present a new face model, based on a data-driven implicit neural physics model, that can be driven by both expression and style separately. At the core, we present a framework for learning implicit physics-based actuations for multiple subjects simultaneously, trained on a few arbitrary performance capture sequences from a small set of identities. Once trained, our method allows generalized physics-based facial animation for any of the trained identities, extending to unseen performances. Furthermore, it grants control over the animation style, enabling style transfer from one character to another or blending styles of different characters.
Related Learning a Generalized Physical Face Model From Data · Phace: Physics-based Face Modeling and Animation · BlendForces: A Dynamic Framework for Facial Animation · High-Quality Face Capture Using Anatomical Muscles
how to read this
- Category
- Method: an implicit physics-based face model
- Contributions
- A data-driven implicit neural physics face model driven separately by expression and by style
- A framework for learning implicit physics-based actuations for multiple subjects simultaneously from a few capture sequences
- Generalized physics-based facial animation across trained identities, with style transfer and blending between characters
- Context
- Builds on physically enriched facial rigs such as Kozlov's Enriching Facial Blendshape Rigs with Physical Simulation, disentangling expression from performance style within a learned implicit physics model. Builds on: Enriching Facial Blendshape Rigs with Physical Simulation
- Correctness
- Trained on a few arbitrary performance-capture sequences from a small identity set and claimed to extend to unseen performances; generalization beyond the trained identities and the limited training data is the key caveat a reader should keep in mind.
- Clarity
- Conceptually accessible on the expression-versus-style split; the implicit physics and actuation learning warrant a second pass.
- How to read it
- Focus on how expression and style are disentangled and what the implicit actuation represents; a second or third pass on the physics learning is worthwhile if you work on facial rigs or style transfer.
Facial / Muscles
-
High-fidelity human torso model coupling finite-element and rigid-body simulation of facets, ligaments, and intervertebral discs to recover torso bone movement from sparse-marker mocap.
abstract
Many existing digital human models approximate the human skeletal system using rigid bodies connected by rotational joints. While the simplification is considered acceptable for legs and arms, it lacks the fidelity needed to model rich torso movements in common activities such as dancing, yoga, and various sports. Research from biomechanics provides more detailed modeling for parts of the torso, but their models often operate in isolation and are not fast and robust enough to support computationally heavy applications and large-scale data generation for full-body digital humans. This paper proposes a new torso model that aims to achieve high fidelity both in perception and in functionality, while being computationally feasible for simulation and optimal control tasks. We build a detailed human torso model consisting of various anatomical components, including facets, ligaments, and intervertebral discs, by coupling efficient finite-element and rigid-body simulations. Given an existing motion capture sequence without dense markers placed on the torso, our new model is able to recover the underlying torso bone movements. Our method is robust enough that it can be used to automatically "retrofit" the entire Mixamo motion database of highly diverse human motions without user intervention.
Related Simulation of Hand Anatomy Using Medical Imaging · Hand Modeling and Simulation Using Stabilized Magnetic Resonance Imaging · Shape Targeting: A Versatile Active Elasticity Constitutive Model · EMU: Efficient Muscle Simulation in Deformation Space
how to read this
- Category
- Method: an anatomically detailed torso simulation model
- Contributions
- A detailed human torso model with facets, ligaments, and intervertebral discs coupling finite-element and rigid-body simulation
- Recovery of underlying torso bone movement from motion capture without dense torso markers
- A model fast and robust enough for simulation, optimal control, and large-scale data generation
- Context
- Builds on anatomically based body modeling such as Saito's Computational Bodybuilding, adding higher-fidelity coupled torso anatomy aimed at rich movements like dancing, yoga, and sports. Builds on: Computational Bodybuilding: Anatomically-Based Modeling of Human Bodies
- Correctness
- Demonstrated by recovering torso bone movement from existing sparse-marker mocap with a coupled FEM/rigid-body solver; fidelity depends on the anatomical modeling assumptions, and validation is shown on motion recovery rather than against in-vivo ground truth.
- Clarity
- The motivation and components are accessible; the coupled FEM/rigid-body formulation requires a careful second pass.
- How to read it
- First pass for which anatomical components are modeled and why rigid-joint torsos fall short; a second pass on the FEM/rigid coupling pays off if you need torso simulation or motion recovery.
Muscles
-
Santa Monica Studio animation leads share quickfire breakdowns covering cinematic storytelling, dismemberment systems, and directorial challenges across God of War Ragnarok's character animation.
Retargeting / Rigging
-
JALI Research masterclass on facial muscle anatomy, subtext observation, rigging principles, and real-time JALI demonstrations showing how accurate lip sync improves character performance in games.
Facial
-
EA retrospective on five years of shipping motion matching across multiple titles, covering system evolution, production lessons, and how the technique was adapted for diverse game types.
Motion Synthesis
-
Introduces Weta FX's Anatomically Plausible Facial System, an evolution of FACS controls that replicates actor facial performance for Avatar: The Way of Water beyond previous techniques.
Facial
-
Production talk on cloth, hair, and coupled simulation for Avatar 2, covering solver integration and artist workflow at Weta FX.
abstract
This talk presents CreLoki, an extension to the multi-physics framework Loki. It enables unified creatures physics, such as hair and cloth in wet, dry, and underwater contexts with predefined coupling modes. By maintaining a solver setup configuration that is general enough for every foreseeable use case, CreLoki avoids the most tedious and error-prone steps of scene configuration. CreLoki also offers users a familiar interface via an Autodesk Maya plugin without compromising quality, customizability, or extendability. We found that this tool encourages a broader adoption of unified physics among creature artists.
Related Scriptable Character FX Solution · Choreography of Hair and Cloth in Disney's Moana 2 · Anisotropic Elastoplasticity for Cloth, Knit and Hair Frictional Contact · Simulating Wind Effects on Cloth and Hair in Disney's Frozen
how to read this
- Category
- Production talk: coupled creature simulation pipeline
- Contributions
- Demonstrates CreLoki, an extension to the Loki multi-physics framework for unified creature physics across wet, dry, and underwater contexts
- Shows predefined coupling modes and a general solver setup that avoids tedious, error-prone scene configuration
- Presents a familiar Autodesk Maya plugin interface intended to broaden adoption of unified physics among creature artists
- Context
- Extends Weta's Loki unified multiphysics framework for production use on Avatar: The Way of Water, focusing on coupled hair and cloth simulation. Builds on: Loki: A Unified Multiphysics Simulation Framework for Production
- Correctness
- Studio practice, not peer-reviewed; results are production-proven on the film, and benefits like reduced setup effort and broader artist adoption are reported from internal use rather than controlled evaluation.
- Clarity
- Accessible talk-level material; a single pass conveys the workflow and motivation.
- How to read it
- Read once for the coupling modes and the artist-workflow integration via the Maya plugin; revisit only for the solver configuration ideas if you build production simulation tooling.
CFX
-
ML-driven character deformation pipeline used in Avatar 2 production, combining optimization-based and learned approaches for hero characters.
abstract
We present Bodyopt, a character skin deformation framework developed for Avatar: The Way of Water. Our approach aims to learn the skin deformations from a given dataset and reproduce them reliably during shot production. In conjunction with the kinematic skeleton, we employ muscle fibers as an additional anatomical basis, where their length changes serve as a parametrization for the non-linear deformation components. We provide a novel way of curating the dataset to minimize differences between similar poses, which would otherwise lead to a quality loss in the reconstruction. Our approach also handles runtime skin dynamics and provides utilities for artists to transfer deformations to new character types, as well as extra modifiers for secondary motions like breathing. Additionally, we close the gap between final skin deformation and the representation used in Animation by providing a fast proxy solution that is based on the same input data.
Related Interactive Skeleton-Driven Dynamic Deformations · Physically Based Rigging for Deformable Characters · Data-Driven Physics for Human Soft Tissue Animation · A Neural Network Model for Efficient Musculoskeletal-Driven Skin Deformation
how to read this
- Category
- Production talk: a character deformation pipeline
- Contributions
- Demonstrates Bodyopt, a framework that learns hero-character skin deformations from a dataset and reproduces them in shot production
- Uses muscle-fiber length changes alongside the kinematic skeleton to parametrize non-linear deformation, plus runtime skin dynamics and breathing modifiers
- Provides dataset curation to minimize differences between similar poses and a fast animation-side proxy from the same input data
- Context
- Part of the Avatar: The Way of Water production toolset alongside Weta's Loki framework, combining optimization-based and learned deformation for hero characters. Builds on: Loki: A Unified Multiphysics Simulation Framework for Production
- Correctness
- Studio practice, not peer-reviewed; results are production-proven, and the reported quality gains (for example from pose-difference-minimizing curation) come from production use rather than formal benchmarks.
- Clarity
- Accessible talk-level material; a first pass conveys the muscle-fiber parametrization and proxy idea.
- How to read it
- Read once for the muscle-fiber parametrization, the dataset-curation trick, and the animation proxy; a second look helps if you build ML-driven deformation pipelines.
Skinning / ML Deformation
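As a rough picture of what "muscle-fiber length changes as a parametrization" could mean, the sketch below treats each fiber as a 3D polyline and computes its relative length change between a rest pose and a posed frame. The polyline representation, the relative (rather than absolute) measure, and all names are assumptions made for illustration; the talk does not specify these details.

```python
import math


def fiber_length(points):
    """Arc length of a fiber given as a list of 3D points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def fiber_features(rest_fibers, posed_fibers):
    """Relative length change per fiber: (posed - rest) / rest."""
    features = []
    for rest, posed in zip(rest_fibers, posed_fibers):
        rest_len = fiber_length(rest)
        features.append((fiber_length(posed) - rest_len) / rest_len)
    return features


# Two made-up fibers: the first bends without changing length, the second stretches.
rest = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    [(0.0, 1.0, 0.0), (2.0, 1.0, 0.0)],
]
posed = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)],
    [(0.0, 1.0, 0.0), (3.0, 1.0, 0.0)],
]
features = fiber_features(rest, posed)  # one scalar per fiber
```

A feature vector like this would sit next to the joint angles as input to whatever model predicts the non-linear part of the skin deformation.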
-
Deformable full-body model integrating skin, organs, and bones trained on 300 CT scans using SMPL architecture, achieving 3.6 mm bone and 8.8 mm organ accuracy.
abstract
A virtual anatomical model of a patient can be a valuable tool for enhancing clinical tasks such as workflow automation, patient-specific X-ray dose optimization, markerless tracking, positioning, and navigation assistance in image-guided interventions. For these tasks, it is highly desirable that the patient's surface and internal organs are of high quality for any pose and shape estimate. At present, the majority of statistical shape models (SSMs) are restricted to a small number of organs or bones or do not adequately represent the general population. To address this, we propose a deformable human shape and pose model that combines skin, internal organs, and bones, learned from CT images. By modeling the statistical variations in a pose-normalized space using probabilistic PCA while also preserving joint kinematics, our approach offers a holistic representation of the body that can be beneficial for automation in various medical applications. In an interventional setup, our model could, for example, facilitate automatic system/patient positioning, organ-specific iso-centering, automated collimation or collision prediction. We assessed our model's performance on a registered dataset, utilizing the unified shape space, and noted an average error of 3.6 mm for bones and 8.8 mm for organs.
Related From Skin to Skeleton: Towards Biomechanically Accurate 3D Digital Humans · NIMBLE: A Non-rigid Hand Model with Bones and Muscles · HIT: Estimating Internal Human Implicit Tissues from the Body Surface · Animatable Neural Radiance Fields for Modeling Dynamic Human Bodies
how to read this
- Category
- Method / model: a full-body statistical shape model with internal anatomy
- Contributions
- A deformable human shape and pose model integrating skin, internal organs, and bones, learned from CT images
- Statistical variation modeled with probabilistic PCA in a pose-normalized space while preserving joint kinematics
- Reported average error of 3.6 mm for bones and 8.8 mm for organs, with proposed medical uses such as positioning, iso-centering, and collision prediction
- Context
- Builds on SMPL's skinned shape-model architecture (Loper) and skeletal-from-surface work like OSSO (Keller), extending statistical body models to include organs and bones for clinical use. Builds on: SMPL: A Skinned Multi-Person Linear Model · OSSO: Obtaining Skeletal Shape from Outside
- Correctness
- Trained and assessed on roughly 300 CT scans with stated millimeter accuracy; population coverage and generalization beyond the training cohort, plus pose/shape extremes, are the caveats to keep in mind for the medical claims.
- Clarity
- Accessible if familiar with SMPL-style models; a first pass conveys the integrated-anatomy idea, a second pass covers the probabilistic PCA and registration.
- How to read it
- First pass for what the model integrates and the reported accuracies; do a second pass on the pose-normalized PCA and CT registration if you intend to use or extend it clinically.
Muscles / Skinning
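For readers new to statistical shape models, the decode step of a linear shape space and the kind of per-vertex error quoted above can be sketched in a few lines. This is plain linear reconstruction (mean plus weighted components) on made-up data. It does not fit a probabilistic PCA model and does not perform pose normalization, and the function names are hypothetical.

```python
import math


def reconstruct(mean_shape, components, coefficients):
    """Return mean + sum_i coefficient_i * component_i over flattened xyz values."""
    shape = list(mean_shape)
    for coefficient, component in zip(coefficients, components):
        for k, value in enumerate(component):
            shape[k] += coefficient * value
    return shape


def mean_vertex_error(shape_a, shape_b):
    """Average Euclidean distance between corresponding 3D vertices."""
    count = len(shape_a) // 3
    total = 0.0
    for i in range(count):
        total += math.dist(shape_a[3 * i:3 * i + 3], shape_b[3 * i:3 * i + 3])
    return total / count


# Toy model: two vertices (flattened xyz, in mm) and one shape component.
mean_shape = [0.0, 0.0, 0.0, 10.0, 0.0, 0.0]
components = [[1.0, 0.0, 0.0, 0.0, 1.0, 0.0]]
subject = reconstruct(mean_shape, components, [2.0])
error_mm = mean_vertex_error(mean_shape, subject)
```

The reported 3.6 mm and 8.8 mm figures are averages of this kind of distance, taken over bone and organ vertices respectively on the registered dataset.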
- C·ASE: Learning Conditional Adversarial Skill Embeddings for Physics-based Characters SIGGRAPH Asia Academic 77 cites
Divides heterogeneous motion into homogeneous subsets for training a conditional adversarial model that controls diverse skill behaviors.
abstract
We present C·ASE, an efficient and effective framework that learns Conditional Adversarial Skill Embeddings for physics-based characters. C·ASE enables the physically simulated character to learn a diverse repertoire of skills while providing controllability in the form of direct manipulation of the skills to be performed. This is achieved by dividing the heterogeneous skill motions into distinct subsets containing homogeneous samples for training a low-level conditional model to learn the conditional behavior distribution. The skill-conditioned imitation learning naturally offers explicit control over the character’s skills after training. Training incorporates focal skill sampling, skeletal residual forces, and element-wise feature masking to, respectively, balance diverse skills of varying complexity, mitigate dynamics mismatch so that agile motions can be mastered, and capture more general behavior characteristics. Once trained, the conditional model can produce highly diverse and realistic skills, outperforming state-of-the-art models, and can be repurposed in various downstream tasks. In particular, the explicit skill control handle allows a high-level policy or a user to direct the character with desired skill specifications, which we demonstrate is advantageous for interactive character animation.
Related CALM: Conditional Adversarial Latent Models for Directable Virtual Characters · DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills · Character Controllers Using Motion VAEs · DReCon: Data-Driven Responsive Control of Physics-Based Characters
how to read this
- Category
- Method: conditional adversarial skill embedding for physics-based character control
- Contributions
- Learns conditional adversarial skill embeddings (C·ASE) so a simulated character acquires a diverse skill repertoire with direct, explicit skill control
- Divides heterogeneous skill motions into homogeneous subsets to train a low-level conditional model of the skill behavior distribution
- Introduces focal skill sampling, skeletal residual forces, and element-wise feature masking to balance varied skills, ease dynamics mismatch, and capture general behavior
- Context
- Extends the adversarial skill embedding line of work, directly building on ASE (Peng et al. 2022) by adding explicit skill conditioning. Builds on: ASE: Large-Scale Reusable Adversarial Skill Embeddings for Physically Simulated Characters
- Correctness
- Claims to outperform prior models and to be repurposable downstream, but gains rest on the assumption that heterogeneous motions can be cleanly partitioned into homogeneous subsets, so reusability depends on the quality of that division.
- Clarity
- Fairly accessible at a high level; a first pass conveys the conditioning idea, a second pass is needed for the sampling, residual-force, and masking mechanics.
- How to read it
- Focus on how the skill subsets are formed and how the conditioning handle is exposed to a high-level policy; a second pass pays off for the three training mechanisms (focal sampling, residual forces, feature masking).
Motion Synthesis
- CALM: Conditional Adversarial Latent Models for Directable Virtual Characters SIGGRAPH Academic 131 cites
Jointly learns a physics-based control policy and motion encoder for diverse, directable character behavior from mocap via imitation learning.
abstract
In this work, we present Conditional Adversarial Latent Models (CALM), an approach for generating diverse and directable behaviors for user-controlled interactive virtual characters. Using imitation learning, CALM learns a representation of movement that captures the complexity and diversity of human motion, and enables direct control over character movements. The approach jointly learns a control policy and a motion encoder that reconstructs key characteristics of a given motion without merely replicating it. The results show that CALM learns a semantic motion representation, enabling control over the generated motions and style-conditioning for higher-level task training. Once trained, the character can be controlled using intuitive interfaces, akin to those found in video games.
Related ASE: Large-Scale Reusable Adversarial Skill Embeddings for Physically Simulated Characters · C·ASE: Learning Conditional Adversarial Skill Embeddings for Physics-based Characters · PADL: Language-Directed Physics-Based Character Control · SuperPADL: Scaling Language-Directed Physics-Based Control with Progressive Supervised Distillation
how to read this
- Category
- Method: conditional adversarial latent model for directable physics-based characters
- Contributions
-
- Generates diverse, directable behaviors for user-controlled interactive characters via imitation learning
- Jointly learns a control policy and a motion encoder that reconstructs key motion characteristics without merely replicating them
- Yields a semantic motion representation enabling direct control and style-conditioning for higher-level task training
- Context
- Part of the adversarial skill embedding lineage, building on ASE (Peng et al. 2022) toward game-style directable control. Builds on: ASE: Large-Scale Reusable Adversarial Skill Embeddings for Physically Simulated Characters
- Correctness
- Demonstrated on mocap-driven character control with game-like interfaces; the claim of a semantic representation depends on the encoder capturing motion characteristics rather than overfitting, and directability quality is shown qualitatively.
- Clarity
- Accessible, framed around familiar game-control intuitions; a first pass conveys the idea, a second pass clarifies the joint policy and encoder objective.
- How to read it
- Focus on how the encoder and policy are trained jointly and what makes the latent space directable; a second pass is worth it to see how style-conditioning feeds higher-level tasks.
Motion Synthesis
-
Digital Domain adapted feature-film facial capture technology to handle 30 hours of performance for The Quarry, automating emotion transfer onto digital puppets at game-scale workloads.
Facial / Retargeting
-
Reformulates speech-driven facial animation as code-query in a learned discrete codebook of realistic facial motion priors, reducing cross-modal mapping uncertainty.
abstract
Speech-driven 3D facial animation has been widely studied, yet there is still a gap to achieving realism and vividness due to the highly ill-posed nature and scarcity of audio-visual data. Existing works typically formulate the cross-modal mapping into a regression task, which suffers from the regression-to-mean problem leading to over-smoothed facial motions. In this paper, we propose to cast speech-driven facial animation as a code query task in a finite proxy space of the learned codebook, which effectively promotes the vividness of the generated motions by reducing the cross-modal mapping uncertainty. The codebook is learned by self-reconstruction over real facial motions and thus embedded with realistic facial motion priors. Over the discrete motion space, a temporal autoregressive model is employed to sequentially synthesize facial motions from the input speech signal, which guarantees lip-sync as well as plausible facial expressions. We demonstrate that our approach outperforms current state-of-the-art methods both qualitatively and quantitatively. Also, a user study further justifies our superiority in perceptual quality. Code and video demo are available at https://doubiiu.github.io/projects/codetalker.
Related FaceFormer: Speech-Driven 3D Facial Animation with Transformers · FaceDiffuser: Speech-Driven 3D Facial Animation Synthesis Using Diffusion · ProbTalk3D: Non-Deterministic Emotion Controllable Speech-Driven 3D Facial Animation Synthesis Using VQ-VAE · SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
how to read this
- Category
- Method: speech-driven 3D facial animation via a discrete motion prior
- Contributions
-
- Recasts speech-driven facial animation as a code-query task over a learned finite codebook instead of a regression task
- Learns the codebook by self-reconstruction of real facial motions, embedding realistic facial motion priors
- Uses a temporal autoregressive model over the discrete space to synthesize lip-synced, plausible facial motion from speech
- Context
- Follows transformer-based speech-to-face work such as FaceFormer (Fan et al. 2022), replacing direct regression with a discrete codebook prior. Builds on: FaceFormer: Speech-Driven 3D Facial Animation with Transformers
- Correctness
- Reported to beat state of the art qualitatively, quantitatively, and in a user study; the approach targets the regression-to-mean over-smoothing problem, but quality is bounded by the codebook expressiveness and the scarce audio-visual data it notes.
- Clarity
- Readable; the code-query framing is intuitive on a first pass, with a second pass needed for the codebook learning and autoregressive decoding.
- How to read it
- Focus on why discretization reduces cross-modal uncertainty and how the codebook is learned; a second pass pays off for the autoregressive synthesis and the lip-sync evaluation.
Facial / Motion Synthesis
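The code-query step described in the entry above amounts to a nearest-neighbour lookup in a learned codebook. A minimal sketch, assuming a toy two-dimensional codebook and squared Euclidean distance; the names `nearest_code` and `quantize_sequence` and all values are hypothetical, and the speech-conditioned autoregressive model that would produce the input features is not shown:

```python
def nearest_code(feature, codebook):
    """Return the index of the codebook entry closest to `feature`."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(feature, codebook[i]))


def quantize_sequence(features, codebook):
    """Map each per-frame feature to its nearest code (index and vector)."""
    indices = [nearest_code(f, codebook) for f in features]
    return indices, [codebook[i] for i in indices]


# Hypothetical toy codebook; in practice it would be learned by
# self-reconstruction over real facial motion.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
features = [[0.9, 0.1], [0.1, 0.8], [0.2, 0.1]]
indices, quantized = quantize_sequence(features, codebook)
```

Every output frame is drawn from the finite codebook, so a prediction cannot settle on an averaged, in-between value. That is one way to read the abstract's argument against regression-to-mean over-smoothing.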
-
GAN-based decoupled multi-discriminator framework learns composite body-part motions from multiple references without manual annotation.
abstract
We present a deep learning method for composite and task-driven motion control for physically simulated characters. In contrast to existing data-driven approaches using reinforcement learning that imitate full-body motions, we learn decoupled motions for specific body parts from multiple reference motions simultaneously and directly by leveraging the use of multiple discriminators in a GAN-like setup. In this process, there is no need of any manual work to produce composite reference motions for learning. Instead, the control policy explores by itself how the composite motions can be combined automatically. We further account for multiple task-specific rewards and train a single, multi-objective control policy. To this end, we propose a novel framework for multi-objective learning that adaptively balances the learning of disparate motions from multiple sources and multiple goal-directed control objectives. In addition, as composite motions are typically augmentations of simpler behaviors, we introduce a sample-efficient method for training composite control policies in an incremental manner, where we reuse a pre-trained policy as the meta policy and train a cooperative policy that adapts the meta one for new composite tasks.
Related PFPN: Continuous Control of Physically Simulated Characters using Particle Filtering Policy Network · CALM: Conditional Adversarial Latent Models for Directable Virtual Characters · QuestSim: Human Motion Tracking from Sparse Sensors with Simulated Avatars · ReGAIL: Toward Agile Character Control From a Single Reference Motion
how to read this
- Category
- Method: composite, task-driven motion control for physics-based characters
- Contributions
-
- Learns decoupled per-body-part motions from multiple reference clips simultaneously using a multi-discriminator GAN-like setup, with no manual composite references
- Proposes a multi-objective learning framework that adaptively balances disparate motion sources and goal-directed control objectives in one policy
- Adds a sample-efficient incremental scheme that reuses a pre-trained policy as a meta policy and trains a cooperative policy for composite behaviors
- Context
- Built on the adversarial motion prior approach, extending AMP (Peng et al. 2021) from full-body imitation to decoupled, composable body-part control. Builds on: AMP: Adversarial Motion Priors for Stylized Physics-Based Character Control
- Correctness
- The control policy is left to discover how to combine motions automatically, which avoids annotation but means composition emerges from training rather than being guaranteed; demonstrated on physically simulated characters with multiple tasks.
- Clarity
- Moderately technical; a first pass conveys the multi-discriminator idea, a second pass is needed for the multi-objective balancing and incremental meta-policy scheme.
- How to read it
- Focus on how multiple discriminators map to body parts and how objectives are balanced; a second pass is worthwhile for the incremental, sample-efficient training and what 'cooperative policy' means.
Motion Synthesis
-
Curve-based fabric authoring workflow supporting custom weave patterns and fiber thicknesses from fine threads to thick yarns, used for Strange World's costumes.
abstract
In Walt Disney Animation Studios’ "Strange World", the handmade quality of the costumes was a creative need for representing the technological limits of the setting, Avalonia. We expanded and standardized our curve-based fabric authoring workflows to provide simplified artistic controls, handle complex woven patterning, and support common finishing techniques like hems and stitches.
Related Directing Cloth Draping through Blended UVs · Untangling Cloth · Art-Directed Costumes at Pixar: Design, Tailoring, and Simulation in Production · SNUG: Self-Supervised Neural Dynamic Garments
how to read this
- Category
- Production talk: curve-based garment authoring workflow
- Contributions
-
- Demonstrates an expanded, standardized curve-based fabric authoring workflow with simplified artistic controls
- Handles complex woven patterning and varied fiber thicknesses, from fine threads to thick yarns
- Supports common finishing techniques such as hems and stitches, used for the costumes in Strange World
- Context
- Continues Disney's curve-based cloth and fiber authoring practice, building on the Encanto embroidery and cloth fiber workflows (Velasquez 2022). Builds on: Embroidery and Cloth Fiber Workflows on Disney's Encanto
- Correctness
- Studio practice, not peer-reviewed; results are production-proven on Strange World, where the handmade-costume look was a deliberate creative goal, so transferability to other pipelines is not the claim.
- Clarity
- Accessible and practitioner-oriented; a single read conveys the workflow intent, with detail living in the demonstrated tooling.
- How to read it
- Read for the artist-facing controls and how weave patterns and finishing are authored from curves; one pass suffices unless you are replicating the fiber-authoring approach.
CFX
-
Houdini-supplemented XGen groom pipeline for Strange World's dog Legend, translating French-Belgian comic book fur aesthetics into a production groom.
abstract
Translating the design language of Walt Disney Animation Studios’ "Strange World" onto a fur groom for the Clade family’s lovable dog, Legend, at a level of scrutiny that is often reserved for our human characters led us to exploration outside of our regular workflows. The groom had to adhere to the artistic vision of the Visual Development and Animation teams while still working within our pipeline and staying consumable by downstream departments. We were able to leverage Houdini to supplement Disney’s XGen to meet the aesthetic needs of the show. With Houdini we were able to quickly put down fur similar to our map-based grooming methods and because we folded that into a similar workflow as our in-house hierarchical grooming tool, Tonic, we were able to maintain the ability to fine-tune at a more granular level. Through thoughtful collaboration we were able to deliver a fun and unique groom while exploring new techniques that will open up new possibilities for grooming workflows moving forward.
Related Embroidery and Cloth Fiber Workflows on Disney's Encanto · Creatures in Houdini | Ahmed Gharraph | FMX 2019 · Feathers: From Model to Groom to Render | nineteentwenty | Character FX & Crowds Production Talks · Framestore Creatures & Houdini | Framestore | Character FX & Crowds Production Talks
how to read this
- Category
- Production talk: art-directed fur groom pipeline
- Contributions
-
- Shows a groom for the dog Legend that meets a comic-book fur aesthetic at human-character levels of scrutiny
- Leverages Houdini to supplement Disney's XGen, quickly laying down fur similar to map-based grooming methods
- Folds the Houdini approach into a workflow akin to the in-house hierarchical tool Tonic to keep granular fine-tuning and downstream consumability
- Context
- Extends Disney's art-directed hair practice, building on hierarchical controls for art-directed hair (Kaur 2018) and the in-house Tonic tool. Builds on: Hierarchical Controls for Art-Directed Hair at Disney
- Correctness
- Studio practice, not peer-reviewed; results are production-proven on Strange World, and the techniques are framed as exploratory openings for future workflows rather than a general groom solution.
- Clarity
- Accessible to grooming practitioners; one read conveys the Houdini-plus-XGen integration and the downstream constraints it had to respect.
- How to read it
- Read for how Houdini was reconciled with XGen and Tonic while keeping downstream departments happy; one pass suffices unless you groom in this pipeline.
CFX
-
Crowd optimization pipeline for Cabinet of Curiosities, detailing KineFX-based agent replacement workflows and render-time procedural techniques for fur-bearing crowd characters.
CFX / Retargeting
-
CT scanning of real-world wigs creates density volumes used to extract guide strands and populate dense hair via neural interpolation across diverse hairstyles.
CFX
-
Decomposes cloth deformation into static skinning, coarse dynamic, and wrinkle components predicted by three sequential network stages.
abstract
We propose a three‐stage network that utilizes a skinning‐based model to accurately predict dynamic cloth deformation. Our approach decomposes cloth deformation into three distinct components: static, coarse dynamic, and wrinkle dynamic components. To capture these components, we train our three‐stage network accordingly. In the first stage, the static component is predicted by constructing a static skinning model that incorporates learned joint increments and skinning weight increments. Then, in the second stage, the coarse dynamic component is added to the static skinning model by incorporating serialized skeleton information. Finally, in the third stage, the mesh sequence stage refines the prediction by incorporating the wrinkle dynamic component using serialized mesh information. We have implemented our network and used it in a Unity game scene, enabling real‐time prediction of cloth dynamics. Our implementation achieves impressive prediction speeds of approximately 3.65ms using an NVIDIA GeForce RTX 3090 GPU and 9.66ms on an Intel i7‐7700 CPU. Compared to SOTA methods, our network excels in accurately capturing fine dynamic cloth deformations.
Related NeuroSkinning: Automatic Skin Binding for Production Characters with Deep Graph Networks · A Statistical Model of Human Pose and Body Shape · NiLBS: Neural Inverse Linear Blend Skinning · Learning Skeletal Articulations with Neural Blend Shapes
how to read this
- Category
- Method: skinning-based learned cloth dynamics prediction
- Contributions
-
- Decomposes cloth deformation into static, coarse dynamic, and wrinkle dynamic components predicted by a three-stage network
- Builds a static skinning model with learned joint and skinning-weight increments, then adds coarse dynamics from serialized skeleton information
- Refines wrinkle detail from serialized mesh information and runs in real time, demonstrated in a Unity scene on both GPU and CPU
- Context
- Sits in the neural cloth deformation line of work, building on Neural Cloth Simulation (Bertiche et al. 2022) with an explicit skinning-based decomposition. Builds on: Neural Cloth Simulation
- Correctness
- Reports real-time speeds and finer detail than prior methods, but the staged decomposition assumes deformation separates cleanly into static, coarse, and wrinkle parts, and the demonstrated speeds are tied to the cited hardware.
- Clarity
- Reasonably clear given the explicit three-stage structure; a first pass conveys the pipeline, a second pass is needed for the per-stage inputs and increments.
- How to read it
- Focus on what each of the three stages consumes and predicts; a second pass pays off for the learned skinning increments and how serialized skeleton vs mesh data drive coarse vs wrinkle detail.
CFX / ML Deformation / Skinning
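The static / coarse / wrinkle decomposition in the entry above can be pictured as a skinned base position plus two residual offsets. A minimal sketch, assuming 2D linear blend skinning and a purely additive composition; both are simplifications chosen for illustration, the function names are hypothetical, and in the paper each component is predicted by a network stage rather than supplied by hand:

```python
import math


def skin_vertex(rest, transforms, weights):
    """2D linear blend skinning: weighted sum of per-joint rigid transforms.

    Each transform is (angle_radians, tx, ty); weights should sum to 1.
    """
    x = y = 0.0
    for (angle, tx, ty), w in zip(transforms, weights):
        c, s = math.cos(angle), math.sin(angle)
        x += w * (c * rest[0] - s * rest[1] + tx)
        y += w * (s * rest[0] + c * rest[1] + ty)
    return (x, y)


def compose(static, coarse, wrinkle):
    """Assumed additive composition of the three components."""
    return tuple(s + c + w for s, c, w in zip(static, coarse, wrinkle))


# Hypothetical values: one vertex influenced equally by two joints.
static = skin_vertex((1.0, 0.0), [(0.0, 0.0, 0.0), (0.0, 2.0, 0.0)], [0.5, 0.5])
final = compose(static, (0.1, 0.0), (0.0, 0.01))
```

The point of the split is that the skinned term is cheap and pose-driven, while the two residuals carry progressively finer, history-dependent detail.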
-
SIGGRAPH Asia 2023 course on OpenUSD best practices covering proceduralism, Hydra 2.0, NVIDIA Omniverse Fabric GPU interoperability, and end-to-end pipeline case studies.
abstract
OpenUSD enables digital transformation in 3D workflows by providing interoperable scene representation across diverse systems. The framework addresses integration, data management, and compatibility challenges in production pipelines through standardized scene data, Hydra rendering, and extensible material systems.
Related Universal Scene Description: Open Source Release · A Deep Dive into Universal Scene Description and Hydra · Sony Imageworks Animation Layout Workflow with Unreal Engine and OpenUSD · USD and Scene Interoperability: Demystifying the State of the Art
how to read this
- Category
- Course / tutorial: OpenUSD pipeline best practices
- Contributions
-
- Surveys OpenUSD best practices for end-to-end data pipelines and interoperable scene representation
- Covers proceduralism, Hydra 2.0 rendering, and NVIDIA Omniverse Fabric GPU interoperability
- Grounds the material in end-to-end pipeline case studies addressing integration, data management, and compatibility
- Context
- A course consolidating practice around Universal Scene Description, building on the USD open-source release (Pixar 2016) and prior USD and Hydra deep dives (Elkoura 2019). Builds on: Universal Scene Description: Open Source Release · A Deep Dive into Universal Scene Description and Hydra
- Correctness
- Course material rather than a single validated result; it reflects accumulated production guidance, so usefulness depends on how closely a reader's pipeline matches the case studies presented.
- Clarity
- Accessible and tutorial-style; a single read orients you, with case studies serving as deeper reference when you implement.
- How to read it
- Read selectively by topic (proceduralism, Hydra 2.0, Fabric interop) for the parts matching your pipeline; treat the case studies as second-pass reference when implementing.
Rigging
- talk Designing Feathers Using Houdini at FOLKS | Amelie Goursat | Paris HIVE 2023 Houdini Industrial
FOLKS VFX artist presents a Houdini-native procedural pipeline for creating realistic bird plumage, covering feather design, grooming, and integration into creature shots.
abstract
A FOLKS VFX grooming artist demonstrates a Houdini-native procedural pipeline for creating realistic bird plumage, built on SideFX's feather digital assets (feather generator, feather weight, feather groom, feather unpack, feather detangle and feather deform). The workflow begins with heavy reference analysis to identify each feather type and growth zone, then paints density masks tied to FID/fgroup attributes, generates guide curves per feather group, and populates render, mid-res and proxy feather variations on the skin. Proxy feathers are detangled, grouped and exported for rigging, while high-res feathers are wrapped onto the animated proxy using the feather deform node. For shots with many birds, an automatic deform setup propagates feathers without per-bird CFX, with Vellum reserved for cases needing true simulation, procedural noise for wind, and automatic open/closed-wing blendshapes to fix interpenetration on folded wings.
Related CFX Cloth and Hair at Unit Image · Creatures in Houdini | Ahmed Gharraph | FMX 2019 · Hummingbird: DreamWorks Feather System · Feathers: From Model to Groom to Render | nineteentwenty | Character FX & Crowds Production Talks
how to read this
- Category
- Production talk: Houdini-native procedural feather pipeline
- Contributions
-
- Demonstrates a Houdini-native plumage pipeline built on SideFX feather digital assets (generator, weight, groom, unpack, detangle, deform)
- Drives density masks from FID/fgroup attributes and guide curves per feather group, populating render, mid-res, and proxy variations on the skin
- Wraps high-res feathers onto an animated proxy and propagates many-bird shots automatically, reserving Vellum for true simulation and using blendshapes to fix folded-wing interpenetration
- Context
- Applies SideFX's Houdini feather toolset in a VFX creature-grooming context, relating to procedural fur and feather authoring practice.
- Correctness
- Studio practice, not peer-reviewed; results are production-proven at FOLKS, and choices like proxy-based deform versus per-bird Vellum reflect shot-specific tradeoffs rather than a general rule.
- Clarity
- Accessible and demo-driven for Houdini users; one viewing conveys the node-by-node workflow, with the heavy reference-analysis step setting up everything else.
- How to read it
- Watch for the order of operations (reference analysis, density masks, guides, proxy vs high-res, deform wrap) and when simulation is and is not used; one pass suffices unless you are rebuilding the setup.
CFX
-
Learning-based clothing deformation method that generates rich, plausible garment detail across body shapes and animations within a single unified framework.
abstract
This learning-based clothing deformation method generates rich, plausible detailed deformations for garments worn by bodies of varying shapes across diverse animations using a single unified framework, avoiding the many specialized models that prior methods require for different garment topologies or poses. The authors observe that the fit between garment and body strongly influences the degree of folds, and design an attribute parser that produces detail-aware encodings injected into a graph neural network to sharpen detail discrimination under varied attributes. Experiments show improved generalization and detail quality compared with existing learning-based approaches.
Related MeshGraphNetRP: Improving Generalization of GNN-based Cloth Simulation · SNUG: Self-Supervised Neural Dynamic Garments · SwinGar: Spectrum-Inspired Neural Dynamic Deformation for Free-Swinging Garments · Creating Curve-Based Garments with Custom Weave Patterns
how to read this
- Category
- Method: learning-based detailed clothing deformation
- Contributions
-
- Generates rich, plausible detailed garment deformations across varied body shapes and diverse animations within a single unified framework
- Introduces an attribute parser that produces detail-aware encodings injected into a graph neural network to sharpen detail discrimination
- Exploits the observation that garment-to-body fit governs fold intensity, improving generalization and detail over prior learning-based methods
- Context
- Continues learning-based clothing animation work such as virtual try-on garment animation (Santesteban et al. 2019), aiming to replace many specialized per-topology or per-pose models with one framework. Builds on: Learning-Based Animation of Clothing for Virtual Try-On
- Correctness
- Experiments report better generalization and detail than prior learning-based approaches, but the gains hinge on the fit-to-fold assumption and the attribute parser generalizing beyond the attributes seen in training.
- Clarity
- Moderately technical; a first pass conveys the unified, attribute-aware idea, a second pass is needed for the parser design and graph network details.
- How to read it
- Focus on what multi-source attributes feed the parser and how detail-aware encodings condition the GNN; a second pass pays off for the generalization comparisons against specialized models.
ML Deformation / CFX
-
Transformer diffusion model paired with Jukebox audio features for physically plausible, editable music-driven dance generation.
abstract
Dance is an important human art form, but creating new dances can be difficult and time-consuming. In this work, we introduce Editable Dance GEneration (EDGE), a state-of-the-art method for editable dance generation that is capable of creating realistic, physically-plausible dances while remaining faithful to the input music. EDGE uses a transformer-based diffusion model paired with Jukebox, a strong music feature extractor, and confers powerful editing capabilities well-suited to dance, including joint-wise conditioning, and in-betweening. We introduce a new metric for physical plausibility, and evaluate dance quality generated by our method extensively through (1) multiple quantitative metrics on physical plausibility, beat alignment, and diversity benchmarks, and more importantly, (2) a large-scale user study, demonstrating a significant improvement over previous state-of-the-art methods. Qualitative samples from our model can be found at our website.
Related Human Motion Diffusion Model · Executing Your Commands via Motion Diffusion in Latent Space · MotionCLIP: Exposing Human Motion Generation to CLIP Space · CLoSD: Closing the Loop between Simulation and Diffusion for Multi-Task Character Control
how to read this
- Category
- Method: music-driven dance generation (transformer diffusion)
- Contributions
-
- EDGE, a transformer-based diffusion model conditioned on Jukebox music features for realistic, physically plausible dance
- Editing capabilities suited to dance, including joint-wise conditioning and in-betweening
- A new physical-plausibility metric, plus quantitative benchmarks and a large-scale user study
- Context
- Sits in the motion-diffusion lineage, building on the Human Motion Diffusion Model (Tevet et al. 2022) and pairing it with a strong pretrained music feature extractor (Jukebox) for the music-to-motion task. Builds on: Human Motion Diffusion Model
- Correctness
- Plausibility is judged by a newly introduced metric plus a user study, so weigh the proposed metric against the subjective study rather than treating it as ground truth; physical plausibility here is a learned, evaluated proxy, not a simulated guarantee.
- Clarity
- Accessible at a high level; a first pass conveys the conditioning-plus-editing idea, a second pass is needed for the diffusion formulation and metric definition.
- How to read it
- Focus on the editing primitives (joint-wise conditioning, in-betweening) and the physical-plausibility metric; a second pass pays off if you care about how the metric is defined and validated against the user study.
Motion Synthesis
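The two editing primitives named in the entry above, joint-wise conditioning and in-betweening, can both be expressed as masks over a (frame, joint) grid. A minimal sketch, assuming the common diffusion-editing pattern of overwriting constrained entries with known values; the function names are hypothetical, and a real sampler would apply the overwrite at each denoising step with the known values noised to the current level, which is not shown:

```python
def inbetween_mask(num_frames, num_joints, keep_head=1, keep_tail=1):
    """True where a frame is fixed: the first and last few frames."""
    return [
        [(t < keep_head or t >= num_frames - keep_tail)] * num_joints
        for t in range(num_frames)
    ]


def jointwise_mask(num_frames, num_joints, fixed_joints):
    """True where a joint is fixed, in every frame."""
    return [[j in fixed_joints for j in range(num_joints)] for _ in range(num_frames)]


def blend(generated, known, mask):
    """Keep known values where the mask is True, generated values elsewhere."""
    return [
        [k if m else g for g, k, m in zip(g_row, k_row, m_row)]
        for g_row, k_row, m_row in zip(generated, known, mask)
    ]


# Hypothetical toy motion: 3 frames, 2 joints, one scalar per joint.
generated = [[0, 0], [0, 0], [0, 0]]
known = [[1, 1], [2, 2], [3, 3]]
inbetweened = blend(generated, known, inbetween_mask(3, 2))
upper_fixed = blend(generated, known, jointwise_mask(3, 2, {0}))
```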
-
Training-free generative motion matching framework that synthesizes diverse, high-quality motions from one or a few example sequences.
abstract
We present GenMM, a generative model that "mines" as many diverse motions as possible from a single or few example sequences. In stark contrast to existing data-driven methods, which typically require long offline training time, are prone to visual artifacts, and tend to fail on large and complex skeletons, GenMM inherits the training-free nature and the superior quality of the well-known Motion Matching method. GenMM can synthesize a high-quality motion within a fraction of a second, even with highly complex and large skeletal structures. At the heart of our generative framework lies the generative motion matching module, which utilizes the bidirectional visual similarity as a generative cost function to motion matching, and operates in a multi-stage framework to progressively refine a random guess using exemplar motion matches. In addition to diverse motion generation, we show the versatility of our generative framework by extending it to a number of scenarios that are not possible with motion matching alone, including motion completion, key frame-guided generation, infinite looping, and motion reassembly.
Related Neural Animation Layering for Synthesizing Martial Arts Movements · Automated Extraction and Parameterization of Motions in Large Data Sets · Learning Robust and Scalable Motion Matching with Lipschitz Continuity and Sparse Mixture of Experts · Interactive Character Control with Auto-Regressive Motion Diffusion Models
how to read this
- Category
- Method: example-based motion synthesis (training-free generative framework)
- Contributions
-
- GenMM, a training-free generative framework that mines diverse motions from one or a few example sequences
- A generative motion matching module using bidirectional visual similarity as a generative cost in a multi-stage refinement
- Versatile extensions: motion completion, keyframe-guided generation, infinite looping, and motion reassembly
- Context
- Combines the training-free, high-quality nature of Motion Matching (Clavet 2016) with generative synthesis, positioned as an alternative to learned data-driven approaches like the Human Motion Diffusion Model (Tevet et al. 2022). Builds on: Motion Matching and The Road to Next-Gen Animation · Human Motion Diffusion Model
- Correctness
- Works from very limited examples and claims fast, artifact-resistant synthesis on complex skeletons; keep in mind it mines variation from the given exemplars, so output diversity is bounded by what the example sequences contain.
- Clarity
- Fairly accessible if you know motion matching; a first pass conveys the analogy, a second pass clarifies the bidirectional-similarity cost and the multi-stage scheme.
- How to read it
- Anchor on the contrast with both classic motion matching and learned diffusion; a second pass is worth it to understand the generative cost function and how the multi-stage refinement avoids artifacts.
Motion Synthesis
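The bidirectional similarity used as a cost in the entry above rewards a synthesized motion for containing only patches that appear in the exemplar (coherence) and for covering all of the exemplar's patches (completeness). A minimal sketch, assuming scalar frames, fixed-size temporal patches, and an unweighted sum of the two terms; the names and the patch size are hypothetical, and the paper's normalization and multi-stage refinement are not reproduced:

```python
def patches(seq, size):
    """All contiguous windows of `size` frames."""
    return [seq[i:i + size] for i in range(len(seq) - size + 1)]


def patch_dist(a, b):
    """Squared Euclidean distance between two equally sized patches."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def bidirectional_cost(synth, exemplar, size=3):
    ps, pe = patches(synth, size), patches(exemplar, size)
    # Coherence: every synthesized patch should resemble some exemplar patch.
    coherence = sum(min(patch_dist(p, q) for q in pe) for p in ps) / len(ps)
    # Completeness: every exemplar patch should be represented in the synthesis.
    completeness = sum(min(patch_dist(q, p) for p in ps) for q in pe) / len(pe)
    return coherence + completeness


exemplar = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
same = bidirectional_cost(exemplar, exemplar)
truncated = bidirectional_cost(exemplar[:4], exemplar)
```

In this toy case the truncated sequence is fully coherent but incomplete, so only the completeness term contributes to its cost.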
-
Motion Latent Diffusion (MLD) compresses motion into a VAE latent space before diffusion, achieving two-orders-of-magnitude speedup over raw-sequence diffusion.
abstract
We study a challenging task, conditional human motion generation, which produces plausible human motion sequences according to various conditional inputs, such as action classes or textual descriptors. Since human motions are highly diverse and have a property of quite different distribution from conditional modalities, such as textual descriptors in natural languages, it is hard to learn a probabilistic mapping from the desired conditional modality to the human motion sequences. Besides, the raw motion data from the motion capture system might be redundant in sequences and contain noises; directly modeling the joint distribution over the raw motion sequences and conditional modalities would need a heavy computational over-head and might result in artifacts introduced by the captured noises. To learn a better representation of the various human motion sequences, we first design a powerful Variational AutoEncoder (VAE) and arrive at a representative and low-dimensional latent code for a human motion sequence. Then, instead of using a diffusion model to establish the connections between the raw motion sequences and the conditional inputs, we perform a diffusion process on the motion latent space.
Related Human Motion Diffusion Model · Generating Diverse and Natural 3D Human Motions from Text · TEMOS: Generating Diverse Human Motions from Textual Descriptions · CLoSD: Closing the Loop between Simulation and Diffusion for Multi-Task Character Control
how to read this
- Category
- Method: conditional human motion generation (latent-space diffusion)
- Contributions
-
- Motion Latent Diffusion (MLD), running the diffusion process in a learned motion latent space rather than on raw sequences
- A VAE that compresses motion into a representative low-dimensional latent code
- Large reported speedup over raw-sequence diffusion while supporting text- and action-conditioned generation
- Context
- Extends the motion-diffusion line from the Human Motion Diffusion Model (tevet-mdm-2022) by moving diffusion into a VAE latent space, echoing the latent-diffusion idea from image generation. Builds on: Human Motion Diffusion Model
- Correctness
- Assumes a well-behaved VAE latent captures motion adequately; the efficiency gain and denoising rely on that compression, so latent quality bounds fidelity and the speedup claims should be read against the conditioning tasks evaluated.
- Clarity
- Accessible if you know latent diffusion; a first pass conveys the compress-then-diffuse idea, a second pass clarifies the VAE design and conditioning.
- How to read it
- Focus on why diffusing in latent space helps (cost and noise reduction) and on the VAE; a second pass pays off for the conditioning mechanism and the efficiency comparison.
Motion Synthesis
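The compress-then-diffuse idea behind MLD can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the linear noise schedule, the tensor sizes, and the function names `linear_alpha_bar` and `q_sample` are all assumptions chosen for the example, and the random vector `z0` merely stands in for a VAE encoding.

```python
import math
import random

def linear_alpha_bar(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed)."""
    alpha_bar, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar

def q_sample(z0, t, alpha_bar, rng):
    """Forward diffusion: z_t = sqrt(a_bar_t) * z0 + sqrt(1 - a_bar_t) * eps."""
    a = alpha_bar[t]
    return [math.sqrt(a) * z + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0) for z in z0]

# Hypothetical sizes: a raw motion clip (frames x features) versus one latent code.
RAW_VALUES = 196 * 263
LATENT_VALUES = 256

rng = random.Random(0)
alpha_bar = linear_alpha_bar(1000)
z0 = [rng.gauss(0.0, 1.0) for _ in range(LATENT_VALUES)]  # stand-in for a VAE encoding
z_noisy = q_sample(z0, 500, alpha_bar, rng)

# Size of the tensor a denoiser would process per step, raw versus latent.
# This is a size comparison only, not a runtime measurement.
values_ratio = RAW_VALUES / LATENT_VALUES
```

The point of the sketch is that the diffusion process itself is unchanged; only the object being noised and denoised shrinks, which is where the efficiency argument comes from.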
- Eyes Without a Face: Integrating Detached Facial Features into Pixar's Character Pipeline SIGGRAPH Pixar 0 cites
Pipeline for Win or Lose characters with unconnected floating facial features, combining 2D graphic expressiveness with 3D environment integration.
abstract
From an asset creation perspective, Pixar’s first long form series, Win or Lose, had a daunting number of featured characters. In addition to the large scope of the project, the design pushed further into the stylistic trend of some recent Pixar films - favoring graphic shape language in facial expressions. The methodology that was so successful in those projects simply wouldn’t scale to the number of characters we needed to deliver. In terms of articulation, we knew that one thing that made wide, round mouths difficult was maintaining smooth surfaces as the topology skewed and distorted around the nose and cheeks. That insight prompted the question: “What if they weren’t connected?” While we pursued this idea with a focus on the advantages in asset creation, when those assets were put in the hands of animators, we found that the approach yielded an unexpected quality to the animation: combining the graphic nature of 2D design in a 3D environment.
Related Example-Based Facial Rigging · Direct Manipulation Blendshapes · Making Souls: Methods and a Pipeline for Volumetric Characters · Reusable Facial Rigging and Animation: Create Once, Use Many
how to read this
- Category
- Production talk / pipeline (Pixar character pipeline)
- Contributions
-
- Demonstrates a character pipeline for floating, unconnected facial features driven by the show's graphic shape language
- Shows how detaching features sidesteps topology distortion around the nose and cheeks for wide, round mouths
- Reports an emergent animation quality: 2D graphic design expressiveness within a 3D environment, at series scale
- Context
- Production work for Pixar's series Win or Lose, extending the graphic-shape-language facial trend from recent Pixar films to a large featured-character roster.
- Correctness
- Studio practice, not peer-reviewed; results are production-proven on this specific series and stylistic target, so the detached-feature approach is tied to that graphic design intent rather than a general-purpose rig solution.
- Clarity
- Highly accessible and motivation-driven; a single read conveys the idea and the scaling problem it solves.
- How to read it
- Read once for the asset-creation and articulation motivation; revisit only if you want the practical handoff details of how detached features behave in animators' hands.
Facial / Rigging
-
Diffusion model for generating diverse and expressive 3D facial animations driven by speech audio input.
abstract
Speech-driven 3D facial animation synthesis has been a challenging task both in industry and research. Recent methods mostly focus on deterministic deep learning methods, meaning that given a speech input, the output is always the same. However, in reality, the non-verbal facial cues that reside throughout the face are non-deterministic in nature. In addition, the majority of approaches focus on 3D vertex-based datasets, and methods that are compatible with existing facial animation pipelines with rigged characters are scarce. To eliminate these issues, we present FaceDiffuser, a non-deterministic deep learning model to generate speech-driven facial animations that is trained with both 3D vertex and blendshape based datasets. Our method is based on the diffusion technique and uses the pre-trained large speech representation model HuBERT to encode the audio input. To the best of our knowledge, we are the first to employ the diffusion method for the task of speech-driven 3D facial animation synthesis. We have run extensive objective and subjective analyses and show that our approach achieves better or comparable results in comparison to the state-of-the-art methods. We also introduce a new in-house dataset that is based on a blendshape-based rigged character. The code and the dataset will be publicly available on the project page.
Related ProbTalk3D: Non-Deterministic Emotion Controllable Speech-Driven 3D Facial Animation Synthesis Using VQ-VAE · FaceFormer: Speech-Driven 3D Facial Animation with Transformers · Capture, Learning, and Synthesis of 3D Speaking Styles · CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior
how to read this
- Category
- Method: speech-driven 3D facial animation (diffusion)
- Contributions
-
- FaceDiffuser, a non-deterministic diffusion model for speech-driven 3D facial animation
- Trained on both 3D vertex and blendshape datasets, targeting compatibility with rigged-character pipelines
- Uses a pretrained HuBERT speech encoder; reports objective and subjective evaluation versus state of the art
- Context
- Builds on speech-driven facial animation work such as Karras et al. (karras-audio-driven-2017) and FaceFormer (faceformer-fan-2022), reframing the task with diffusion to capture non-deterministic facial cues. Builds on: Audio-Driven Facial Animation by Joint End-to-End Learning of Pose and Emotion · FaceFormer: Speech-Driven 3D Facial Animation with Transformers
- Correctness
- The key premise is that facial motion is non-deterministic, so diffusion is appropriate; results are reported as better or comparable to prior work, meaning gains are partly about diversity rather than uniformly higher accuracy, and depend on the training datasets used.
- Clarity
- Accessible; a first pass conveys the deterministic-versus-stochastic motivation and the HuBERT-plus-diffusion setup, a second pass for the training and evaluation details.
- How to read it
- Focus on why non-determinism matters here and on the blendshape compatibility angle; a second pass is worth it to see how the objective and subjective results are balanced.
Facial / Motion Synthesis
-
Fast complementary dynamics simulation using skinning eigenmodes as a reduced subspace for secondary jiggle effects on skinned characters.
abstract
We propose a reduced-space elastodynamic solver that is well suited for augmenting rigged character animations with secondary motion. At the core of our method is a novel deformation subspace based on Linear Blend Skinning that overcomes many of the shortcomings prior subspace methods face. Our skinning subspace is parameterized entirely by a set of scalar weights, which we can obtain through a small, material-aware and rig-sensitive generalized eigenvalue problem. The resulting subspace can easily capture rotational motion and guarantees that the resulting simulation is rotation equivariant. We further propose a simple local-global solver for linear co-rotational elasticity and propose a clustering method to aggregate per-tetrahedra nonlinear energetic quantities. The result is a compact simulation that is fully decoupled from the complexity of the mesh.
Related NeuroSkinning: Automatic Skin Binding for Production Characters with Deep Graph Networks · Position-Based Skinning for Soft Articulated Characters · Data-Driven Physics for Human Soft Tissue Animation · A Statistical Model of Human Pose and Body Shape
how to read this
- Category
- Method: reduced-space elastodynamics for secondary motion
- Contributions
-
- A reduced-space solver that adds secondary motion to rigged character animation
- A Linear Blend Skinning deformation subspace from a material-aware, rig-sensitive generalized eigenvalue problem, giving rotation equivariance
- A local-global solver for co-rotational elasticity plus per-tet clustering for a compact simulation decoupled from mesh complexity
- Context
- Advances complementary/secondary dynamics, building on Complementary Dynamics (zhang-complementary-dynamics-2020) and the skinning-subspace ideas of Fast Automatic Skinning Transformations (jacobson-fast-auto-skinning-2012). Builds on: Fast Automatic Skinning Transformations · Complementary Dynamics
- Correctness
- The subspace is parameterized by scalar skinning weights, which yields rotation equivariance and compactness; as a reduced model it trades full-space accuracy for speed, so fidelity is bounded by the chosen eigenmode subspace and clustering.
- Clarity
- Technical; a first pass conveys the subspace-and-secondary-motion idea, but the eigenproblem and local-global solver need a second and likely third pass.
- How to read it
- Focus first on what the skinning-eigenmode subspace buys (rotation equivariance, mesh-decoupled cost); plan a second pass on the generalized eigenvalue problem and a third on the local-global solver if you intend to implement it.
Skinning / ML Deformation
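One way to see why a skinning subspace parameterized by scalar weights copes well with rotation is that it is closed under rotation: rotating any displacement the subspace can express gives a displacement it can also express, obtained by rotating the handle transforms. The 2D sketch below checks that property numerically. It is an illustration under assumptions, not the paper's solver: the weights, transforms, and helper names are hypothetical, and no eigenvalue problem is solved here.

```python
import math

def apply_affine(T, p):
    """Apply a 2x3 affine matrix T to a 2D point p (homogeneous coordinate 1)."""
    x, y = p
    return (T[0][0] * x + T[0][1] * y + T[0][2],
            T[1][0] * x + T[1][1] * y + T[1][2])

def skinning_displacement(weights, transforms, p):
    """u(p) = sum_i w_i * (T_i [p; 1]) with one scalar weight per handle."""
    ux = uy = 0.0
    for w, T in zip(weights, transforms):
        tx, ty = apply_affine(T, p)
        ux += w * tx
        uy += w * ty
    return (ux, uy)

def rotate(theta, v):
    """Rotate a 2D vector by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

def rotate_transform(theta, T):
    """Left-multiply a 2x3 affine matrix by a rotation: R @ T."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * T[0][j] - s * T[1][j] for j in range(3)],
            [s * T[0][j] + c * T[1][j] for j in range(3)]]

# Hypothetical data: one vertex and two handles with arbitrary scalar weights.
p = (0.3, -1.2)
weights = [0.7, -0.25]
transforms = [[[0.1, 0.0, 0.5], [0.2, -0.3, 0.0]],
              [[0.0, 0.4, -0.1], [0.6, 0.0, 0.2]]]
theta = 0.9

# Rotating the displacement equals the displacement from rotated handles.
rotated_u = rotate(theta, skinning_displacement(weights, transforms, p))
u_from_rotated_handles = skinning_displacement(
    weights, [rotate_transform(theta, T) for T in transforms], p)
```

The equality holds by linearity for any scalar weights, which is why the property does not depend on how the weights were obtained.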
-
Creates animatable, relightable mesh avatars from monocular video using differentiable rasterization and learned blendshapes, compatible with standard rendering pipelines.
abstract
Our goal is to efficiently learn personalized animatable 3D head avatars from videos that are geometrically accurate, realistic, relightable, and compatible with current rendering systems. While 3D meshes enable efficient processing and are highly portable, they lack realism in terms of shape and appearance. Neural representations, on the other hand, are realistic but lack compatibility and are slow to train and render. Our key insight is that it is possible to efficiently learn high-fidelity 3D mesh representations via differentiable rendering by exploiting highly-optimized methods from traditional computer graphics and approximating some of the components with neural networks. To that end, we introduce FLARE, a technique that enables the creation of animatable and relightable mesh avatars from a single monocular video. First, we learn a canonical geometry using a mesh representation, enabling efficient differentiable rasterization and straightforward animation via learned blendshapes and linear blend skinning weights. Second, we follow physically-based rendering and factor observed colors into intrinsic albedo, roughness, and a neural representation of the illumination, allowing the learned avatars to be relit in novel scenes. Since our input videos are captured on a single device with a narrow field of view, modeling the surrounding environment light is non-trivial.
Related PointAvatar: Deformable Point-Based Head Avatars from Videos · Learning an Animatable Detailed 3D Face Model from In-The-Wild Images · Neural Head Avatars from Monocular RGB Videos · SPARK: Self-supervised Personalized Real-time Monocular Face Capture
how to read this
- Category
- Method: animatable, relightable mesh head avatars from monocular video
- Contributions
-
- FLARE, building animatable and relightable mesh head avatars from a single monocular video
- A canonical mesh geometry learned via differentiable rasterization, animated with learned blendshapes and linear blend skinning
- Physically-based appearance factoring colors into albedo, roughness, and a neural illumination representation, keeping pipeline compatibility
- Context
- Responds to monocular neural-avatar work such as Neural Head Avatars from Monocular RGB Videos (neural-head-avatars-grassal-2022), but favors a mesh representation for efficiency, portability, and renderer compatibility. Builds on: Neural Head Avatars from Monocular RGB Videos
- Correctness
- The key insight is approximating some components with neural networks while keeping an efficient mesh and differentiable rasterization; relightability rests on the PBR factoring, so the albedo/roughness/illumination split is an approximation whose quality is demonstrated on monocular captures rather than guaranteed under arbitrary lighting.
- Clarity
- Reasonably accessible if you know differentiable rendering and PBR; a first pass conveys the mesh-versus-neural trade-off, a second pass for the factoring and training.
- How to read it
- Focus on why a mesh is chosen over a neural field (speed, compatibility) and on the intrinsic decomposition; a second pass pays off for the differentiable rasterization and relighting details.
Facial / CFX
- From Skin to Skeleton: Towards Biomechanically Accurate 3D Digital Humans SIGGRAPH Asia Academic 84 cites
SKEL model couples a biomechanical skeleton to a SMPL surface, enabling pose estimation with anatomically constrained joint axes and soft-tissue deformation.
abstract
Great progress has been made in estimating 3D human pose and shape from images and video by training neural networks to directly regress the parameters of parametric human models like SMPL. However, existing body models have simplified kinematic structures that do not correspond to the true joint locations and articulations in the human skeletal system, limiting their potential use in biomechanics. On the other hand, methods for estimating biomechanically accurate skeletal motion typically rely on complex motion capture systems and expensive optimization methods. What is needed is a parametric 3D human model with a biomechanically accurate skeletal structure that can be easily posed. To that end, we develop SKEL, which re-rigs the SMPL body model with a biomechanics skeleton. To enable this, we need training data of skeletons inside SMPL meshes in diverse poses. We build such a dataset by optimizing biomechanically accurate skeletons inside SMPL meshes from AMASS sequences. We then learn a regressor from SMPL mesh vertices to the optimized joint locations and bone rotations. Finally, we re-parametrize the SMPL mesh with the new kinematic parameters. The resulting SKEL model is animatable like SMPL but with fewer, and biomechanically-realistic, degrees of freedom.
Related BOSS: Bones, Organs and Skin Shape Model · HIT: Estimating Internal Human Implicit Tissues from the Body Surface · Muscles in Time: Learning to Understand Human Motion by Simulating Muscle Activations · Automatic Rigging and Animation of 3D Characters
how to read this
- Category
- Method / model: biomechanically accurate parametric human body
- Contributions
-
- SKEL, re-rigging the SMPL body model with a biomechanically accurate skeleton and anatomically constrained joint axes
- A dataset of biomechanical skeletons fitted inside SMPL meshes across diverse AMASS poses
- A regressor from SMPL vertices to optimized joint locations and bone rotations, yielding an easily posable biomechanical model
- Context
- Re-parameterizes SMPL (loper-smpl-2015) toward biomechanics, bridging graphics body models and biomechanical skeletal modeling so poses respect true joint articulation. Builds on: SMPL: A Skinned Multi-Person Linear Model
- Correctness
- Anatomical accuracy is established by optimizing skeletons inside SMPL meshes and learning a regressor, so SKEL's skeleton is a learned fit to that optimization rather than per-subject medical ground truth; usefulness in biomechanics depends on how well those fits and constrained axes generalize.
- Clarity
- Accessible if you know SMPL; a first pass conveys the re-rigging idea, a second pass for the optimization and regressor construction.
- How to read it
- Focus on the constrained kinematic structure and how it differs from SMPL's simplified joints; a second pass is worth it for the skeleton-in-mesh dataset construction and the regression.
Rigging / Muscles / Skinning
-
Production case study on transitioning a studio to Houdini with KineFX, exploring stylized character animation workflows and practical lessons learned for riggers and FX artists.
Rigging / Retargeting / Motion Synthesis
- GroomGen: A High-Quality Generative Hair Model Using Hierarchical Latent Representations SIGGRAPH Asia Academic 42 cites
First generative model for dense-strand hair using a hierarchical strand-guide-complete VAE architecture; introduces Strands400, a dataset of reconstructed strands across 400 subjects.
abstract
Despite recent successes in hair acquisition that fits a high-dimensional hair model to a specific input subject, generative hair models, which establish general embedding spaces for encoding, editing, and sampling diverse hairstyles, are way less explored. In this paper, we present GroomGen, the first generative model designed for hair geometry composed of highly-detailed dense strands. Our approach is motivated by two key ideas. First, we construct hair latent spaces covering both individual strands and hairstyles. The latent spaces are compact, expressive, and well-constrained for high-quality and diverse sampling. Second, we adopt a hierarchical hair representation that parameterizes a complete hair model to three levels: single strands, sparse guide hairs, and complete dense hairs. This representation is critical to the compactness of latent spaces, the robustness of training, and the efficiency of inference. Based on this hierarchical latent representation, our proposed pipeline consists of a strand-VAE and a hairstyle-VAE that encode an individual strand and a set of guide hairs to their respective latent spaces, and a hybrid densification step that populates sparse guide hairs to a dense hair model.
Related Perm: A Parametric Representation for Multi-Style 3D Hair Modeling · 3D Hair Synthesis Using Volumetric Variational Autoencoders · HAAR: Text-Conditioned Generative Model of 3D Strand-Based Human Hairstyles · Neural Haircut: Prior-Guided Strand-Based Hair Reconstruction
how to read this
- Category
- Method / model: generative model for dense-strand hair
- Contributions
-
- GroomGen, presented as the first generative model for dense-strand hair geometry
- Hierarchical latent spaces over single strands, sparse guide hairs, and complete dense hairs via a strand-VAE and hairstyle-VAE plus densification
- Strands400, a dataset of reconstructed strands across 400 subjects
- Context
- Moves from per-subject hair acquisition toward a general generative embedding space, building on volumetric VAE hair synthesis (saito-vae-hair-2018). Builds on: 3D Hair Synthesis Using Volumetric Variational Autoencoders
- Correctness
- The hierarchical representation is credited with compact latents, robust training, and efficient inference; as a generative model trained on reconstructed strands, output realism and diversity are bounded by the Strands400 data and the densification step that fills sparse guides into dense hair.
- Clarity
- Moderately accessible; a first pass conveys the three-level hierarchy and the encode-edit-sample goal, a second pass for the two VAEs and the densification.
- How to read it
- Focus on the strand-to-guide-to-dense hierarchy and why it makes the latent spaces tractable; a second pass pays off for the VAE designs and the hybrid densification.
CFX / ML Deformation
- talk Grooming and Simulation Methods for Different Hair Types | Andriy Bilichenko | Paris HIVE 2023 Houdini SideFX
Paris HIVE talk on co-designing groom and simulation: how to build hair appearance in Houdini so dynamics enhance the look rather than destroy it across different hair types.
abstract
Andriy Bilichenko presents five downloadable Houdini hair-style presets built on abstract sphere geometry, showing how grooming and simulation must be co-designed rather than handled as separate stages. He works almost entirely from the guide groom node for manual and grouped guide control, then uses Vellum constraints to hold the main hair volume stiff while letting only the tips wave, and tackles self-intersection on long layered hair by wrapping each clump in a proxy tube simulated as Vellum cloth that then deforms the underlying guides and hairs. For curled spiral locks he stresses that the rest-pose groom must differ greatly from the simulated production frame so the look holds its stiffness, and he uses the guide deform node with rest, t-pose and animation inputs to handle pre-roll. He answers questions on close-up detailing, painting velocities for pre-roll, and using compositing as a fallback fix.
Related Hair, Feathers and Fur | Axis Studios | Character FX & Crowds Production Talks · Creatures in Houdini | Ahmed Gharraph | FMX 2019 · Simulating the Perfect Groom for a Bovine Biker | Untold Studios | FMX HIVE 2023 · CFX Cloth and Hair at Unit Image
how to read this
- Category
- Production talk (Houdini hair grooming and simulation)
- Contributions
-
- Demonstrates co-designing groom and simulation so dynamics enhance rather than destroy the hairstyle, across different hair types
- Shares five downloadable Houdini presets on abstract sphere geometry, working mostly from the guide groom node with Vellum constraints
- Shows a proxy-tube Vellum-cloth wrap to control self-intersection on long layered hair, and guide-deform with rest/t-pose/animation inputs for pre-roll
- Context
- A Paris HIVE 2023 Houdini talk on practical groom-and-Vellum workflows, relating to standard guide-groom and Vellum constraint techniques rather than a published method.
- Correctness
- Studio practice, not peer-reviewed; techniques are production-proven in Houdini and tuned per hair type (for example, curled locks need a rest pose very different from the simulated frame), so they are recipes rather than validated general claims.
- Clarity
- Very accessible and hands-on; following along in Houdini with the presets conveys the methods directly.
- How to read it
- Watch with Houdini open and the presets loaded; focus on the guide-groom-plus-Vellum control flow, the proxy-tube self-intersection trick, and the rest-pose-versus-production-frame point for stiff styles.
CFX
-
Text-driven latent diffusion model generating dense strand-based hairstyles in UV space from VQA-annotated synthetic data, upsampling to 100K strands.
abstract
We present HAAR, the first text-conditioned generative model that produces classical 3D strand-based human hairstyles directly usable in computer graphics rendering and simulation pipelines. Unlike prior text-to-3D methods that rely on 2D priors and only recover the outer visible shell, HAAR represents a hairstyle as a UV texture map on the scalp whose pixels encode latent embeddings of individual strands learned by a VAE, and a latent diffusion model generates this map conditioned on a text embedding via cross-attention. To obtain training annotations, we render synthetic hairstyles and use an off-the-shelf visual question answering system together with a custom prompt pipeline to produce hairstyle descriptions. The generated guiding strands are upsampled in latent space by blending nearest neighbor and bilinear interpolation, yielding dense hairstyles in seconds rather than the hours required by SDS-based approaches, and the learned latent space also supports semantic editing of hairstyles from text.
Related Neural Haircut: Prior-Guided Strand-Based Hair Reconstruction · Perm: A Parametric Representation for Multi-Style 3D Hair Modeling · 3D Hair Synthesis Using Volumetric Variational Autoencoders · Neural Strands: Learning Hair Geometry and Appearance from Multi-View Images
how to read this
- Category
- Method: a text-conditioned generative model for strand-based hair
- Contributions
-
- A latent diffusion model that generates classical 3D strand-based hairstyles from text, directly usable in rendering and simulation pipelines
- A UV-space scalp texture whose pixels encode per-strand latent embeddings learned by a VAE, conditioned on text via cross-attention
- A VQA-plus-prompt pipeline to auto-annotate synthetic hairstyles, and a latent upsampling scheme producing 100K-strand grooms in seconds with text-driven editing
- Context
- Builds on prior-guided strand reconstruction (Neural Haircut) and hierarchical generative grooming (GroomGen), but replaces SDS-style 2D priors with a full 3D strand representation. Builds on: Neural Haircut: Prior-Guided Strand-Based Hair Reconstruction · GroomGen: A High-Quality Generative Hair Model Using Hierarchical Latent Representations
- Correctness
- Demonstrated on synthetic hairstyles annotated by an off-the-shelf VQA system, so output diversity and text fidelity are bounded by the synthetic training distribution and the quality of auto-generated captions; treat real-world coverage and caption accuracy as open.
- Clarity
- Accessible at a high level; a first pass conveys the UV-strand-plus-diffusion idea, a second pass is needed for the VAE encoding and latent upsampling math.
- How to read it
- First pass for the representation (UV map of strand latents) and the text-conditioning route; do a second pass on the VAE and upsampling if you care about how dense grooms stay coherent.
CFX / ML Deformation
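The upsampling step described for HAAR, blending nearest-neighbor and bilinear interpolation over the scalp UV map, can be sketched on a toy grid. This is a minimal illustration under assumptions: each texel holds a single scalar rather than a strand latent vector, the fixed `blend` parameter is hypothetical (the abstract does not say how the blend is chosen), and the function names are invented for the example.

```python
def bilinear(grid, u, v):
    """Bilinear interpolation of a 2D grid at continuous coords (u = column, v = row)."""
    rows, cols = len(grid), len(grid[0])
    c0 = min(int(u), cols - 2)
    r0 = min(int(v), rows - 2)
    fu, fv = u - c0, v - r0
    top = (1 - fu) * grid[r0][c0] + fu * grid[r0][c0 + 1]
    bottom = (1 - fu) * grid[r0 + 1][c0] + fu * grid[r0 + 1][c0 + 1]
    return (1 - fv) * top + fv * bottom

def nearest(grid, u, v):
    """Value of the closest texel, rounding half up."""
    rows, cols = len(grid), len(grid[0])
    return grid[min(int(v + 0.5), rows - 1)][min(int(u + 0.5), cols - 1)]

def blended_sample(grid, u, v, blend):
    """blend = 1 gives pure nearest neighbor, blend = 0 gives pure bilinear."""
    return blend * nearest(grid, u, v) + (1 - blend) * bilinear(grid, u, v)

def upsample(grid, factor, blend):
    """Upsample a grid (at least 2x2) by an integer factor, keeping original texels."""
    rows, cols = len(grid), len(grid[0])
    out_rows = (rows - 1) * factor + 1
    out_cols = (cols - 1) * factor + 1
    return [[blended_sample(grid, c / factor, r / factor, blend)
             for c in range(out_cols)]
            for r in range(out_rows)]

# Toy 2x2 "guide" map upsampled to 3x3 with an even blend (hypothetical values).
coarse = [[0.0, 1.0],
          [2.0, 3.0]]
dense = upsample(coarse, factor=2, blend=0.5)
```

Pure bilinear interpolation tends to average neighbors into a smooth result, while pure nearest neighbor copies them exactly; a blend sits between the two, which is one plausible reading of why both are combined.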
-
Live demonstration of MetaHuman Animator's monocular facial performance capture, shown with a Hellblade actor transferring facial animation to a MetaHuman in real time.
Facial / Retargeting
-
First framework to synthesize full-body character motion with 3D objects from text-based intent labels, supporting single and two-handed interactions.
abstract
Can we make virtual characters in a scene interact with their surrounding objects through simple instructions? Is it possible to synthesize such motion plausibly with a diverse set of objects and instructions? Inspired by these questions, we present the first framework to synthesize the full‐body motion of virtual human characters performing specified actions with 3D objects placed within their reach. Our system takes textual instructions specifying the objects and the associated ‘intentions’ of the virtual characters as input and outputs diverse sequences of full‐body motions. This contrasts existing works, where full‐body action synthesis methods generally do not consider object interactions, and human‐object interaction methods focus mainly on synthesizing hand or finger movements for grasping objects. We accomplish our objective by designing an intent‐driven full‐body motion generator, which uses a pair of decoupled conditional variational auto‐regressors to learn the motion of the body parts in an autoregressive manner. We also optimize the 6‐DoF pose of the objects such that they plausibly fit within the hands of the synthesized characters. We compare our proposed method with the existing methods of motion synthesis and establish a new and stronger state‐of‐the‐art for the task of intent‐driven motion synthesis.
Related MotionCLIP: Exposing Human Motion Generation to CLIP Space · T2M-GPT: Generating Human Motion from Textual Descriptions with Discrete Representations · Learning Robust and Scalable Motion Matching with Lipschitz Continuity and Sparse Mixture of Experts · TEMOS: Generating Diverse Human Motions from Textual Descriptions
how to read this
- Category
- Method: intent-driven full-body human-object interaction synthesis
- Contributions
-
- First framework to synthesize full-body character motion interacting with 3D objects from text-based intent labels
- An intent-driven generator using a pair of decoupled conditional variational auto-regressors to learn body-part motion autoregressively
- A 6-DoF object pose optimization so objects plausibly fit within the synthesized character's hands, supporting single and two-handed interactions
- Context
- Relates to character-scene interaction work (Neural State Machine) and bridges full-body action synthesis with hand/finger grasping methods, which it argues had been treated separately. Builds on: Neural State Machine for Character-Scene Interactions
- Correctness
- Plausibility hinges on the decoupled VAE auto-regressors and the post-hoc object-fit optimization; diversity and physical correctness are bounded by the training data and the set of objects/intents used, so contact and penetration quality outside that set is uncertain.
- Clarity
- Readable framing with a clear motivating question; a first pass conveys the pipeline, a second pass is needed for the auto-regressor coupling and object-pose optimization.
- How to read it
- First pass for the problem setup and the decoupled-generator idea; second pass on how object 6-DoF fitting is coupled to the body motion if you plan to reuse the interaction model.
Motion Synthesis
-
Local-global ADMM solver for Discrete Elastic Rods with Coulomb friction, fully parallelized on the GPU, reproducing stick-slip behavior at interactive rates for high-resolution hair.
abstract
We devise a local-global solver dedicated to the simulation of Discrete Elastic Rods (DER) with Coulomb friction that can fully leverage the massively parallel compute capabilities of modern GPUs. We verify that our simulator can reproduce analytical results on recently published cantilever, bend, twist, and stick-slip experiments, while drastically decreasing iteration times for high-resolution hair simulations. Being able to handle contacting assemblies of several thousand elastic rods in real-time, our fast solver paves the way for new workflows such as interactive physics-based editing of digital grooms.
Related Projective Dynamics with Dry Frictional Contact · Anisotropic Elastoplasticity for Cloth, Knit and Hair Frictional Contact · Nonlinear Cloth Simulation with Isogeometric Analysis · Efficient and Stable Approach to Elasticity and Collisions for Hair Animation
how to read this
- Category
- Method: a GPU-parallel solver for hair simulation
- Contributions
-
- A local-global (ADMM) solver for Discrete Elastic Rods with Coulomb friction designed to map onto massively parallel GPU compute
- Validation against analytical results on cantilever, bend, twist, and stick-slip experiments while drastically cutting iteration times
- Real-time handling of contacting assemblies of several thousand elastic rods, enabling interactive physics-based groom editing
- Context
- Builds on Discrete Elastic Rods and on the author's prior hybrid iterative solver for robust Coulomb friction in hair dynamics, recasting it as a parallel local-global scheme. Builds on: A Hybrid Iterative Solver for Robustly Capturing Coulomb Friction in Hair Dynamics
- Correctness
- Correctness is argued via reproduction of published analytical bend/twist/stick-slip results; interactive rates are reported for thousands of rods, so behavior at much higher strand counts or with stiffer materials/time-steps should be checked against the paper's stated regime.
- Clarity
- Concise and concrete; a first pass conveys the goal and validation, a second pass is needed for the ADMM splitting and friction handling.
- How to read it
- First pass for the validation claims and target performance regime; second pass on the local-global formulation and friction treatment if implementing on GPU.
CFX
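For the Discrete Elastic Rods entry above: Coulomb friction requires every contact force to lie inside a cone around the contact normal, and this per-contact check is the kind of independent operation that maps well onto parallel hardware. The following is a minimal sketch of the Euclidean projection onto that cone. The function name and the NumPy formulation are assumptions for illustration, and the abstract does not say whether the paper's local step uses exactly this projection.

```python
import numpy as np

def project_to_coulomb_cone(force, normal, mu):
    """Euclidean projection of a contact force onto the Coulomb cone
    {f : |f_t| <= mu * f_n, f_n >= 0} around the contact normal.
    Hypothetical helper, not taken from the paper."""
    normal = normal / np.linalg.norm(normal)
    f_n = float(force @ normal)          # normal component (scalar)
    f_t = force - f_n * normal           # tangential component (vector)
    t_norm = float(np.linalg.norm(f_t))

    if f_n >= 0.0 and t_norm <= mu * f_n:
        return force.copy()              # already inside the cone: sticking
    if mu * t_norm <= -f_n:
        return np.zeros_like(force)      # inside the polar cone: contact opens
    # Otherwise project onto the cone boundary: sliding
    scale = (mu * t_norm + f_n) / (1.0 + mu * mu)
    return scale * (normal + mu * f_t / t_norm)

z_axis = np.array([0.0, 0.0, 1.0])
sliding = project_to_coulomb_cone(np.array([3.0, 0.0, 1.0]), z_axis, mu=0.5)
# The result sits on the cone boundary: tangential norm equals mu times the normal part.
```

Each contact is projected independently of the others, which is why a step like this can be run as one thread per contact.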
-
Combines a pose-conditioned invertible network with differentiable LBS to animate implicit surfaces while preserving surface correspondences, outperforming state-of-the-art reposing methods.
abstract
Building animatable and editable models of clothed humans from raw 3D scans and poses is a challenging problem. Existing reposing methods suffer from the limited expressiveness of Linear Blend Skinning (LBS), require costly mesh extraction to generate each new pose, and typically do not preserve surface correspondences across different poses. In this work, we introduce Invertible Neural Skinning (INS) to address these shortcomings. To maintain correspondences, we propose a Pose-conditioned Invertible Network (PIN) architecture, which extends the LBS process by learning additional pose-varying deformations. Next, we combine PIN with a differentiable LBS module to build an expressive and end-to-end Invertible Neural Skinning (INS) pipeline. We demonstrate the strong performance of our method by outperforming the state-of-the-art reposing techniques on clothed humans and preserving surface correspondences, while being an order of magnitude faster. We also perform an ablation study, which shows the usefulness of our pose-conditioning formulation, and our qualitative results display that INS can rectify artefacts introduced by LBS well.
Related SNARF: Differentiable Forward Skinning for Animating Non-Rigid Neural Implicit Shapes · Animatable Neural Radiance Fields for Modeling Dynamic Human Bodies · 3DGS-Avatar: Animatable Avatars via Deformable 3D Gaussian Splatting · SCANimate: Weakly Supervised Learning of Skinned Clothed Avatar Networks
how to read this
- Category
- Method: an invertible neural skinning pipeline for implicit surfaces
- Contributions
-
- A Pose-conditioned Invertible Network (PIN) that extends LBS with learned pose-varying deformations while maintaining surface correspondences
- An end-to-end Invertible Neural Skinning pipeline combining PIN with a differentiable LBS module, avoiding costly per-pose mesh extraction
- Reported state-of-the-art reposing quality on clothed humans while running roughly an order of magnitude faster, with an ablation on the pose-conditioning
- Context
- Builds on differentiable forward skinning of neural implicit shapes (SNARF) and on classic Linear Blend Skinning, targeting LBS's limited expressiveness and lack of correspondence. Builds on: SNARF: Differentiable Forward Skinning for Animating Non-Rigid Neural Implicit Shapes
- Correctness
- Validated on clothed-human reposing from 3D scans and poses; gains in speed and correspondence rest on the invertibility of PIN, so generalization to poses far from training and to very loose garments is a reader caution.
- Clarity
- Moderately technical; a first pass conveys the invertible-skinning idea, a second pass is needed for the network invertibility and its coupling with LBS.
- How to read it
- First pass for why correspondence and mesh-extraction cost matter; second pass on the PIN architecture and the differentiable-LBS coupling if you intend to reimplement.
Skinning / ML Deformation
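For the Invertible Neural Skinning entry above: the baseline that PIN extends is Linear Blend Skinning, where each point is moved by a weighted blend of bone transforms. The sketch below shows plain LBS and its inverse with NumPy. All function names are illustrative, the learned pose-varying deformation of PIN is not reproduced, and the inverse assumes the skinning weights of each posed point are already known, which is the easy case.

```python
import numpy as np

def blend_transforms(weights, bone_transforms):
    """weights: (N, B) with rows summing to 1; bone_transforms: (B, 4, 4).
    Returns the per-point blended transform, shape (N, 4, 4)."""
    return np.einsum("nb,bij->nij", weights, bone_transforms)

def to_homogeneous(points):
    return np.concatenate([points, np.ones((len(points), 1))], axis=1)

def lbs_forward(points, weights, bone_transforms):
    """Canonical -> posed: x' = (sum_b w_b T_b) x."""
    blended = blend_transforms(weights, bone_transforms)
    return np.einsum("nij,nj->ni", blended, to_homogeneous(points))[:, :3]

def lbs_inverse(posed, weights, bone_transforms):
    """Posed -> canonical by solving the blended 4x4 system per point.
    Only valid when each posed point carries its own known weights."""
    blended = blend_transforms(weights, bone_transforms)
    rhs = to_homogeneous(posed)[..., None]          # (N, 4, 1)
    return np.linalg.solve(blended, rhs)[..., 0][:, :3]
```

A round trip `lbs_inverse(lbs_forward(p, w, T), w, T)` recovers `p` whenever the blended matrices are invertible. When weights are defined only in canonical space, as is common for implicit surfaces, that inverse is no longer available in closed form, which is part of what makes an invertible architecture attractive for keeping correspondences.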
-
Santa Monica Studio described a systematic joint-based rig following muscle fiber curves for volume preservation, plus dynamic bone techniques approximating muscle, fat, and skin jiggle in combat.
Skinning / Muscles
-
Neural parametric head model using an ensemble of local MLPs in canonical SDF space with forward deformation field, trained on 5,200+ scans from 255 subjects.
abstract
We propose a novel 3D morphable model for complete human heads based on hybrid neural fields. At the core of our model lies a neural parametric representation that disentangles identity and expressions in disjoint latent spaces. To this end, we capture a person's identity in a canonical space as a signed distance field (SDF), and model facial expressions with a neural deformation field. In addition, our representation achieves high-fidelity local detail by introducing an ensemble of local fields centered around facial anchor points. To facilitate generalization, we train our model on a newly-captured dataset of over 3700 head scans from 203 different identities using a custom high-end 3D scanning setup. Our dataset significantly exceeds comparable existing datasets, both with respect to quality and completeness of geometry, averaging around 3.5M mesh faces per scan. We will publicly release our dataset along with a public benchmark for both neural head avatar construction as well as an evaluation on a hidden test-set for inference-time fitting. Finally, we demonstrate that our approach outperforms state-of-the-art methods in terms of fitting error and reconstruction quality.
Related PointAvatar: Deformable Point-Based Head Avatars from Videos · i3DMM: Deep Implicit 3D Morphable Model of Human Heads · EMOCA: Emotion Driven Monocular Face Capture and Animation · I M Avatar: Implicit Morphable Head Avatars from Videos
how to read this
- Category
- Method plus dataset: a neural parametric head model
- Contributions
-
- A neural 3D morphable head model disentangling identity (a canonical SDF) and expression (a neural deformation field) in separate latent spaces
- An ensemble of local fields anchored at facial points to capture high-fidelity local detail
- A newly captured high-resolution head-scan dataset and benchmark, with reported state-of-the-art fitting error and reconstruction quality
- Context
- Extends implicit 3D morphable head modeling (i3DMM) by moving to hybrid neural fields with local anchor-based detail rather than a single global field. Builds on: i3DMM: Deep Implicit 3D Morphable Model of Human Heads
- Correctness
- Trained on a large custom multi-subject scan set captured with a high-end rig; the note and abstract give slightly different scan/subject counts, so verify exact dataset numbers in the paper, and expect fitting quality to reflect the capture rig's distribution.
- Clarity
- Clearly structured; a first pass conveys the identity/expression disentanglement and local-field idea, a second pass is needed for the SDF and deformation-field formulation.
- How to read it
- First pass for the representation and the dataset/benchmark contribution; second pass on the local-field ensemble and canonical-space training if fitting your own heads.
Facial
- Learning Robust and Scalable Motion Matching with Lipschitz Continuity and Sparse Mixture of Experts MIG Industrial 2 cites
Motion matching approach with Lipschitz-constrained networks and sparse mixture of experts for robust, scalable learned motion synthesis.
abstract
Motion matching (Büttner and Clavet [2015]; Clavet [2016]) has become a widely adopted technique for generating high-quality interactive animation systems in video games. However, its current implementations suffer from significant computational and memory resource overheads, limiting its scalability in the context of modern video game performance profiles. "Learned Motion Matching" [Holden et al. 2020] mitigated some of these challenges; however, whilst reducing memory requirements, it resulted in increases in performance costs. In this paper, we propose a novel method for learning motion matching that combines a Sparse Mixture of Experts model architecture and a Lipschitz-continuous latent space for representation of poses. This approach significantly reduces the computational complexity of the models, while simultaneously improving the compactness of the data that can be stored and the robustness of pose output. As a result, our method enables the efficient execution of motion matching that significantly outperforms other implementations for large character counts, by 8.5x in CPU execution cost and at 80% of the memory requirements of "Learned Motion Matching", on contemporary video game hardware, thereby enhancing its practical applicability and scalability in the gaming industry.
Related Learned Motion Matching · Example-based Motion Synthesis via Generative Motion Matching · Real-Time Diverse Motion In-Betweening with Space-Time Control · Taming Diffusion Probabilistic Models for Character Control
how to read this
- Category
- Method: a learned motion matching architecture for games
- Contributions
-
- A learned motion matching method combining a Sparse Mixture of Experts model with a Lipschitz-continuous latent pose space
- Reduced computational complexity and more compact stored data while improving robustness of pose output
- Reported large efficiency gains over prior implementations for high character counts in CPU cost and memory
- Context
- Builds directly on Learned Motion Matching (which reduced memory but raised runtime cost) and on the original motion matching technique from games practice. Builds on: Learned Motion Matching
- Correctness
- Performance claims (e.g. large CPU speedups, reduced memory at high character counts) are stated relative to specific baselines and hardware profiles; treat the headline multipliers as conditional on that comparison setup, and watch the trade-off between motion quality and those gains.
- Clarity
- Game-systems oriented and accessible; a first pass conveys the SMoE-plus-Lipschitz idea, a second pass is needed for how Lipschitz continuity is enforced and measured.
- How to read it
- First pass for the motivation (scalability of motion matching) and the architectural recipe; second pass on the Lipschitz constraint and expert routing if you target large crowds at runtime.
Motion Synthesis
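For the motion matching entry above: the reading note asks how Lipschitz continuity is enforced and measured. One generic way to reason about it, shown here as a sketch and not as the paper's mechanism, is the product of layer spectral norms, which upper-bounds the Lipschitz constant of an MLP whose activations are themselves 1-Lipschitz (ReLU, tanh). Both function names are hypothetical.

```python
import numpy as np

def lipschitz_upper_bound(weight_matrices):
    """Product of spectral norms: an upper bound on the L2 Lipschitz constant
    of an MLP with 1-Lipschitz activations. Biases do not affect the bound."""
    bound = 1.0
    for w in weight_matrices:
        bound *= np.linalg.norm(w, ord=2)   # largest singular value of the layer
    return bound

def rescale_to_lipschitz(weight_matrices, target=1.0):
    """Shrink layers whose spectral norm exceeds target**(1/depth), so the
    product bound of the rescaled network is at most `target`."""
    per_layer = target ** (1.0 / len(weight_matrices))
    rescaled = []
    for w in weight_matrices:
        sigma = max(np.linalg.norm(w, ord=2), 1e-12)   # guard against zero layers
        rescaled.append(w * min(1.0, per_layer / sigma))
    return rescaled
```

The bound is usually loose, but it is cheap to compute and gives a guarantee: two nearby latent poses cannot decode to outputs further apart than the bound times their distance.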
-
Improved graph neural network approach for cloth simulation with better generalization to unseen garment shapes and body poses.
abstract
Deep learning-based cloth simulation approaches have potential in achieving real-time simulation of complex cloth by directly learning a mapping from control input to resulting cloth movement, bypassing the need for time-consuming dynamic solving and collision processing. Recent advancements have demonstrated the effectiveness of Graph Neural Networks (GNN) in learning cloth dynamics. However, existing GNN-based models have limitations in predicting scenarios involving complex cloth movement. To overcome this limitation, we propose a novel GNN-based model that incorporates several components, including RNN-based state encoding and physics-informed features. Our model significantly improves the accuracy of cloth dynamics prediction in various scenarios, including those with complex cloth movement driven by control handles. Furthermore, our model demonstrates generalization capabilities for cloth mesh topology and control handle configurations. We validate the effectiveness of our approach through ablation studies and comparisons with a baseline model.
Related D-Cloth: Skinning-based Cloth Dynamic Prediction with a Three-stage Network · Dynamic Deformables: Implementation and Production Practicalities · Cloth and Skin Deformation with a Triangle Mesh Based Convolutional Neural Network · Detail-Aware Deep Clothing Animations Infused with Multi-Source Attributes
how to read this
- Category
- Method: a GNN-based learned cloth simulator
- Contributions
-
- A GNN cloth model adding RNN-based state encoding and physics-informed features to improve prediction of complex cloth movement
- Improved accuracy on scenarios with control-handle-driven cloth motion versus a baseline
- Demonstrated generalization across cloth mesh topology and control-handle configurations, supported by ablations
- Context
- Builds on graph-neural-network cloth dynamics and neural cloth simulation, targeting the weak generalization of prior GNN models on complex motion. Builds on: Neural Cloth Simulation
- Correctness
- Generalization claims rest on the added recurrent encoding and physics-informed inputs and are evaluated against a baseline via ablation; coverage of garment shapes, poses, and handle setups outside the tested range remains the key uncertainty.
- Clarity
- Reasonably accessible; a first pass conveys which components were added and why, a second pass is needed for the message-passing and state-encoding details.
- How to read it
- First pass for the added components and the generalization framing; second pass on the RNN state encoding and physics-informed features if you plan to extend the GNN.
CFX / ML Deformation
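For the GNN cloth entry above: the message-passing details flagged for a second pass follow a common pattern in graph networks, where each mesh edge computes a message from its two endpoint features and each vertex sums what it receives. Below is a minimal NumPy sketch of one such round. The weight matrices `w_msg` and `w_upd` stand in for learned parameters and are hypothetical, and the paper's RNN state encoding and physics-informed features are not included.

```python
import numpy as np

def message_passing_step(node_feats, edges, w_msg, w_upd):
    """One round of message passing on a cloth mesh graph.
    node_feats: (N, F) per-vertex features.
    edges: (E, 2) integer array of directed (sender, receiver) pairs.
    w_msg: (2F, H) and w_upd: (F + H, F) are placeholder learned weights."""
    senders, receivers = edges[:, 0], edges[:, 1]
    # Build one message per edge from the concatenated endpoint features
    edge_in = np.concatenate([node_feats[senders], node_feats[receivers]], axis=1)
    messages = np.maximum(edge_in @ w_msg, 0.0)      # ReLU
    # Sum incoming messages at every receiving vertex
    aggregated = np.zeros((node_feats.shape[0], messages.shape[1]))
    np.add.at(aggregated, receivers, messages)
    # Residual update of the vertex features
    update_in = np.concatenate([node_feats, aggregated], axis=1)
    return node_feats + update_in @ w_upd
```

Because the same weights are applied on every edge and vertex, the step does not depend on the number of vertices or on mesh connectivity, which is the property that makes generalization across mesh topologies plausible in the first place.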
- talk MetaHuman Framework & Machine Learning for Next-Gen Character Deformation | GDC 2023 Unreal Epic
End-to-end walkthrough of the ML Deformer framework, showing how muscle, flesh, and cloth simulation training data drives real-time deformation for MetaHuman characters.
ML Deformation / Facial / Skinning
-
Epic Games demonstrated end-to-end MetaHuman assembly and the UE5 ML Deformer workflow, training networks on full muscle, flesh, and cloth simulation data to approximate high-fidelity deformation at runtime.
abstract
Epic Games walks through the MetaHuman framework and a UE5 ML Deformer workflow for high-fidelity real-time body deformation. Rafael covers MetaHuman Creator, mesh to MetaHuman, the MetaHuman DNA format with DNA Calib, Rig Logic facial rigging in semantic space, and MetaHuman Animator for footage-based facial performance capture. Matt then shows training neural networks on full muscle, flesh and cloth simulation generated in Houdini for an anatomically correct digidouble (built from 3D scans and MRI), compressing dynamic sim deltas into learned morph targets that run in real time on PS5. He details the neural morph model (local versus global, bone associations) for flesh and the nearest neighbor model with PCA and K-means pose generation for cloth, plus training range-of-motion authoring, twist joints, and the in-editor training and testing UI.
Related Speed Up Animation Workflows With Maya's ML Deformer, Powered by Autodesk AI · FaceBaker: Baking Character Facial Rigs with Machine Learning · Motion Mastery: 4D Facial Performance Capture, KineFX and PDG-Powered MoCap Pipeline · Animatomy: An Animator-Centric, Anatomically Inspired System for 3D Facial Modeling, Animation and Transfer
how to read this
- Category
- Production talk: MetaHuman framework and a UE5 ML Deformer workflow
- Contributions
-
- Walkthrough of the end-to-end MetaHuman framework: Creator, mesh-to-MetaHuman, the DNA format with DNA Calib, Rig Logic facial rigging, and MetaHuman Animator
- Demonstrates training neural networks on full muscle, flesh, and cloth simulation (authored in Houdini) to compress dynamic sim deltas into learned morph targets running in real time on PS5
- Details a neural morph model (local vs global, bone associations) for flesh and a nearest-neighbor PCA/K-means model for cloth, plus range-of-motion authoring and in-editor training/testing UI
- Context
- Builds on deformation-approximation work such as Fast and Deep Deformation Approximations, productionizing learned deformation inside Unreal Engine 5. Builds on: Fast and Deep Deformation Approximations
- Correctness
- Studio practice, not peer-reviewed; results are production-proven on a specific anatomically correct digidouble (scans plus MRI) and target hardware (PS5), so numbers and quality reflect that pipeline rather than a controlled benchmark.
- Clarity
- Practitioner-facing and concrete; a single pass conveys the workflow, rewatch specific segments for the ML Deformer model details.
- How to read it
- Watch once end-to-end for the toolchain map, then revisit the ML Deformer (flesh neural morph and cloth PCA/K-means) and ROM-authoring sections if building a similar real-time deformation pipeline.
ML Deformation / Skinning / Facial
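For the ML Deformer talk above: the idea of compressing simulation deltas into a small set of morph targets can be illustrated with plain PCA. This is a sketch under that assumption only. The talk's neural morph model learns its targets with a network and is not reproduced here, and the function names are hypothetical.

```python
import numpy as np

def compress_deltas(deltas, num_targets):
    """deltas: (frames, 3 * verts) offsets of simulated minus skinned positions.
    Returns the mean delta (3V,), morph targets (K, 3V), and per-frame
    morph weights (frames, K) from a PCA computed via SVD."""
    mean = deltas.mean(axis=0)
    centered = deltas - mean
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    targets = vt[:num_targets]            # orthonormal rows, strongest first
    weights = centered @ targets.T        # projection onto each target
    return mean, targets, weights

def reconstruct(mean, targets, weights):
    """Runtime side: a weighted sum of morph targets added to the mean delta."""
    return mean + weights @ targets
```

At runtime only the K weights per frame need to be predicted, after which the deformation is an ordinary morph-target blend. That is the reason this family of approaches fits real-time budgets.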
- talk Motion Mastery: 4D Facial Performance Capture, KineFX and PDG-Powered MoCap Pipeline FMX Industrial
3SISTERS studio presents a 4D volumetric capture pipeline for creating Sharon Stone's digital double, coupled with a PDG and KineFX-driven mocap retargeting workflow in Houdini.
abstract
3SISTERS presents two character pipelines built around Houdini KineFX and PDG. The first covers a digital double of Sharon Stone using photometric facial scans and 4D performance capture at 60 scans per second (over 80TB of raw data), with topology made consistent across frames via proprietary Faceform Wrap topotransfer, plus three-point face stabilization, separate eye and lower-teeth tracking, a Python shot importer, and Houdini-based grooming for hair, brows, eyelashes and peach fuzz. The second walks through an Unreal Engine cinematic using heavily modified MetaHumans, with Xsens-recorded body animation processed and procedurally retargeted in a custom KineFX/PDG tool that filters takes, sets ranges and exports at scale. Prop and instrument rigging uses auto-skeleton aim setups, corrective joint blendshapes fix folds at the hands and knees, cloth is simulated in Vellum with Marvelous Designer material-attribute splitting and PDG-driven film constraints, and JSON metadata carries Houdini layout transforms into Unreal level sequences.
Related MetaHuman Framework and Machine Learning for Next-Gen Character Deformation · Creatures in Houdini | Ahmed Gharraph | FMX 2019 · Simulating the Perfect Groom for a Bovine Biker | Untold Studios | FMX HIVE 2023 · Grooming and Simulation Methods for Different Hair Types | Andriy Bilichenko | Paris HIVE 2023
how to read this
- Category
- Production talk: 4D capture and a KineFX/PDG mocap pipeline
- Contributions
-
- Demonstrates a digital-double pipeline using photometric facial scans and 4D performance capture, with cross-frame topology made consistent via Faceform Wrap topotransfer plus face stabilization and separate eye/teeth tracking
- Shows a procedural KineFX/PDG mocap retargeting tool for Xsens body data that filters takes, sets ranges, and exports at scale for an Unreal cinematic using modified MetaHumans
- Covers prop/instrument rigging with auto aim setups, corrective joint blendshapes for hand/knee folds, Vellum cloth from Marvelous Designer, and JSON-carried layout metadata into Unreal
- Context
- Relates to anatomically-constrained facial performance retargeting (Chandran et al.), applied within a Houdini KineFX and PDG production setting. Builds on: Local Anatomically-Constrained Facial Performance Retargeting
- Correctness
- Studio practice, not peer-reviewed; results are production-proven on specific shows (a Sharon Stone digital double and an Unreal cinematic) at large data scales, so the workflow reflects those projects rather than a general benchmark.
- Clarity
- Pipeline-heavy and tool-specific; a single pass conveys the two workflows, rewatch for the topotransfer and PDG retargeting specifics.
- How to read it
- Watch once for the overall 4D-capture and retargeting pipeline shape, then revisit the topology-consistency and KineFX/PDG retargeting segments if you work in Houdini-to-Unreal character pipelines.
Retargeting / Facial
-
Integrates fatigue dynamics into a Hill-type muscle model and trains a VAE motion controller from large unstructured datasets.
abstract
In this paper, we present a simulation and control framework for generating biomechanically plausible motion for muscle-actuated characters. We incorporate a fatigue dynamics model, the 3CC-r model, into the widely-adopted Hill-type muscle model to simulate the development and recovery of fatigue in muscles, which creates a natural evolution of motion style caused by the accumulation of fatigue from prolonged activities. To address the challenging problem of controlling a musculoskeletal system with high degrees of freedom, we propose a novel muscle-space control strategy based on PD control. Our simulation and control framework facilitates the training of a generative model for muscle-based motion control, which we refer to as MuscleVAE. By leveraging the variational autoencoders (VAEs), MuscleVAE is capable of learning a rich and flexible latent representation of skills from a large unstructured motion dataset, encoding not only motion features but also muscle control and fatigue properties. We demonstrate that the MuscleVAE model can be efficiently trained using a model-based approach, resulting in the production of high-fidelity motions and enabling a variety of downstream tasks.
Related ControlVAE: Model-Based Learning of Generative Controllers for Physics-Based Characters · Generative GaitNet · ReGAIL: Toward Agile Character Control From a Single Reference Motion · MoConVQ: Unified Physics-Based Motion Control via Scalable Discrete Representations
how to read this
- Category
- Method: a model-based controller for muscle-actuated characters
- Contributions
-
- A simulation/control framework integrating the 3CC-r fatigue dynamics model into a Hill-type muscle model to evolve motion style as fatigue accumulates
- A muscle-space control strategy based on PD control to handle the high-DoF musculoskeletal system
- MuscleVAE, a VAE trained model-based from large unstructured motion data, encoding motion plus muscle control and fatigue, enabling high-fidelity motions and downstream tasks
- Context
- Builds on scalable muscle-actuated human simulation and control (Lee et al. 2019), adding fatigue dynamics and a learned generative controller. Builds on: Scalable Muscle-Actuated Human Simulation and Control
- Correctness
- Plausibility rests on the Hill-type-plus-3CC-r muscle/fatigue model and a model-based VAE training scheme; results are demonstrated in simulation, so biomechanical realism is bounded by the muscle model assumptions and the unstructured dataset used.
- Clarity
- Technical but well-motivated; a first pass conveys the fatigue-plus-VAE idea, a second pass is needed for the muscle-space PD control and model-based training.
- How to read it
- First pass for the fatigue-model integration and what MuscleVAE encodes; second pass on the muscle-space control and model-based VAE training if simulating musculoskeletal characters.
Muscles / Motion Synthesis
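For the MuscleVAE entry above: the 3CC-r fatigue model tracks three compartments of motor units (active, fatigued, resting) that exchange mass over time. The sketch below follows the commonly cited form of the three-compartment controller with a rest-recovery multiplier `r`. The rate constants are illustrative placeholders and not values from the paper, the explicit Euler integration is a simplification, and the exact variant used should be checked against the paper.

```python
from dataclasses import dataclass

@dataclass
class FatigueState:
    active: float = 0.0     # M_A: fraction of motor units currently active
    fatigued: float = 0.0   # M_F: fraction that is fatigued
    resting: float = 1.0    # M_R: fraction that is rested and available

def step_3cc_r(state, target_load, dt, F=0.01, R=0.002, r=10.0, LD=10.0, LR=10.0):
    """One explicit Euler step. F, R, r, LD, LR are placeholder constants:
    fatigue rate, recovery rate, rest multiplier, and activation/relaxation gains."""
    ma, mf, mr = state.active, state.fatigued, state.resting
    if ma < target_load:
        # Recruit toward the target, limited by what is still resting
        if mr > target_load - ma:
            drive = LD * (target_load - ma)
        else:
            drive = LD * mr
        recovery = R
    else:
        # At or above target: relax, and recover faster (the "-r" extension)
        drive = LR * (target_load - ma)
        recovery = r * R
    d_active = drive - F * ma
    d_fatigued = F * ma - recovery * mf
    d_resting = -drive + recovery * mf
    return FatigueState(ma + dt * d_active, mf + dt * d_fatigued, mr + dt * d_resting)
```

The three derivatives sum to zero, so the compartments always add up to the same total. Holding a nonzero `target_load` gradually moves mass into the fatigued compartment, which is the mechanism behind the "evolution of motion style" the abstract describes.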
-
Dynamic neural radiance fields via hash ensemble deformation for high-fidelity head reconstruction, paired with a 4,700+ sequence multi-view capture dataset.
abstract
We focus on reconstructing high-fidelity radiance fields of human heads, capturing their animations over time, and synthesizing re-renderings from novel viewpoints at arbitrary time steps. To this end, we propose a new multi-view capture setup composed of 16 calibrated machine vision cameras that record time-synchronized images at 7.1 MP resolution and 73 frames per second. With our setup, we collect a new dataset of over 4700 high-resolution, high-framerate sequences of more than 220 human heads, from which we introduce a new human head reconstruction benchmark. The recorded sequences cover a wide range of facial dynamics, including head motions, natural expressions, emotions, and spoken language. In order to reconstruct high-fidelity human heads, we propose Dynamic Neural Radiance Fields using Hash Ensembles (NeRSemble). We represent scene dynamics by combining a deformation field and an ensemble of 3D multi-resolution hash encodings. The deformation field allows for precise modeling of simple scene movements, while the ensemble of hash encodings helps to represent complex dynamics. As a result, we obtain radiance field representations of human heads that capture motion over time and facilitate re-rendering of arbitrary novel viewpoints.
Related MoRF: Morphable Radiance Fields for Multiview Neural Head Modeling · PointAvatar: Deformable Point-Based Head Avatars from Videos · Animatable Neural Radiance Fields for Modeling Dynamic Human Bodies · I M Avatar: Implicit Morphable Head Avatars from Videos
how to read this
- Category
- Method plus dataset/benchmark: dynamic neural radiance fields for human heads
- Contributions
-
- A multi-view capture rig (16 calibrated cameras, 7.1 MP, 73 fps) and a dataset of 4,700+ sequences across 220+ subjects, released as a head reconstruction benchmark
- NeRSemble, a dynamic NeRF that pairs a deformation field with an ensemble of multi-resolution hash encodings to model both simple motion and complex dynamics
- Novel-view, novel-time re-rendering of high-fidelity animated heads
- Context
- Sits in the neural avatar lineage building on monocular approaches such as Neural Head Avatars (Grassal 2022), pushing toward dense multi-view radiance-field capture. Builds on: Neural Head Avatars from Monocular RGB Videos
- Correctness
- Validated on the authors' own controlled multi-view capture; results depend on a 16-camera calibrated studio, so quality and applicability to casual or monocular inputs are not claimed.
- Clarity
- Accessible at a high level; a first pass conveys the capture-plus-hash-ensemble idea, a second pass is needed for the deformation and hash-encoding formulation.
- How to read it
- First pass for the dataset/benchmark value and the hash-ensemble concept; second pass on the deformation-field plus hash-encoding combination if you intend to reproduce or build on the representation.
Facial
- Neural Face Rigging for Animating and Retargeting Facial Meshes in the Wild SIGGRAPH Academic 37 cites
End-to-end deep learning method for automatic rigging and retargeting of arbitrary in-the-wild facial meshes without manual blendshape creation.
abstract
We propose an end-to-end deep-learning approach for automatic rigging and retargeting of 3D models of human faces in the wild. Our approach, called Neural Face Rigging (NFR), holds three key properties: (i) NFR’s expression space maintains human-interpretable editing parameters for artistic controls; (ii) NFR is readily applicable to arbitrary facial meshes with different connectivity and expressions; (iii) NFR can encode and produce fine-grained details of complex expressions performed by arbitrary subjects. To the best of our knowledge, NFR is the first approach to provide realistic and controllable deformations of in-the-wild facial meshes, without the manual creation of blendshapes or correspondence. We design a deformation autoencoder and train it through a multi-dataset training scheme, which benefits from the unique advantages of two data sources: a linear 3DMM with interpretable control parameters as in FACS and 4D captures of real faces with fine-grained details. Through various experiments, we show NFR’s ability to automatically produce realistic and accurate facial deformations across a wide range of existing datasets and noisy facial scans in-the-wild, while providing artist-controlled, editable parameters.
Related Facial Retargeting with Automatic Range of Motion Alignment · Animating Facial Expressions · Transferring the Rig and Animations from a Character to Different Face Models · RigAnyFace: Scaling Neural Facial Mesh Auto-Rigging with Unlabeled Data
how to read this
- Category
- Method: deep-learning facial rigging and retargeting
- Contributions
-
- NFR, an end-to-end approach that auto-rigs and retargets arbitrary in-the-wild facial meshes without manual blendshapes or correspondence
- A deformation autoencoder with human-interpretable, FACS-like editing parameters that applies across meshes of differing connectivity
- A multi-dataset training scheme combining a linear 3DMM (interpretable controls) with 4D real-face captures (fine detail)
- Context
- Relates to single-scan rig generation work such as Li et al. Dynamic Facial Asset and Rig Generation (2020), aiming to remove manual blendshape/correspondence setup. Builds on: Dynamic Facial Asset and Rig Generation from a Single Scan
- Correctness
- Demonstrated on existing facial datasets and noisy in-the-wild scans; interpretability and generalization rest on the 3DMM-plus-4D training mix, so behavior outside those distributions should be checked.
- Clarity
- Reasonably accessible; a first pass conveys the rig-free retargeting goal, a second pass clarifies the autoencoder and multi-dataset training design.
- How to read it
- First pass to grasp the three claimed properties (interpretable controls, arbitrary meshes, fine detail); second pass on the autoencoder architecture and training scheme if rigging/retargeting is your focus.
Facial / Rigging / Retargeting
-
Two-stage strand-accurate reconstruction from monocular video or multi-view images: coarse volumetric orientation then strand optimization with learned prior.
abstract
Generating realistic human 3D reconstructions using image or video data is essential for various communication and entertainment applications. While existing methods achieved impressive results for body and facial regions, realistic hair modeling still remains challenging due to its high mechanical complexity. This work proposes an approach capable of accurate hair geometry reconstruction at a strand level from a monocular video or multi-view images captured in uncontrolled lighting conditions. Our method has two stages, with the first stage performing joint reconstruction of coarse hair and bust shapes and hair orientation using implicit volumetric representations. The second stage then estimates a strand-level hair reconstruction by reconciling in a single optimization process the coarse volumetric constraints with hair strand and hairstyle priors learned from the synthetic data. To further increase the reconstruction fidelity, we incorporate image-based losses into the fitting process using a new differentiable renderer. The combined system, named Neural Haircut, achieves high realism and personalization of the reconstructed hairstyles. For video results, please refer to our project page.
Related HAAR: Text-Conditioned Generative Model of 3D Strand-Based Human Hairstyles · Neural Strands: Learning Hair Geometry and Appearance from Multi-View Images · 3D Hair Synthesis Using Volumetric Variational Autoencoders · Dynamic Hair Modeling from Monocular Videos Using Deep Neural Networks
how to read this
- Category
- Method: strand-based hair reconstruction from images/video
- Contributions
-
- Strand-level hair geometry reconstruction from monocular video or multi-view images captured in uncontrolled lighting
- A two-stage pipeline: coarse joint hair/bust shape and orientation via implicit volumes, then strand optimization reconciled with priors learned from synthetic data
- A differentiable renderer with image-based losses to raise reconstruction fidelity
- Context
- Builds on learned strand-based hair modeling such as Neural Strands (Rosu 2022), extending toward in-the-wild monocular capture. Builds on: Neural Strands: Learning Hair Geometry and Appearance from Multi-View Images
- Correctness
- Demonstrated on monocular and multi-view captures; strand fidelity relies on hairstyle priors learned from synthetic data, so out-of-distribution or heavily occluded styles may be a limitation.
- Clarity
- Moderately technical; a first pass conveys the coarse-then-strand structure, a second pass is needed for the optimization and prior formulation.
- How to read it
- First pass for the two-stage strategy and where the synthetic prior enters; second pass on the optimization and differentiable rendering losses if reconstructing hair yourself.
CFX / ML Deformation
-
Reconstructs a high-fidelity 3D facial avatar from a single source image using NeRF, enabling photo-realistic reenactment driven by arbitrary target faces.
abstract
3D facial avatar reconstruction has been a significant research topic in computer graphics and computer vision, where photo-realistic rendering and flexible controls over poses and expressions are necessary for many related applications. Recently, its performance has been greatly improved with the development of neural radiance fields (NeRF). However, most existing NeRF-based facial avatars focus on subject-specific reconstruction and reenactment, requiring multi-shot images containing different views of the specific subject for training, and the learned model cannot generalize to new identities, limiting its further applications. In this work, we propose a one-shot 3D facial avatar reconstruction framework that only requires a single source image to reconstruct a high-fidelity 3D facial avatar. For the challenges of lacking generalization ability and missing multi-view information, we leverage the generative prior of 3D GAN and develop an efficient encoder-decoder network to reconstruct the canonical neural volume of the source image, and further propose a compensation network to complement facial details. To enable fine-grained control over facial dynamics, we propose a deformation field to warp the canonical volume into driven expressions. Through extensive experimental comparisons, we achieve superior synthesis results compared to several state-of-the-art methods.
Related Codec Avatars: Photorealistic Telepresence at Scale · Neural Head Avatars from Monocular RGB Videos · Animatable Neural Radiance Fields for Modeling Dynamic Human Bodies · MoRF: Morphable Radiance Fields for Multiview Neural Head Modeling
how to read this
- Category
- Method: one-shot NeRF-based facial avatar reconstruction
- Contributions
-
- A framework that reconstructs a high-fidelity 3D facial avatar from a single source image
- Use of a 3D GAN generative prior with an encoder-decoder to recover a canonical neural volume, plus a compensation network for facial detail
- A deformation component enabling fine-grained pose/expression control and reenactment driven by arbitrary target faces
- Context
- Relates to animatable detailed face models from in-the-wild images such as DECA (Feng 2021), addressing the subject-specific, multi-shot limitation of prior NeRF avatars. Builds on: Learning an Animatable Detailed 3D Face Model from In-The-Wild Images
- Correctness
- Tackles the lack of multi-view information from a single image by leaning on a 3D-GAN prior; one-shot quality is therefore bounded by that prior's coverage, so treat results on identities far from its distribution with caution.
- Clarity
- Accessible in motivation; a first pass conveys the one-shot pipeline, a second pass clarifies the canonical-volume and compensation-network mechanics.
- How to read it
- First pass for how the GAN prior plus compensation network enable single-image avatars; second pass on the deformation/control module if you need driven reenactment.
Facial
- Objective Evaluation Metric for Motion Generative Models: Validating Frechet Motion Distance MIG Academic
Validates Fréchet Motion Distance as an objective metric for evaluating motion generative models by testing its sensitivity to artifacts such as foot skating.
abstract
Proposes Fréchet Motion Distance (FMD), an objective metric for evaluating motion-generative models, and validates it on realistic motion artifacts such as foot skating and over-smoothing. Uses a Transformer-based autoencoder as the feature extractor and demonstrates robustness to motion length variations, with significant correlation to human subjective ratings on gesture motion datasets.
Related Dog Code: Human to Quadruped Embodiment Using Shared Codebooks · Physically Based Motion Transformation · Character Motion Synthesis by Topology Coordinates · Neural Animation Layering for Synthesizing Martial Arts Movements
how to read this
- Category
- Evaluation metric: objective measure for motion generative models
- Contributions
-
- Fréchet Motion Distance (FMD), an objective metric for evaluating motion-generative models
- A Transformer-based autoencoder as the feature extractor, with robustness to varying motion length
- Validation showing sensitivity to artifacts like foot skating and over-smoothing, and significant correlation with human subjective ratings on gesture datasets
- Context
- Supports the evaluation of text-to-motion and motion-generation work such as HumanML3D (Guo 2022), adapting Fréchet-distance-style metrics to human motion. Builds on: Generating Diverse and Natural 3D Human Motions from Text
- Correctness
- Correlation with human ratings is shown on gesture motion datasets; the metric inherits the feature extractor's biases, so reported validity should not be over-generalized to all motion domains.
- Clarity
- Accessible; a single careful pass conveys the metric and its validation, with a second pass only for the autoencoder details.
- How to read it
- First pass to understand what FMD measures and how it was validated against human judgment; consult details only if adopting it to score your own motion models.
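To make the metric concrete, the sketch below computes a Fréchet distance between Gaussian fits of two feature sets. It is a minimal illustration, not the paper's implementation: it assumes the features have already been extracted (the paper uses a Transformer-based autoencoder for that step), and it simplifies each covariance to its diagonal so the matrix square root reduces to a per-dimension term. The names `feature_stats` and `frechet_distance_diag` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def feature_stats(features: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Per-dimension mean and population variance of a set of feature vectors."""
    n = len(features)
    dim = len(features[0])
    mean = [sum(f[d] for f in features) / n for d in range(dim)]
    var = [sum((f[d] - mean[d]) ** 2 for f in features) / n for d in range(dim)]
    return mean, var


def frechet_distance_diag(
    real: Sequence[Sequence[float]], generated: Sequence[Sequence[float]]
) -> float:
    """Squared Fréchet distance between two diagonal Gaussians.

    General form: |mu_r - mu_g|^2 + Tr(S_r + S_g - 2 (S_r S_g)^(1/2)).
    With diagonal covariances (an assumption of this sketch) the trace term
    becomes a sum of var_r + var_g - 2 * sqrt(var_r * var_g) per dimension.
    """
    mu_r, var_r = feature_stats(real)
    mu_g, var_g = feature_stats(generated)
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_r, mu_g))
    cov_term = sum(vr + vg - 2.0 * math.sqrt(vr * vg) for vr, vg in zip(var_r, var_g))
    return mean_term + cov_term
```

Lower scores mean the generated feature distribution sits closer to the real one, and identical sets score 0. A fuller implementation would keep the complete covariance matrices rather than the diagonal.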
Motion Synthesis
-
Progressive multiplicative control policy (PMCP) scales physics imitation to 10,000 AMASS clips with real-time fault recovery.
abstract
We present a physics-based humanoid controller that achieves high-fidelity motion imitation and fault-tolerant behavior in the presence of noisy input (e.g. pose estimates from video or generated from language) and unexpected falls. Our controller scales up to learning ten thousand motion clips without using any external stabilizing forces and learns to naturally recover from fail-state. Given reference motion, our controller can perpetually control simulated avatars without requiring resets. At its core, we propose the progressive multiplicative control policy (PMCP), which dynamically allocates new network capacity to learn harder and harder motion sequences. PMCP allows efficient scaling for learning from large-scale motion databases and adding new tasks, such as fail-state recovery, without catastrophic forgetting. We demonstrate the effectiveness of our controller by using it to imitate noisy poses from video-based pose estimators and language-based motion generators in a live and real-time multi-person avatar use case.
Related CLoSD: Closing the Loop between Simulation and Diffusion for Multi-Task Character Control · Universal Humanoid Motion Representations for Physics-Based Control · SuperPADL: Scaling Language-Directed Physics-Based Control with Progressive Supervised Distillation · AMP: Adversarial Motion Priors for Stylized Physics-Based Character Control
how to read this
- Category
- Method: physics-based humanoid control for simulated avatars
- Contributions
-
- A physics-based controller that imitates motion with high fidelity and recovers from fail-states without external stabilizing forces or resets
- Progressive Multiplicative Control Policy (PMCP) that allocates new network capacity to harder sequences and added tasks while avoiding catastrophic forgetting
- Scaling to ten thousand motion clips and real-time, multi-person control driven by noisy video- or language-based pose inputs
- Context
- Extends adversarial/imitation physics-based control such as AMP (Peng 2021) toward large-scale, perpetual, fault-tolerant tracking. Builds on: AMP: Adversarial Motion Priors for Stylized Physics-Based Character Control
- Correctness
- Demonstrated on large motion databases and noisy estimated poses; all results are in physics simulation, and tracking quality still depends on input pose noise.
- Clarity
- Moderately technical; a first pass conveys the PMCP scaling idea, a second pass is needed for the policy and training formulation.
- How to read it
- First pass for the PMCP capacity-allocation idea and fail-state recovery; second/third pass on the policy architecture and training if implementing physics-based imitation.
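The capacity-allocation idea can be sketched as a loop that trains a new primitive on whatever clips the earlier primitives still fail to imitate. This is a toy illustration under stated assumptions: `train_primitive` and `can_imitate` are hypothetical callbacks standing in for RL training and simulated evaluation, the `DIFFICULTY` table is invented data, and the run-time composition of primitives is not modeled.

```python
from typing import Callable, Dict, List, Tuple


def train_progressively(
    clips: List[str],
    train_primitive: Callable[[List[str]], object],
    can_imitate: Callable[[object, str], bool],
    max_primitives: int = 4,
) -> Tuple[List[object], Dict[str, int], List[str]]:
    """Add one primitive per round, each trained only on the remaining hard clips."""
    primitives: List[object] = []
    solved_by: Dict[str, int] = {}
    hard = list(clips)
    while hard and len(primitives) < max_primitives:
        # New capacity is trained on the hard set; earlier primitives are left untouched.
        primitive = train_primitive(hard)
        primitives.append(primitive)
        index = len(primitives) - 1
        still_hard = []
        for clip in hard:
            if can_imitate(primitive, clip):
                solved_by[clip] = index
            else:
                still_hard.append(clip)
        if len(still_hard) == len(hard):
            break  # no progress this round, stop allocating capacity
        hard = still_hard
    return primitives, solved_by, hard


# Toy stand-ins (hypothetical): a "primitive" is just a difficulty ceiling.
DIFFICULTY = {"walk": 1, "run": 2, "cartwheel": 4, "backflip": 5}


def toy_train(hard: List[str]) -> int:
    return min(DIFFICULTY[c] for c in hard) + 1


def toy_can_imitate(capacity: int, clip: str) -> bool:
    return DIFFICULTY[clip] <= capacity
```

With the toy callbacks, the first primitive covers the easy clips and a second is allocated for the acrobatic ones; because earlier primitives are never retrained, nothing they learned is overwritten.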
Motion Synthesis / Retargeting
-
RL policy retargets sparse VR sensor streams to morphologically diverse characters including non-human avatars in real time.
abstract
Avatars are important to create interactive and immersive experiences in virtual worlds. One challenge in animating these characters to mimic a user's motion is that commercial AR/VR products consist only of a headset and controllers, providing very limited sensor data of the user's pose. Another challenge is that an avatar might have a different skeleton structure than a human and the mapping between them is unclear. In this work we address both of these challenges. We introduce a method to retarget motions in real-time from sparse human sensor data to characters of various morphologies. Our method uses reinforcement learning to train a policy to control characters in a physics simulator. We only require human motion capture data for training, without relying on artist-generated animations for each avatar. This allows us to use large motion capture datasets to train general policies that can track unseen users from real and sparse data in real-time. We demonstrate the feasibility of our approach on three characters with different skeleton structure: a dinosaur, a mouse-like creature and a human. We show that the avatar poses often match the user surprisingly well, despite having no sensor information of the lower body available.
Related DReCon: Data-Driven Responsive Control of Physics-Based Characters · CALM: Conditional Adversarial Latent Models for Directable Virtual Characters · QuestSim: Human Motion Tracking from Sparse Sensors with Simulated Avatars · Perpetual Humanoid Control for Real-time Simulated Avatars
how to read this
- Category
- Method: physics-based real-time motion retargeting from sparse inputs
- Contributions
-
- A method to retarget motion in real time from sparse AR/VR sensor data (headset plus controllers) to characters of varied morphology
- An RL policy trained in a physics simulator using only human mocap data, with no per-avatar artist animation
- Demonstration across non-human skeletons (dinosaur, mouse-like creature) and a human, tracking unseen users from sparse real data
- Context
- Combines contact-aware retargeting (Villegas 2021) with sparse-sensor simulated tracking such as QuestSim (Winkler 2022). Builds on: Contact-Aware Retargeting of Skinned Motion · QuestSim: Human Motion Tracking from Sparse Sensors with Simulated Avatars
- Correctness
- Shown on three characters with differing skeletons in simulation; the human-to-nonhuman mapping is learned and acknowledged as not fully defined, so match quality varies and is reported as often (not always) good.
- Clarity
- Accessible in setup; a first pass conveys the sparse-input-to-avatar pipeline, a second pass clarifies the RL formulation and reward design.
- How to read it
- First pass for the problem framing (sparse sensors, diverse morphologies) and feasibility claims; second pass on the RL policy and physics setup if building VR avatar retargeting.
Retargeting / Motion Synthesis
-
Deformable point cloud head avatar disentangling intrinsic albedo from normal-dependent shading, enabling FLAME-driven animation with topological flexibility.
abstract
The ability to create realistic animatable and relightable head avatars from casual video sequences would open up wide ranging applications in communication and entertainment. Current methods either build on explicit 3D morphable meshes (3DMM) or exploit neural implicit representations. The former are limited by fixed topology, while the latter are non-trivial to deform and inefficient to render. Furthermore, existing approaches entangle lighting and albedo, limiting the ability to re-render the avatar in new environments. In contrast, we propose PointAvatar, a deformable point-based representation that disentangles the source color into intrinsic albedo and normal-dependent shading. We demonstrate that PointAvatar bridges the gap between existing mesh- and implicit representations, combining high-quality geometry and appearance with topological flexibility, ease of deformation and rendering efficiency. We show that our method is able to generate animatable 3D avatars using monocular videos from multiple sources including hand-held smartphones, laptop webcams and internet videos, achieving state-of-the-art quality in challenging cases where previous methods fail, e.g., thin hair strands, while being significantly more efficient in training than competing methods.
Related I M Avatar: Implicit Morphable Head Avatars from Videos · Neural Head Avatars from Monocular RGB Videos · FLARE: Fast Learning of Animatable and Relightable Mesh Avatars · 3DGS-Avatar: Animatable Avatars via Deformable 3D Gaussian Splatting
how to read this
- Category
- Method: deformable point-based head avatars from video
- Contributions
-
- PointAvatar, a deformable point-based head representation that bridges mesh and implicit approaches with topological flexibility and efficient rendering
- Disentanglement of source color into intrinsic albedo and normal-dependent shading, enabling relighting in new environments
- Animatable avatars from monocular video across sources (smartphone, webcam, internet clips), driven via FLAME
- Context
- Follows monocular neural head avatar work such as Neural Head Avatars (Grassal 2022), substituting a deformable point cloud for fixed-topology mesh or implicit volumes. Builds on: Neural Head Avatars from Monocular RGB Videos
- Correctness
- Demonstrated on monocular videos including challenging cases like thin hair; the albedo/shading split is a simplified disentanglement, so relighting accuracy and very complex appearance remain limitations to weigh.
- Clarity
- Accessible; a first pass conveys the point-based representation and its trade-offs, a second pass covers the deformation and shading model.
- How to read it
- First pass for why a point representation sits between mesh and implicit and what relighting it enables; second pass on the deformation and albedo/shading formulation if adopting it.
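The albedo/shading split can be illustrated with a per-point color computed as albedo multiplied by a shading term that depends only on the normal. The sketch below is an assumption-laden stand-in: it uses a fixed Lambertian term with one directional light in place of the learned shading the paper describes, and the function names are hypothetical.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def normalize(v: Vec3) -> Vec3:
    length = math.sqrt(sum(c * c for c in v))
    return (v[0] / length, v[1] / length, v[2] / length)


def lambert_shading(normal: Vec3, light_dir: Vec3, ambient: float = 0.2) -> float:
    """Normal-dependent shading: ambient floor plus clamped cosine term."""
    n = normalize(normal)
    l = normalize(light_dir)
    diffuse = max(0.0, sum(a * b for a, b in zip(n, l)))
    return ambient + (1.0 - ambient) * diffuse


def shade_point(albedo: Vec3, normal: Vec3, light_dir: Vec3) -> Vec3:
    """Final color = intrinsic albedo * shading; albedo never sees the light."""
    s = lambert_shading(normal, light_dir)
    return (albedo[0] * s, albedo[1] * s, albedo[2] * s)
```

Because the albedo is stored separately, relighting amounts to calling `shade_point` again with a different `light_dir` while leaving the per-point albedo unchanged.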
Facial
-
Neural IK solver aware of skeleton topology and pose context, enabling fast and accurate pose and motion editing for character rigs.
abstract
Posing a 3D character for film or game is an iterative and laborious process where many control handles (e.g. joints) need to be manipulated to achieve a compelling result. Neural Inverse Kinematics (IK) is a new type of IK that enables sparse control over a 3D character pose, and leverages full body correlations to complete the un-manipulated joints of the body. While neural IK is promising, current methods are not designed to preserve previous edits in posing workflows. Current models generate a single pose from the handles only, regardless of what was there previously, making it difficult to preserve any variations and hindering tasks such as pose and motion editing. In this paper, we introduce SKEL-IK, a novel architecture and training scheme that is conditioned on a base pose, and designed to flow information directly onto the skeletal graph structure, such that hard constraints can be enforced by blocking information flows at certain joints. As a result, we are able to satisfy both hard and soft constraints, as well as preserve un-manipulated parts of the body when desired. Finally, by controlling the base pose in different ways, we demonstrate the ability of our model to perform tasks such as generating variations and quickly editing poses and motions; with less erosion of the base poses compared to the current state-of-the-art.
Related Sketch-based Motion Editing for Articulated Characters · SKEL-Betweener: a Neural Motion Rig for Interactive Motion Authoring · How the Rig Design Impacts the Animation Process · How to Train Your Dog: Neural Enhancement of Quadruped Animations
how to read this
- Category
- Method: skeleton-aware neural inverse kinematics for pose/motion editing
- Contributions
-
- SKEL-IK, a neural IK architecture conditioned on a base pose so previous edits are preserved during posing workflows
- Information flow routed onto the skeletal graph, allowing hard constraints by blocking flow at chosen joints while satisfying soft constraints
- Preservation of un-manipulated body parts and support for pose and motion editing by controlling the base pose
- Context
- Builds on learned IK lineage such as Style-Based Inverse Kinematics (Grochow 2004), addressing prior neural-IK methods that ignore previous edits. Builds on: Style-Based Inverse Kinematics
- Correctness
- Targets a real workflow limitation, namely that neural IK overwrites prior edits; benefits depend on the base-pose conditioning and graph-blocking design, so generalization across rigs should be confirmed.
- Clarity
- Moderately technical; a first pass conveys the base-pose-conditioned, graph-structured idea, a second pass clarifies the constraint-enforcement mechanics.
- How to read it
- First pass for the edit-preservation problem and the skeletal-graph blocking idea; second pass on the architecture and constraint handling if you build posing or IK tools.
Rigging / ML Deformation
-
Overview of DreamWorks' Academy Award-winning Premo system featuring pose-centric sculpting workflows that map edits back to underlying pose shapes.
abstract
Premo is an Academy Award-winning animation and rigging platform developed by DreamWorks that introduced native rigging support through a new data model. The system enables pose-centric workflows where artists can sculpt deformer output directly in any pose, mapping edits back to underlying pose shapes. Its dual graph representation allows full-resolution production rigs to be loaded editably while maintaining interactive animation performance, eliminating the need for build scripts before rigging.
Related How the Rig Design Impacts the Animation Process · LibEE: A Multithreaded Dependency Graph for Character Animation · DreamWorks Animation Facial Motion and Deformation System · Abstracting Rigging Concepts for a Future Proof Framework Design
how to read this
- Category
- Production talk / system overview (DreamWorks Premo rigging and animation platform)
- Contributions
-
- Demonstrates a new data model that gives Premo native rigging support
- Shows pose-centric workflows where artists sculpt deformer output in any pose and map edits back to underlying pose shapes
- Presents a dual graph that loads full-resolution production rigs editably while keeping interactive animation performance, removing pre-rigging build scripts
- Context
- Premo, an Academy Award-winning production platform, continues DreamWorks' in-house rigging/evaluation lineage following LibEE 2 (2018). Builds on: LibEE 2: Enabling Fast Edits and Evaluation
- Correctness
- Studio practice, not peer-reviewed; the described workflows and performance are production-proven on DreamWorks' pipeline rather than independently benchmarked.
- Clarity
- Accessible to practitioners; a single read conveys the system and workflow, and familiarity with production rigging helps.
- How to read it
- Read once for the data-model and dual-graph ideas and the pose-centric sculpting workflow; revisit specific sections only for pipeline lessons relevant to your own rigging system.
Rigging
-
Integrates hybrid semantic-kinematic retrieval into a diffusion model to improve rare-motion generation quality.
abstract
3D human motion generation is crucial for creative industry. Recent advances rely on generative models with domain knowledge for text-driven motion generation, leading to substantial progress in capturing common motions. However, the performance on more diverse motions remains unsatisfactory. In this work, we propose ReMoDiffuse, a diffusion-model-based motion generation framework that integrates a retrieval mechanism to refine the denoising process. ReMoDiffuse enhances the generalizability and diversity of text-driven motion generation with three key designs: 1) Hybrid Retrieval finds appropriate references from the database in terms of both semantic and kinematic similarities. 2) Semantic-Modulated Transformer selectively absorbs retrieval knowledge, adapting to the difference between retrieved samples and the target motion sequence. 3) Condition Mixture better utilizes the retrieval database during inference, overcoming the scale sensitivity in classifier-free guidance. Extensive experiments demonstrate that ReMoDiffuse outperforms state-of-the-art methods by balancing both text-motion consistency and motion quality, especially for more diverse motion generation. Project page: https://mingyuan-zhang.github.io/projects/ReMoDiffuse.html
Related MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model · Human Motion Diffusion Model · T2M-GPT: Generating Human Motion from Textual Descriptions with Discrete Representations · Executing Your Commands via Motion Diffusion in Latent Space
how to read this
- Category
- Method: a retrieval-augmented motion diffusion model for text-driven generation
- Contributions
-
- Hybrid Retrieval that selects database references by both semantic and kinematic similarity
- Semantic-Modulated Transformer that selectively absorbs retrieved knowledge to adapt to the target sequence
- Condition Mixture to better use the retrieval database at inference and reduce scale sensitivity in classifier-free guidance
- Context
- Builds on text-to-motion diffusion in the lineage of Human Motion Diffusion Model (Tevet et al.), adding a retrieval mechanism to refine denoising. Builds on: Human Motion Diffusion Model
- Correctness
- The work argues that retrieval improves diversity and rare-motion quality, and reports outperforming prior methods on text-motion consistency and quality. Gains depend on the coverage and quality of the retrieval database, so motions absent from the database remain a fair concern.
- Clarity
- Accessible at a high level; a first pass conveys the retrieval idea, a second pass is needed for the transformer modulation and condition-mixture formulation.
- How to read it
- Focus on how retrieval is integrated into the denoising loop and why it helps rare motions; do a second pass on the Semantic-Modulated Transformer and Condition Mixture if you intend to reproduce or extend it.
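A hybrid retrieval step of this kind can be sketched as a score that multiplies a semantic term by a kinematic term. The version below is an illustration rather than the paper's exact formulation: it assumes text features are plain vectors compared by cosine similarity, uses relative sequence-length difference as a stand-in for kinematic similarity, and the `Entry` type, `hybrid_score`, `retrieve`, and the `lam` weight are all hypothetical names.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Entry:
    name: str
    text_feat: Sequence[float]  # assumed precomputed text embedding
    length: int                 # motion length in frames, assumed > 0


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def hybrid_score(query_feat: Sequence[float], query_len: int, entry: Entry,
                 lam: float = 1.0) -> float:
    """Semantic similarity, discounted by how far the lengths disagree."""
    semantic = cosine(query_feat, entry.text_feat)
    length_gap = abs(entry.length - query_len) / max(entry.length, query_len)
    return semantic * math.exp(-lam * length_gap)


def retrieve(query_feat: Sequence[float], query_len: int, database: List[Entry],
             k: int = 2, lam: float = 1.0) -> List[Entry]:
    """Return the k database entries with the highest hybrid score."""
    ranked = sorted(database,
                    key=lambda e: hybrid_score(query_feat, query_len, e, lam),
                    reverse=True)
    return ranked[:k]
```

Two entries with the same caption embedding are separated by the kinematic term alone, which is why a reference of similar duration ranks above a much longer one.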
Motion Synthesis
-
Fully automatic two-stage method for transferring skin weights between geometrically dissimilar meshes, outperforming commercial software on garments.
abstract
We present a new method for the robust transfer of skin weights from a source mesh to a target mesh with significantly different geometric shapes. Rigging garments is a typical application of skin weight transfer where weights are copied from a source body mesh to avoid tedious weight painting from scratch. However, existing techniques struggle with non-skin-tight garments and require additional manual weight painting. We introduce a fully automatic two-stage skin weight transfer process. First, an initial transfer is performed by copying weights from the source mesh only for those vertices on the target mesh where we have high confidence in obtaining the ground truth weights from the source. Then, we automatically compute weights for all other vertices by interpolating the weights computed in stage one. This approach is robust and easy to implement in practice, yet it far outperforms the methods used in existing commercial software and previous research works.
Related Bounded Biharmonic Weights for Real-Time Deformation · Real-Time Deformation with Coupled Cages and Skeletons · Implicit Skinning: Real-Time Skin Deformation with Contact Modeling · Harmonic Coordinates for Character Articulation
how to read this
- Category
- Method: an automatic skin-weight transfer algorithm
- Contributions
-
- A fully automatic two-stage skin weight transfer from a source mesh to a geometrically dissimilar target
- Stage one copies weights only for high-confidence target vertices, stage two interpolates weights for the rest (weight inpainting)
- Robustness on non-skin-tight garments, reported to outperform existing commercial software and prior research
- Context
- Addresses production garment rigging in the lineage of weight-transfer and binding methods such as Geodesic Voxel Binding (Dionne et al.). Builds on: Geodesic Voxel Binding for Production Character Meshes
- Correctness
- The key assumption is that a confident subset of correspondences gives reliable ground-truth weights from which the remainder can be interpolated; demonstrated mainly on garments over body meshes, so very loose or topologically very different targets are worth checking.
- Clarity
- Accessible and described as easy to implement; a first pass conveys the two-stage idea, a second pass clarifies the confidence criterion and inpainting.
- How to read it
- Focus on how high-confidence vertices are selected and how interpolation fills the rest; one careful pass is usually enough to apply it, a second pass for exact implementation details.
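The two stages can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the paper's algorithm: stage one matches each target vertex to its nearest source vertex (rather than the closest point on the source surface) and accepts the copy only if a distance and a normal-angle threshold are met; stage two fills the remaining vertices by repeatedly averaging neighbor weights, a simple stand-in for the interpolation the paper performs. Normals are assumed unit length, at least one vertex is assumed confident, and all names and thresholds are hypothetical.

```python
import math
from typing import List, Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def transfer_confident(src_pos: Sequence[Vec3], src_nrm: Sequence[Vec3],
                       src_w: Sequence[Sequence[float]],
                       tgt_pos: Sequence[Vec3], tgt_nrm: Sequence[Vec3],
                       max_dist: float, max_angle_deg: float
                       ) -> List[Optional[List[float]]]:
    """Stage 1: copy weights only where the match is close and normals agree."""
    cos_limit = math.cos(math.radians(max_angle_deg))
    result: List[Optional[List[float]]] = []
    for p, n in zip(tgt_pos, tgt_nrm):
        j = min(range(len(src_pos)), key=lambda i: math.dist(p, src_pos[i]))
        close = math.dist(p, src_pos[j]) <= max_dist
        aligned = sum(a * b for a, b in zip(n, src_nrm[j])) >= cos_limit
        result.append(list(src_w[j]) if close and aligned else None)
    return result


def inpaint(weights: Sequence[Optional[Sequence[float]]],
            neighbors: Sequence[Sequence[int]],
            iterations: int = 200) -> List[List[float]]:
    """Stage 2: fill unknown vertices by averaging neighbors, keeping known ones fixed."""
    n_bones = len(next(w for w in weights if w is not None))
    known = [w is not None for w in weights]
    cur = [list(w) if w is not None else [1.0 / n_bones] * n_bones for w in weights]
    for _ in range(iterations):
        nxt = [list(w) for w in cur]
        for v, nbrs in enumerate(neighbors):
            if known[v] or not nbrs:
                continue
            nxt[v] = [sum(cur[u][b] for u in nbrs) / len(nbrs) for b in range(n_bones)]
        cur = nxt
    # Renormalize so each vertex's weights sum to 1.
    return [[x / sum(w) for x in w] for w in cur]
```

A loose garment vertex that fails the stage-one test receives no copied weights at all; it inherits a blend from its confident neighbors along the target mesh instead of a possibly wrong nearest-point guess.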
Skinning / Rigging
- SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation CVPR Academic 477 cites
Generates 3DMM motion coefficients for head pose and expression from audio and modulates a 3D-aware face renderer for stylized talking head generation from a single image.
abstract
Generating talking head videos through a face image and a piece of speech audio still contains many challenges, i.e., unnatural head movement, distorted expression, and identity modification. We argue that these issues are mainly caused by learning from the coupled 2D motion fields. On the other hand, explicitly using 3D information also suffers problems of stiff expression and incoherent video. We present SadTalker, which generates 3D motion coefficients (head pose, expression) of the 3DMM from audio and implicitly modulates a novel 3D-aware face render for talking head generation. To learn the realistic motion coefficients, we explicitly model the connections between audio and different types of motion coefficients individually. Precisely, we present ExpNet to learn the accurate facial expression from audio by distilling both coefficients and 3D-rendered faces. As for the head pose, we design PoseVAE via a conditional VAE to synthesize head motion in different styles. Finally, the generated 3D motion coefficients are mapped to the unsupervised 3D keypoints space of the proposed face render to synthesize the final video. We conducted extensive experiments to demonstrate the superiority of our method in terms of motion and video quality. The code and demo videos are available at https://sadtalker.github.io.
Related Capture, Learning, and Synthesis of 3D Speaking Styles · MeshTalk: 3D Face Animation from Speech using Cross-Modality Disentanglement · CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior · FaceFormer: Speech-Driven 3D Facial Animation with Transformers
how to read this
- Category
- Method: audio-driven talking-head animation from a single image
- Contributions
-
- Generates 3DMM motion coefficients (head pose and expression) from audio and uses them to drive a 3D-aware face renderer
- ExpNet learns accurate expression coefficients from audio by distilling coefficients and 3D-rendered faces
- PoseVAE, a conditional VAE, synthesizes stylized head motion, with coefficients mapped to unsupervised 3D keypoints for the final video
- Context
- Builds on audio-driven facial reenactment work such as Neural Voice Puppetry (Thies et al.), moving from coupled 2D motion fields toward explicit 3DMM-mediated motion. Builds on: Neural Voice Puppetry: Audio-driven Facial Reenactment
- Correctness
- The premise is that decoupling pose and expression through 3DMM coefficients reduces unnatural motion and identity drift; results rest on a 3DMM and a learned renderer, so identity fidelity and expression range remain bounded by those models.
- Clarity
- Reasonably accessible; a first pass conveys the audio-to-coefficient-to-render pipeline, a second pass is needed for ExpNet and PoseVAE specifics.
- How to read it
- Trace the path from audio to 3DMM coefficients to the rendered frame; do a second pass on ExpNet and PoseVAE if you care about how realism and head-motion style are achieved.
Facial / Motion Synthesis
-
Four-stage framework computing stable quasistatic configurations for hybrid strand-based hair so simulations start without gravity-induced sag.
abstract
Lagrangian/Eulerian hybrid strand-based hair simulation techniques have quickly become a popular approach in VFX and real-time graphics applications. With Lagrangian hair dynamics, the inter-hair contacts are resolved in the Eulerian grid using the continuum method, i.e., the MPM scheme with the granular Drucker-Prager rheology, to avoid expensive collision detection and handling. This fuzzy collision handling makes the authoring process significantly easier. However, although current hair grooming tools provide a wide range of strand-based modeling tools for this simulation approach, the crucial sag-free initialization functionality remains often ignored. Thus, when the simulation starts, gravity would cause any artistic hairstyle to sag and deform into unintended and undesirable shapes. This paper proposes a novel four-stage sag-free initialization framework to solve stable quasistatic configurations for hybrid strand-based hair dynamic systems. These four stages are split into two global-local pairs. The first one ensures static equilibrium at every Eulerian grid node with additional inequality constraints to prevent stress from exiting the yielding surface. We then derive several associated closed-form solutions in the local stage to compute segment rest lengths, orientations, and particle deformation gradients in parallel.
Related Anisotropic Elastoplasticity for Cloth, Knit and Hair Frictional Contact · Estimating Cloth Elasticity Parameters From Homogenized Yarn-Level Models · Interactive Hair Simulation on the GPU Using ADMM · Gravity Preloading for Maintaining Hair Shape Using the Simulator as a Closed-Box Function
how to read this
- Category
- Method: sag-free initialization for hybrid hair simulation
- Contributions
-
- A four-stage framework that solves stable quasistatic configurations for hybrid (Lagrangian/Eulerian) strand-based hair
- Two global-local solver pairs, the first enforcing static equilibrium at Eulerian grid nodes with constraints to keep stress within the yield surface
- Lets simulations start from the authored hairstyle without gravity-induced sag
- Context
- Targets MPM-based hybrid hair with Drucker-Prager granular contact, grounded in strand dynamics descending from Discrete Elastic Rods (Bergou et al.). Builds on: Discrete Elastic Rods
- Correctness
- Assumes a quasistatic equilibrium exists and can be solved under the yield-surface constraints for the hybrid model; demonstrated for strand-based hair grooms, so applicability to other materials or solvers is not claimed.
- Clarity
- Technically dense; a first pass conveys the goal and four-stage structure, but the constraint formulation needs a careful second or third pass.
- How to read it
- First pass for the problem and the staged global-local structure; budget a slow second and third pass on the equilibrium constraints and yield-surface handling if implementing.
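The intuition behind sag-free initialization, solving for rest lengths so the authored shape is already in equilibrium under gravity, can be shown on a much simpler system. The sketch below is a deliberately reduced stand-in and not the paper's four-stage solver: it assumes a single vertical strand of equal point masses joined by linear springs, with no grid, contact, or yield-surface constraints, and every name in it is hypothetical.

```python
from typing import List, Sequence


def sag_free_rest_lengths(lengths: Sequence[float], mass: float,
                          stiffness: float, gravity: float = 9.81) -> List[float]:
    """Closed-form rest lengths for a hanging chain pinned at the top.

    Segment i carries the weight of every particle below it, so its tension
    must be mass * gravity * (n - i). With spring force k * (l - l0), the
    rest length that yields that tension at the authored length l is
    l0 = l - tension / k.
    """
    n = len(lengths)
    rest = []
    for i, length in enumerate(lengths):
        tension = mass * gravity * (n - i)
        r = length - tension / stiffness
        if r <= 0.0:
            raise ValueError("stiffness too low to hold the authored shape")
        rest.append(r)
    return rest


def net_forces(lengths: Sequence[float], rest: Sequence[float], mass: float,
               stiffness: float, gravity: float = 9.81) -> List[float]:
    """Net vertical force on each free particle (positive is upward)."""
    tensions = [stiffness * (l - r) for l, r in zip(lengths, rest)]
    forces = []
    for i in range(len(lengths)):
        below = tensions[i + 1] if i + 1 < len(tensions) else 0.0
        forces.append(tensions[i] - below - mass * gravity)
    return forces
```

Using the authored lengths directly as rest lengths leaves every particle with an unbalanced downward force, which is the sag seen on the first simulated frame; the solved rest lengths drive the net forces to zero instead.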
CFX
-
Learns an embedding space that disentangles skeleton structure from motion semantics, enabling cross-skeleton motion retargeting.
abstract
Learning deep neural networks on human motion data has become common in computer graphics research, but the heterogeneity of available datasets poses challenges for training large-scale networks. This paper presents a framework that allows us to solve various animation tasks in a skeleton-agnostic manner. The core of our framework is to learn an embedding space to disentangle skeleton-related information from input motion while preserving semantics, which we call Skeleton-Agnostic Motion Embedding (SAME). To efficiently learn the embedding space, we develop a novel autoencoder with graph convolution networks and provide new formulations of various animation tasks operating in the SAME space. We showcase various examples, including retargeting, reconstruction, and interactive character control, and conduct an ablation study to validate design choices made during development.
Related Dog Code: Human to Quadruped Embodiment Using Shared Codebooks · WalkTheDog: Cross-Morphology Motion Alignment via Phase Manifolds · Perpetual Humanoid Control for Real-time Simulated Avatars · Universal Humanoid Motion Representations for Physics-Based Control
how to read this
- Category
- Method: a skeleton-agnostic motion embedding for animation
- Contributions
-
- Skeleton-Agnostic Motion Embedding (SAME) that disentangles skeleton structure from motion semantics
- A graph-convolution autoencoder to learn the embedding plus new formulations of animation tasks operating in SAME space
- Demonstrations of retargeting, reconstruction, and interactive character control with an ablation of design choices
- Context
- Addresses dataset skeleton heterogeneity in deep motion learning, in the lineage of Skeleton-Aware Networks for deep motion retargeting (Aberman et al.). Builds on: Skeleton-Aware Networks for Deep Motion Retargeting
- Correctness
- Assumes skeleton-related information can be cleanly separated from semantics while preserving motion meaning; validated across several tasks with an ablation, though the degree of disentanglement and behavior on very exotic skeletons should be read critically.
- Clarity
- Moderately accessible; a first pass conveys the disentanglement idea and applications, a second pass clarifies the graph-conv autoencoder and per-task formulations.
- How to read it
- Focus on what the embedding disentangles and how each task is recast in SAME space; a second pass on the autoencoder and ablation is worthwhile if comparing retargeting methods.
Retargeting / Motion Synthesis
-
Shaping rig for Pixar's curvenet system that auto-generates surface-aligned animation controls per knot for layered shot-level rigging on Elemental.
Abstract
We present a new shaping rig for authoring layers of animation control that facilitate surface editing in shot work. Our approach expands the curvenet rigging technology [de Goes et al. 2022] by introducing new tools that auto-generate a surface-aligned direct manipulator per curvenet knot. As a result, we obtain a mapping from animation controls into curvenet adjustments relative to the deforming surface with minimal setup. We showcase our curvenet shaping rig using results from Pixar’s feature film Elemental (2023).
Related Harmonic Coordinates for Character Articulation · Pixar's Inside Out 2: Character Rig Challenges and Techniques · Hair Emoting with Style Guides in Turning Red · How the Rig Design Impacts the Animation Process
How to read this
- Category
- Production method: a shaping rig for curvenet animation controls
- Contributions
- A shaping rig that authors layers of animation control for surface editing in shot work
- Auto-generation of a surface-aligned direct manipulator per curvenet knot with minimal setup
- A mapping from animation controls into curvenet adjustments relative to the deforming surface, shown on Pixar's Elemental
- Context
- Extends the curvenet rigging technology of Character Articulation through Profile Curves (de Goes et al. 2022) toward layered shot-level control. Builds on: Character Articulation through Profile Curves
- Correctness
- Presented as production-validated on a feature film rather than benchmarked; it assumes a curvenet representation already drives the surface, so it is specific to that articulation system.
- Clarity
- Accessible to riggers; a first pass conveys the per-knot manipulator idea, a second pass clarifies the control-to-curvenet mapping.
- How to read it
- Read for the workflow concept (per-knot surface-aligned controls and layered shot edits); a single pass suffices unless you work directly with curvenet rigs, then revisit the mapping details.
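The idea of a control that stays aligned with the deforming surface can be sketched as follows. This is a minimal illustration, not Pixar's implementation: the function names (`knot_frame`, `control_to_world`) and the choice of a tangent hint plus surface normal to build the frame are assumptions made for the example.

```python
import math

def normalize(v):
    """Return v scaled to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def knot_frame(normal, tangent_hint):
    """Orthonormal frame (tangent, bitangent, normal) at a knot.

    Hypothetical helper: the frame is rebuilt from the current surface
    normal and a tangent hint every time the surface deforms.
    """
    n = normalize(normal)
    # Gram-Schmidt: remove the normal component from the tangent hint.
    d = sum(t_i * n_i for t_i, n_i in zip(tangent_hint, n))
    t = normalize(tuple(t_i - d * n_i for t_i, n_i in zip(tangent_hint, n)))
    b = cross(n, t)
    return t, b, n

def control_to_world(local_offset, frame):
    """Map an animator offset stored in the knot frame to a world-space offset."""
    t, b, n = frame
    return tuple(local_offset[0] * t[i] + local_offset[1] * b[i] + local_offset[2] * n[i]
                 for i in range(3))

# The same stored control value follows the surface as its frame rotates.
rest_frame = knot_frame((0.0, 0.0, 1.0), (1.0, 0.0, 0.0))
posed_frame = knot_frame((0.0, 0.0, 1.0), (0.0, 1.0, 0.0))  # surface turned 90 degrees about z
rest_offset = control_to_world((1.0, 2.0, 3.0), rest_frame)
posed_offset = control_to_world((1.0, 2.0, 3.0), posed_frame)
```

Storing the offset in the local frame rather than in world space is what lets a shot-level edit ride along with the underlying animation.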
Rigging
-
Digital Domain VFX supervisor details facial animation, muscle and cloth simulation for She-Hulk, including Charlatan, a neural network for ML-based face replacement in production.
Facial / ML Deformation / CFX
- talk Simulating the Perfect Groom for a Bovine Biker | Untold Studios | FMX HIVE 2023 FMX Industrial
Untold Studios FX Supervisor describes rebuilding their internal fur workflow in Houdini to deliver the CGI Highland cow groom for Virgin Media's Highland Rider commercial.
Abstract
Georges Kyparissous, FX Supervisor at Untold Studios, describes the non-destructive Houdini fur workflow built for Virgin Media's Highland Rider commercial featuring a photoreal Highland cow on a motorbike. The setup splits the incoming groom into long hairs to be simulated and short hairs to be deformed, sanitizes attributes via an OTL, generates lighter resampled guides from the longest hair in each clump, and stores per-curve offset vectors to map the full groom to its guides. Simulation uses Vellum hair with the animation localized at the origin and a generated POP wind field standing in for character motion, casting to 64-bit to avoid floating-point artifacts, plus CurveU-driven angular attributes for matting and root differentiation. A minimal-input deformer reapplies the groom using parametric UVs and transform-matrix differences with delta mush smoothing, while muscle, skin sliding, and rendering are handled in Houdini with Arnold, and caches are written to delay-loaded split Alembic for efficiency.
Related Creating a Photorealistic Hyena · Framestore Creatures & Houdini | Framestore | Character FX & Crowds Production Talks · The Boy and The Octopus: CFX Workflows · Creatures in Houdini | Ahmed Gharraph | FMX 2019
How to read this
- Category
- Production talk: a fur grooming and simulation workflow breakdown
- Contributions
- Demonstrates a non-destructive Houdini fur workflow that splits the groom into simulated long hairs and deformed short hairs
- Shows guide generation, per-curve offset vectors, OTL attribute sanitization, and Vellum hair simulated at the origin with a POP wind field standing in for character motion
- Demonstrates a minimal-input deformer reapplying the groom via parametric UVs and transform-matrix differences with delta mush, casting to 64-bit to avoid floating-point artifacts
- Context
- Relates to strand-based grooming and Vellum/Houdini fur pipelines, applied to a photoreal Highland cow for Virgin Media's Highland Rider commercial.
- Correctness
- Studio practice, not peer-reviewed; results are production-proven for one commercial and the choices are pragmatic rather than generally validated.
- Clarity
- Accessible to FX artists; a single viewing conveys the pipeline, with Houdini fluency helping for the node-level specifics.
- How to read it
- Watch for the reusable tricks (long/short split, guide offsets, sim-at-origin with wind, 64-bit casting, delta mush deform); no second pass needed beyond noting techniques to adapt.
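One common reason for both the sim-at-origin and the 64-bit tricks is that 32-bit floats lose fine detail at large coordinates. The sketch below illustrates that effect in plain Python; it is not taken from the talk, and the coordinate and segment length are hypothetical values chosen to make the rounding visible.

```python
import struct

def to_float32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]

def segment_length(root, tip, cast):
    """Length of a hair segment after storing both ends through `cast`."""
    return cast(tip) - cast(root)

far_root = 10_000_000.0  # hypothetical coordinate far from the origin
segment = 0.25           # hypothetical hair segment length

# Far from the origin, 32-bit storage cannot resolve the segment at all.
far_32 = segment_length(far_root, far_root + segment, to_float32)
# Keeping 64-bit precision, or moving the data to the origin, preserves it.
far_64 = segment_length(far_root, far_root + segment, float)
near_32 = segment_length(0.0, segment, to_float32)

print(far_32, far_64, near_32)
```

At this magnitude adjacent 32-bit floats are a whole unit apart, so the quarter-unit segment collapses to zero length unless it is simulated near the origin or stored in 64-bit.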
CFX
-
Computationally efficient physics-based facial animation method integrating soft tissue dynamics with the DECA morphable face model.
Abstract
Facial animation on computationally weak systems is still mostly dependent on linear blendshape models. However, these models suffer from typical artifacts such as loss of volume, self-collisions, or erroneous soft tissue elasticity. In addition, while extensive effort is required to personalize blendshapes, there are limited options to simulate or manipulate physical and anatomical properties once a model has been crafted. Finally, second-order dynamics can only be represented to a limited extent. For decades, physics-based facial animation has been investigated as an alternative to linear blendshapes but is still cumbersome to deploy and results in high computational cost at runtime. We propose SoftDECA, an approach that provides the benefits of physics-based simulation while being as effortless and fast to use as linear blendshapes. SoftDECA is a novel hypernetwork that efficiently approximates a FEM-based facial simulation while generalizing over the comprehensive DECA model of human identities, facial expressions, and a wide range of material properties that can be locally adjusted without re-training. Along with SoftDECA, we introduce a pipeline for creating the needed high-resolution training data. Part of this pipeline is a novel layered head model that densely positions the biomechanical anatomy within a skin surface while avoiding self-intersections.
Related Art-Directed Muscle Simulation for High-End Facial Animation · Phace: Physics-based Face Modeling and Animation · BlendForces: A Dynamic Framework for Facial Animation · Building and Animating User-Specific Volumetric Face Rigs
How to read this
- Category
- Method: efficient physics-based facial animation
- Contributions
- A hypernetwork (SoftDECA) that approximates FEM-based facial simulation at speeds comparable to linear blendshapes
- Generalization over the DECA model of identities, expressions, and a range of materials, with locally adjustable properties without retraining
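- A pipeline for creating the high-resolution training data, including a layered head model that places the biomechanical anatomy inside the skin surface without self-intersections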
- Context
- Builds on the DECA morphable face model (Feng et al.) to bring physics-based soft-tissue behavior to blendshape-speed runtimes. Builds on: Learning an Animatable Detailed 3D Face Model from In-The-Wild Images
- Correctness
- It assumes a learned hypernetwork can faithfully stand in for FEM simulation across identities and materials; presented as addressing blendshape artifacts (volume loss, self-collision, limited dynamics), though fidelity to true FEM and behavior outside the training distribution warrant scrutiny.
- Clarity
- Moderately technical; a first pass conveys the blendshape-versus-physics motivation and the hypernetwork idea, a second pass clarifies the FEM approximation and data pipeline.
- How to read it
- Focus on what the hypernetwork approximates and which blendshape artifacts it removes; a second pass on the FEM setup and material parameterization pays off if you build facial rigs.
Facial / Muscles
- Somigliana Coordinates: an Elasticity-Derived Approach for Cage Deformation SIGGRAPH Academic 12 cites
Derives matrix-valued cage coordinates from the Somigliana boundary integral identity, generalizing Green coordinates with elastic volume control.
Abstract
In this paper, we present a novel cage deformer based on elasticity-derived matrix-valued coordinates. In order to bypass the typical shearing artifacts and lack of volume control of existing cage deformers, we promote a more elastic behavior of the cage deformation by deriving our coordinates from the Somigliana identity, a boundary integral formulation based on the fundamental solution of linear elasticity. Given an initial cage and its deformed pose, the deformation of the cage interior is deduced from these Somigliana coordinates via a corotational scheme, resulting in a matrix-weighted combination of both vertex positions and face normals of the cage. Our deformer thus generalizes Green coordinates, while producing physically-plausible spatial deformations that are invariant under similarity transformations and with interactive bulging control. We demonstrate the efficiency and versatility of our method through a series of examples in 2D and 3D.
Related Biharmonic Coordinates · Interactive Skeleton-Driven Dynamic Deformations · Physically Based Rigging for Deformable Characters · Green Coordinates
How to read this
- Category
- Method: an elasticity-derived cage deformation scheme
- Contributions
- Somigliana coordinates, matrix-valued cage coordinates derived from the Somigliana boundary integral identity of linear elasticity
- A corotational scheme yielding a matrix-weighted combination of cage vertex positions and face normals, with interactive bulging control
- Generalizes Green coordinates while producing similarity-invariant, physically plausible deformations in 2D and 3D
- Context
- Generalizes Green Coordinates (Lipman et al. 2008) by grounding cage coordinates in the fundamental solution of linear elasticity. Builds on: Green Coordinates
- Correctness
- Rests on the linear-elasticity boundary-integral formulation and a corotational handling of rotations; demonstrated on 2D and 3D examples to reduce shearing and add volume control, with realism bounded by the linear-elasticity assumption.
- Clarity
- Mathematically dense; a first pass conveys the motivation and what the coordinates buy you, the derivation needs a careful second and likely third pass.
- How to read it
- First pass for the intuition (elasticity-based coordinates, bulging control, Green-coordinate generalization); reserve a slow second and third pass for the Somigliana identity and corotational derivation.
Skinning
-
RL with adversarial imitation trains physically simulated characters to perform carrying, sitting, and lying down from unstructured mocap data.
Abstract
Movement is how people interact with and affect their environment. For realistic character animation, it is necessary to synthesize such interactions between virtual characters and their surroundings. Despite recent progress in character animation using machine learning, most systems focus on controlling an agent’s movements in fairly simple and homogeneous environments, with limited interactions with other objects. Furthermore, many previous approaches that synthesize human-scene interactions require significant manual labeling of the training data. In contrast, we present a system that uses adversarial imitation learning and reinforcement learning to train physically-simulated characters that perform scene interaction tasks in a natural and life-like manner. Our method learns scene interaction behaviors from large unstructured motion datasets, without manual annotation of the motion data. These scene interactions are learned using an adversarial discriminator that evaluates the realism of a motion within the context of a scene. The key novelty involves conditioning both the discriminator and the policy networks on scene context. We demonstrate the effectiveness of our approach through three challenging scene interaction tasks: carrying, sitting, and lying down, which require coordination of a character’s movements in relation to objects in the environment.
Related AMP: Adversarial Motion Priors for Stylized Physics-Based Character Control · ASE: Large-Scale Reusable Adversarial Skill Embeddings for Physically Simulated Characters · DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills · Neural State Machine for Character-Scene Interactions
How to read this
- Category
- Method: physics-based character-scene interaction via RL
- Contributions
- A system that synthesizes physically simulated character-scene interactions such as carrying, sitting, and lying down
- Learns these behaviors from large unstructured motion datasets without manual annotation
- Conditions both the adversarial discriminator and the policy on scene context to judge motion realism within a scene
- Context
- Combines scene-conditioned control in the spirit of the Neural State Machine (Starke et al.) with adversarial imitation from AMP (Peng et al.). Builds on: Neural State Machine for Character-Scene Interactions · AMP: Adversarial Motion Priors for Stylized Physics-Based Character Control
- Correctness
- The key idea is that a scene-conditioned discriminator can supply a realism signal without labels; demonstrated on interaction tasks like sitting, carrying, and lying down, with generality across novel objects and scene layouts being the natural thing to probe.
- Clarity
- Accessible at the conceptual level; a first pass conveys the scene-conditioned adversarial-imitation idea, a second pass clarifies the policy and discriminator design.
- How to read it
- Focus on how scene context enters the discriminator and policy and why that removes manual labeling; a second pass on the RL setup and reward structure helps if reproducing or extending.
Motion Synthesis
- T2M-GPT: Generating Human Motion from Textual Descriptions with Discrete Representations CVPR Academic 673 cites
VQ-VAE plus GPT framework for text-to-motion, reporting FID 0.116 on HumanML3D against 0.630 for the diffusion-based MotionDiffuse.
Abstract
In this work, we investigate a simple and must-know conditional generative framework based on Vector Quantised-Variational AutoEncoder (VQ-VAE) and Generative Pre-trained Transformer (GPT) for human motion generation from textual descriptions. We show that a simple CNN-based VQ-VAE with commonly used training recipes (EMA and Code Reset) allows us to obtain high-quality discrete representations. For GPT, we incorporate a simple corruption strategy during the training to alleviate training-testing discrepancy. Despite its simplicity, our T2M-GPT shows better performance than competitive approaches, including recent diffusion-based approaches. For example, on HumanML3D, which is currently the largest dataset, we achieve comparable performance on the consistency between text and generated motion (R-Precision), but with FID 0.116 largely outperforming MotionDiffuse of 0.630. Additionally, we conduct analyses on HumanML3D and observe that the dataset size is a limitation of our approach. Our work suggests that VQ-VAE still remains a competitive approach for human motion generation. Our implementation is available on the project page: https://mael-zys.github.io/T2M-GPT/.
Related MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model · ReMoDiffuse: Retrieval-Augmented Motion Diffusion Model · Generating Diverse and Natural 3D Human Motions from Text · Human Motion Diffusion Model
How to read this
- Category
- Method: text-to-motion generation (VQ-VAE plus GPT)
- Contributions
- A two-stage framework: a CNN-based VQ-VAE that learns discrete motion tokens, then a GPT that generates token sequences from text.
- Training recipes (EMA and Code Reset) for high-quality codebooks, plus a corruption strategy to reduce the train-test gap.
- Reports FID 0.116 on HumanML3D against 0.630 for MotionDiffuse, with comparable R-Precision, presented as competitive with or better than diffusion-based approaches.
- Context
- Builds on the HumanML3D text-to-motion dataset (Guo et al., 2022) and adapts the VQ-VAE plus autoregressive transformer recipe to human motion as an alternative to diffusion baselines like MotionDiffuse. Builds on: Generating Diverse and Natural 3D Human Motions from Text
- Correctness
- Validated quantitatively on HumanML3D using FID and R-Precision; the authors themselves flag dataset size as a limiting factor, so generalization claims should be read with that caveat.
- Clarity
- Accessible framing (deliberately 'simple'); a first pass conveys the architecture, a second pass is needed for the VQ-VAE training details and corruption strategy.
- How to read it
- Focus on the sections and figures that cover the two-stage pipeline, plus the metrics table; do a second pass on the codebook recipes (EMA, Code Reset) and corruption trick if you plan to reproduce or compare against diffusion methods.
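The codebook recipes named in this entry (EMA updates and Code Reset) can be sketched generically as below. This is the textbook form of both tricks in plain Python, not the authors' implementation: the function names, the `decay` and `min_usage` values, and the toy 2D latents are all assumptions, and the GPT stage is left out entirely.

```python
import random

def nearest_code(vec, codebook):
    """Index of the codebook entry closest to vec (squared Euclidean distance)."""
    def dist2(code):
        return sum((v - c) ** 2 for v, c in zip(vec, code))
    return min(range(len(codebook)), key=lambda k: dist2(codebook[k]))

def ema_update(codebook, counts, sums, batch, decay=0.99, eps=1e-5):
    """One EMA codebook update; returns the token index assigned to each latent."""
    dim = len(codebook[0])
    tokens = [nearest_code(v, codebook) for v in batch]
    for k in range(len(codebook)):
        members = [v for v, t in zip(batch, tokens) if t == k]
        batch_sum = [sum(m[d] for m in members) for d in range(dim)]
        # Moving averages of how often the code is used and of its summed latents.
        counts[k] = decay * counts[k] + (1 - decay) * len(members)
        sums[k] = [decay * s + (1 - decay) * b for s, b in zip(sums[k], batch_sum)]
        codebook[k] = [s / max(counts[k], eps) for s in sums[k]]
    return tokens

def reset_dead_codes(codebook, counts, sums, batch, min_usage=1e-3, rng=random):
    """Code Reset: re-seed codes whose EMA usage fell below min_usage."""
    revived = []
    for k in range(len(codebook)):
        if counts[k] < min_usage:
            codebook[k] = list(rng.choice(batch))
            counts[k], sums[k] = 1.0, list(codebook[k])
            revived.append(k)
    return revived

# Toy run with hypothetical 2D latents: the third code is never selected.
codebook = [[0.0, 0.0], [10.0, 10.0], [100.0, 100.0]]
counts = [1.0] * len(codebook)
sums = [list(code) for code in codebook]
batch = [[0.2, 0.0], [9.0, 11.0], [0.0, 0.4]]
for _ in range(10):
    tokens = ema_update(codebook, counts, sums, batch, decay=0.5)
revived = reset_dead_codes(codebook, counts, sums, batch, rng=random.Random(0))
```

In the toy run the unused code's usage halves on every step until it drops under the threshold, at which point it is re-seeded from a latent in the current batch so it can start attracting assignments again.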
Motion Synthesis
- talk The Facial Animation Pipeline of 'Call of Duty: Modern Warfare II' (Presented by DI4D) GDC Industrial
Infinity Ward and DI4D showed how Light Stage scans and 4D expression data trained rig parameter coefficients, enabling scalable automatic transfer of HMC stereo facial performances without human intervention.
Facial / ML Deformation
-
Framestore's Head of Groom and CFX covers their feather pipeline, complex grooming for furry creatures, and CFX workflows simulating muscles, fat, skin, and hair at episodic scale.
Abstract
Framestore's head of groom and CFX details the studio's fully Houdini-based pipeline for CG animals across episodic and advertising work. She walks through Barbarella, their in-house feather system, covering feather design from drawn shafts and barbs, curvature profiles, barb-bend and screw-finesse breakups, split tools, a procedural pattern generator with stripes and dots, plus a sock-mesh splatting and balloon-style intersection cleanup for placing and folding feathers. The talk then covers fur work including layered sub-clump curling for wool, dynamic parting lines, probability maps for fur coloration, and per-shot groom variants (wet, messy, bouffant) driven by HDAs with particle layers for dust and droplets. She closes on making CFX efficient via reusable templates that assemble muscle, fat, skin-sliding and fur setups, with tetrahedron-based flesh simulation, REST caches and a beta CFX tracker for spotting out-of-date caches, rendering primarily in Redshift with Mantra for some FX.
Related KineFX: Creature FX | Gabriela Salmeron - Framestore | KineFEST 2025 · Stags & Stripes, Creating Photoreal Characters | Framestore | FMX HIVE Europe 2021 · Hair, Feathers and Fur | Axis Studios | Character FX & Crowds Production Talks · Creating a Photorealistic Hyena
How to read this
- Category
- Production talk / pipeline breakdown (groom and CFX)
- Contributions
- Walks through Framestore's fully Houdini-based feather system (Barbarella): feather design from shafts and barbs, curvature profiles, breakups, split tools and a procedural pattern generator.
- Covers fur work (layered sub-clump curling, dynamic parting lines, probability maps for coloration) and per-shot groom variants driven by HDAs.
- Presents CFX efficiency via reusable templates assembling muscle, fat, skin-sliding and fur, with tetrahedron-based flesh sim, REST caches and a beta CFX cache tracker.
- Context
- Relates to muscle/flesh simulation lineage (cf. Teran et al., 2005 on simulating skeletal muscle) and to in-house feather/groom systems, applied at episodic and advertising scale. Builds on: Creating and Simulating Skeletal Muscle from the Visible Human Data Set
- Correctness
- Studio practice, not peer-reviewed; the workflows are production-proven on real shows but tooling is specific to Framestore's Houdini/Redshift/Mantra pipeline and may not transfer directly.
- Clarity
- Accessible and demo-driven; a single viewing conveys the workflow, no formal derivations to revisit.
- How to read it
- Watch for the reusable-template and cache-tracking ideas if you run a CFX pipeline; treat specific tool names as inspiration rather than a spec, and note which tricks are feather-specific versus general groom.
CFX / Muscles
-
Omni-directional rig for Strange World's alien Splat allowing limbs to serve as both arms and legs, with a versatile setup supporting unanticipated character poses.
Abstract
The physical and kinetic complexities presented by the character Splat in Walt Disney Animation Studios’ recent film "Strange World" are unlike anything the studio typically encounters from a rigging perspective. Splat’s unique physiology - an omni-directional character setup where up could become down and arms and legs were completely interchangeable - presented many unique obstacles for the character teams. In order to meet the challenges presented by the scope of Splat, new rigging systems were created, enabling limbs to be used for a large variety of situations, as well as an overall versatility of the entire rig to help support unanticipated uses in the film.
Related A.C.M.E. Multilimb System · Sliding the Pieces into Place: Rigging the Pigeons of Spies in Disguise · Abstracting Rigging Concepts for a Future Proof Framework Design · Stable and Efficient Differential IK
How to read this
- Category
- Production paper / rigging case study
- Contributions
- An omni-directional character rig for Splat in 'Strange World' where up/down are interchangeable and limbs can act as either arms or legs.
- New rigging systems letting limbs serve a wide variety of functions.
- An overall versatile setup designed to support unanticipated poses and uses across the film.
- Context
- Relates to freeform/flexible production rigging approaches (cf. Hunt et al., 2020 on freeform animation rigging) applied to an unusually non-standard creature physiology. Builds on: Technical Artist Summit: Freeform Animation Rigging: Evolving the Animation Pipeline
- Correctness
- Production case study from Walt Disney Animation Studios; results are proven on a shipped film but the rig is bespoke to Splat's physiology, so generality is limited and quantitative evaluation is not the point.
- Clarity
- Accessible to riggers; a first pass conveys the problem and design intent, a second pass helps for the specific rig-system mechanics.
- How to read it
- Read once for the conceptual problem (interchangeable limbs, symmetry of up/down) and design tradeoffs; revisit only if you face a similarly non-anthropomorphic rig and want to borrow the system structure.
Rigging
-
Hybrid real-time hair simulation for games, avatar streaming, and metaverse applications, pairing semi-implicit Discrete Elastic Rods for strands with an explicit Material Point Method for inter-strand collisions.
Abstract
This hybrid physics-based method targets real-time hair simulation for games, avatar streaming, and metaverse applications. It treats inter-strand collisions as a means to preserve overall hair volume, resolving them with an explicit Material Point Method while simulating individual strands with a semi-implicit Discrete Elastic Rods model. The GPU pipeline reaches up to 260 frames per second with more than two thousand simulated strands on an Nvidia GeForce RTX 3080.
Related Adaptive Nonlinearity for Collisions in Complex Rod Assemblies · Wrinkle Meshes · Interactive Hair Simulation on the GPU Using ADMM · Loki: A Unified Multiphysics Simulation Framework for Production
How to read this
- Category
- Method: real-time hair simulation on GPU (hybrid physics)
- Contributions
- A hybrid scheme that simulates individual strands with a semi-implicit Discrete Elastic Rods model.
- Treats inter-strand collisions as a volume-preserving process resolved with an explicit Material Point Method.
- A GPU pipeline reported to reach up to 260 FPS with more than two thousand simulated strands on an RTX 3080.
- Context
- Builds on continuum/MPM hair simulation (cf. McAdams et al., 2009 on detail-preserving continuum simulation of straight hair) combined with Discrete Elastic Rods, targeting games, avatar streaming and metaverse use. Builds on: Detail-Preserving Continuum Simulation of Straight Hair
- Correctness
- Performance is demonstrated on specific hardware (RTX 3080) at a given strand count; the volume-preservation framing for collisions is an approximation, and behavior outside the tested regime (denser hair, other GPUs) is not claimed.
- Clarity
- Moderately technical; a first pass conveys the MPM-plus-DER split, a second pass is needed for the coupling and the semi-implicit integration.
- How to read it
- Read for the architecture (which solver handles strands vs collisions) first; do a second pass on the GPU pipeline and time-stepping if real-time performance is your goal.
CFX
-
Interactive framework that unifies skinning transformations and kinematic simulation through position-based dynamics: the user partially manipulates an arbitrarily skinned character while a kinematic solver completes the rest of the motion.
Abstract
This interactive framework unifies skinning transformations and kinematic simulation using position-based dynamics, letting an arbitrarily skinned character be partially manipulated by the user while a kinematic solver automatically complements the motion of the whole character. It adds two steps to the PBD algorithm: a lightweight optimization that identifies skinning transformations similar to inverse kinematics, and a position-based constraint that restricts the solver to the complementary subspace of the skinning deformation. The result combines the controllability and shape preservation of skinning with the efficiency, simplicity, and unconditional stability of PBD.
Related Cloth and Skin Deformation with a Triangle Mesh Based Convolutional Neural Network · Rig-Space Physics · Fast Complementary Dynamics via Skinning Eigenmodes · Wrinkle Meshes
How to read this
- Category
- Method: interactive deformation (skinning coupled with PBD)
- Contributions
- An interactive framework that unifies skinning transformations with kinematic simulation via position-based dynamics.
- A lightweight optimization that identifies skinning transformations in an IK-like manner, plus a position-based constraint confining the solver to the complementary subspace of the skinning deformation.
- Lets a user partially manipulate an arbitrarily skinned character while the solver automatically completes the rest of the motion.
- Context
- Builds directly on Position Based Dynamics (Müller et al., 2007), adding two steps to couple artist-style skinning control with PBD's stability. Builds on: Position Based Dynamics
- Correctness
- Demonstrated as an interactive framework; it inherits PBD's well-known unconditional stability and simplicity, but PBD stiffness is iteration- and timestep-dependent, so 'physical accuracy' is not the claim and behavior depends on the two added steps being well-tuned.
- Clarity
- Accessible if you know PBD; a first pass conveys the two-way coupling idea, a second pass is needed for the subspace constraint formulation.
- How to read it
- Focus on the two added PBD steps and what 'complementary subspace' means; a second pass on the optimization and constraint math pays off if you want to implement the coupling.
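For readers new to PBD, the remark that its stiffness depends on iteration count can be seen with a single distance constraint. The sketch below is the generic projection step from Position Based Dynamics, not the paper's two added steps; the function names and the two-particle setup are assumptions made for illustration.

```python
import math

def project_distance(p, q, rest_length, stiffness=1.0, w_p=1.0, w_q=1.0):
    """One PBD projection of a distance constraint between particles p and q.

    w_p and w_q are inverse masses (0 pins a particle); returns new positions.
    """
    delta = [b - a for a, b in zip(p, q)]
    dist = math.sqrt(sum(d * d for d in delta))
    if dist == 0.0 or w_p + w_q == 0.0:
        return list(p), list(q)
    # Constraint value C = |q - p| - rest_length; correct along its gradient.
    scale = stiffness * (dist - rest_length) / (dist * (w_p + w_q))
    new_p = [a + w_p * scale * d for a, d in zip(p, delta)]
    new_q = [b - w_q * scale * d for b, d in zip(q, delta)]
    return new_p, new_q

def residual_after(iterations, stiffness):
    """Remaining stretch of a pair that starts at twice its rest length."""
    p, q = [0.0, 0.0, 0.0], [2.0, 0.0, 0.0]
    for _ in range(iterations):
        p, q = project_distance(p, q, rest_length=1.0, stiffness=stiffness)
    return math.dist(p, q) - 1.0

# Same stiffness value, different solver iteration counts.
soft = residual_after(1, stiffness=0.5)
stiff = residual_after(4, stiffness=0.5)
```

Each projection removes a fixed fraction of the remaining error, so the same stiffness value behaves as a stiffer material when the solver runs more iterations per step.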
Skinning / CFX
-
Multimodal studio course covering USD asset structures, Maya and Houdini API integration, and context-driven production workflows from Animal Logic, Autodesk, Pixar, SideFX, Weta Digital, and WDAS.
Rigging
2022
71
-
Production pipeline covering offline blendshape generation from facial scans, sticky-lips correction, and render-time dynamic skin microstructure for digital humans.
Abstract
Digital humans and digital doubles are core products at Goodbye Kansas Studios and in order to further improve their quality and related workflows, a number of new tools have been developed and integrated into our existing pipeline. This talk will cover some of these implementations, such as an offline generation of blendshapes based on facial scans, our take on the sticky lips problem and an efficient implementation of render time dynamic skin microstructure. Furthermore, we look into future usages and how Universal Scene Description (USD) [Pixar Animation Studios 2021] can serve as a helpful tool for facial rigs.
Related Smooth Contact-Aware Facial Blendshapes Transfer · Direct Manipulation Blendshapes · Animating Facial Expressions · Transferring the Rig and Animations from a Character to Different Face Models
How to read this
- Category
- Production talk / digital-human pipeline breakdown
- Contributions
- Offline blendshape generation derived from facial scans
- A studio take on the sticky-lips problem and render-time dynamic skin microstructure
- Exploration of USD as a carrier for facial rigs
- Context
- A practical digital-double pipeline talk in the lineage of photoreal-actor work such as The Digital Emily Project (Alexander et al. 2010), tying scan-based shapes into an existing production workflow. Builds on: The Digital Emily Project: Achieving a Photorealistic Digital Actor
- Correctness
- Studio practice rather than peer-reviewed research; the tools are production-proven inside Goodbye Kansas's pipeline, so treat specifics as integration choices for their stack rather than generally validated methods.
- Clarity
- Accessible overview; a single first pass conveys what was built and why, with little formalism to revisit.
- How to read it
- Read once for the workflow shape (scan-to-blendshape, sticky lips, microstructure, USD); skim for ideas to port, and consult the Digital Emily reference if you want the photoreal-skin grounding.
Facial / Skinning
-
How The Sims 4 animates fully customizable characters and objects: IK retargeting across body shapes, block modeling, and content verification with the SAGE art graph editor.
Retargeting / Rigging
- talk Animation Summit: FIFA 22's Hypermotion: Full-Match Mocap Driving Machine Learning Technology GDC Industrial
EA Sports describes capturing a 90-minute 11v11 professional soccer match with Xsens suits to feed the ML Flow machine learning procedural animation system, overcoming traditional optical capture constraints.
Motion Synthesis / Retargeting
- talk Animation Summit: The Facial Animation Pipeline of 'Marvel's Guardians of The Galaxy' GDC Industrial
Eidos Montreal used in-house mocap and Faceware batch processing for five main characters, with photogrammetry scanning ensuring anatomically correct expressions at game production volume and quality.
Facial / Retargeting
- Animatomy: An Animator-Centric, Anatomically Inspired System for 3D Facial Modeling, Animation and Transfer SIGGRAPH Asia Weta FX 32 cites
Replaces FACS shapes with a muscle-strain basis for the face, the system behind Weta's Avatar 2 facial pipeline.
abstract
We present Animatomy, a novel anatomic+animator centric representation of the human face. Present FACS-based systems are plagued with problems of face muscle separation, coverage, opposition, and redundancy. We, therefore, propose a collection of muscle fiber curves as an anatomic basis, whose contraction and relaxation provide us with a fine-grained parameterization of human facial expression. We build an end-to-end modular deformation architecture using this representation that enables: automatic optimization of the parameters of a specific face from high-quality dynamic facial scans; face animation driven by performance capture, keyframes, or dynamic simulation; interactive and direct manipulation of facial expression; and animation transfer from an actor to a character. We validate our facial system by showing compelling animated results, applications, and a quantitative comparison of our facial reconstruction to ground truth performance capture. Our system is being intensively used by a large creative team on Avatar: The Way of Water. We report feedback from these users as qualitative evaluation of our system.
Related Animating Facial Expressions · Learning an Animatable Detailed 3D Face Model from In-The-Wild Images · Anatomically Constrained Implicit Face Models · Lessons from the Evolution of an Anatomical Facial Muscle Model
how to read this
- Category
- Method: an anatomic, animator-centric facial representation and deformation system
- Contributions
-
- A muscle-fiber-curve basis whose contraction/relaxation parameterizes facial expression, addressing FACS muscle separation, coverage, opposition and redundancy
- An end-to-end modular deformation architecture supporting per-face parameter optimization from dynamic scans, performance/keyframe/simulation-driven animation, direct manipulation and actor-to-character transfer
- Quantitative comparison of facial reconstruction to ground-truth performance capture plus production-team feedback
- Context
- Reframes the FACS action-unit paradigm (Ekman & Friesen 1978) on an anatomical footing related to Anatomy Transfer (Dicko et al. 2013), substituting muscle-strain fibers for blendshape action units. Builds on: Facial Action Coding System · Anatomy Transfer
- Correctness
- Validated by a quantitative reconstruction comparison to captured ground truth and by qualitative feedback from the Avatar: The Way of Water team; note that 'intensively used in production' is real-world evidence but evaluation leans on one large project's context.
- Clarity
- Conceptually clear at the idea level; the fiber-curve parameterization and deformation optimization reward a careful second pass.
- How to read it
- First pass for the muscle-fiber-versus-FACS argument and the architecture diagram; do a second pass on the parameterization and optimization if you intend to implement or compare against it.
Facial / Muscles
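The muscle-strain idea in the Animatomy entry above can be illustrated with a small sketch: treat a fiber as a polyline and measure how much its length changes relative to rest. This is only an assumption about what "strain" means here (relative length change); the paper's exact definition may differ, and the function names `curve_length` and `fiber_strain` are hypothetical.

```python
import math

def curve_length(points):
    """Total length of a polyline given as a list of 3D points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def fiber_strain(rest_points, posed_points):
    """Relative length change of a fiber curve: negative when contracted,
    positive when stretched. Assumed definition, for illustration only."""
    rest = curve_length(rest_points)
    if rest == 0.0:
        raise ValueError("rest curve has zero length")
    return (curve_length(posed_points) - rest) / rest

# A straight fiber that shortens from length 2.0 to length 1.8.
rest_curve = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
posed_curve = [(0.0, 0.0, 0.0), (0.9, 0.0, 0.0), (1.8, 0.0, 0.0)]
strain = fiber_strain(rest_curve, posed_curve)
```

One scalar per fiber gives a compact expression parameterization, which is the contrast with per-shape FACS weights that the entry describes.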
- ASE: Large-Scale Reusable Adversarial Skill Embeddings for Physically Simulated Characters SIGGRAPH Academic 14 cites
Large-scale adversarial skill embedding space for physically simulated characters enabling diverse and composable motion skill reuse.
abstract
The incredible feats of athleticism demonstrated by humans are made possible in part by a vast repertoire of general-purpose motor skills, acquired through years of practice and experience. These skills not only enable humans to perform complex tasks, but also provide powerful priors for guiding their behaviors when learning new tasks. This is in stark contrast to what is common practice in physics-based character animation, where control policies are most typically trained from scratch for each task. In this work, we present a large-scale data-driven framework for learning versatile and reusable skill embeddings for physically simulated characters. Our approach combines techniques from adversarial imitation learning and unsupervised reinforcement learning to develop skill embeddings that produce life-like behaviors, while also providing an easy to control representation for use on new downstream tasks. Our models can be trained using large datasets of unstructured motion clips, without requiring any task-specific annotation or segmentation of the motion data. By leveraging a massively parallel GPU-based simulator, we are able to train skill embeddings using over a decade of simulated experiences, enabling our model to learn a rich and versatile repertoire of skills. We show that a single pre-trained model can be effectively applied to perform a diverse set of new tasks.
Related CALM: Conditional Adversarial Latent Models for Directable Virtual Characters · AMP: Adversarial Motion Priors for Stylized Physics-Based Character Control · C·ASE: Learning Conditional Adversarial Skill Embeddings for Physics-based Characters · SuperPADL: Scaling Language-Directed Physics-Based Control with Progressive Supervised Distillation
how to read this
- Category
- Method: large-scale reusable skill embeddings for physics-based character control
- Contributions
-
- A data-driven framework that learns versatile, reusable skill embeddings for physically simulated characters
- Combines adversarial imitation learning with unsupervised reinforcement learning to produce life-like, controllable behaviors
- Trains from large unstructured motion-clip datasets with no task-specific annotation or segmentation, leveraging massively parallel GPU simulation
- Context
- Extends adversarial-motion-prior control (AMP, Peng et al. 2021) from per-task imitation toward a pre-trained, composable skill latent space reused across downstream tasks. Builds on: AMP: Adversarial Motion Priors for Stylized Physics-Based Character Control
- Correctness
- The premise is that an unsupervised skill space learned from unstructured clips transfers to new tasks. This is demonstrated only in simulation, using a GPU-parallel simulator, so the caveats are sim-only results and dependence on motion-data coverage and on reward/adversary tuning.
- Clarity
- Idea is graspable on a first pass; the adversarial plus unsupervised-RL objective and the embedding/encoder design need a slower read.
- How to read it
- First pass for the pretrain-skills-then-reuse idea and how it differs from AMP; second pass on the training objectives if you plan to build downstream controllers on top.
Motion Synthesis
-
Bacon X Co-head of CG shows how Houdini's FEM and Vellum solvers automate reliable skin and fur simulation for a fully digital baby wild hog creature in a small-studio pipeline.
abstract
David Lessel of Copenhagen studio Bacon X walks through the CFX setup built for a fully CG baby wild hog commercial across 25 shots, combining Maya, Yeti and V-Ray for asset development, animation and lighting with Houdini's FEM and Vellum solvers for skin and fur. He details a single reusable Houdini setup driven by a shot manager that pulls animation, guides and collision caches from Shotgun via Alembic, plus a rest-transform trick using bounding-box polylines so the rest pose lines up with any incoming animation. The FEM workflow remeshes and tetrahedralizes the high-res mesh, constrains skeleton points with hard FEM target constraints, paints stiffness and damping via group and attribute-blur nodes, then transfers the sim back to the subdivided Maya mesh and masks out face, ears, tail and hooves. He covers the Vellum guide-curve simulation for stiff fur with disable-self and disable-external collision attributes, optional self and environment collisions, and an in-house Cache Fire tool that daisy-chains farm jobs through Royal Render to update rigs, animation, Houdini sims, Yeti caches, lighting and slap comps overnight.
Related Creating a Photorealistic Hyena · Creatures in Houdini | Ahmed Gharraph | FMX 2019 · Framestore Creatures & Houdini | Framestore | Character FX & Crowds Production Talks · Simulating the Perfect Groom for a Bovine Biker | Untold Studios | FMX HIVE 2023
how to read this
- Category
- Production talk / creature-FX automation breakdown
- Contributions
-
- A single reusable Houdini setup driven by a shot manager that pulls animation, guides and collision caches from Shotgun via Alembic, with a bounding-box-polyline rest-transform trick to align the rest pose to incoming animation
- An FEM skin workflow that remeshes and tetrahedralizes the mesh, constrains skeleton points with hard target constraints, paints stiffness/damping, and transfers the sim back to the subdivided Maya mesh while masking face, ears, tail and hooves
- A Vellum guide-curve fur sim with self/external collision controls plus an in-house Cache Fire tool that daisy-chains farm jobs through Royal Render to update the full chain overnight
- Context
- A small-studio creature-FX pipeline combining Maya/Yeti/V-Ray with Houdini FEM and Vellum solvers, situated in the standard tet-FEM-flesh and position-based-cloth/fur simulation tradition.
- Correctness
- Studio practice, not peer-reviewed; results are production-proven on a 25-shot CG baby wild hog commercial, so the value is reliable workflow recipes rather than generalizable claims, and choices are scaled to a small team.
- Clarity
- Highly concrete and accessible; a single pass conveys the setup, with node-level detail useful as a reference.
- How to read it
- Read once end-to-end for the reusable-setup and overnight-farm automation patterns; revisit the rest-transform trick and FEM constraint/masking specifics if you are building a comparable small-studio CFX rig.
CFX
- Building Scalable and Evolutive USD Pipelines on Distributed Architecture at Ubisoft DigiPro Ubisoft 1 cites
Ubisoft deploys multi-site USD pipelines using BPMN workflow notation and microservices, enabling scalable character and asset interchange across distributed game studios.
abstract
This paper presents how we built scalable and evolutive USD pipelines on distributed architecture at Ubisoft. We use BPMN as a nodal representation to allow our supervisors to build new or modify existing workflows. Our processes are designed using industry standards and USD file format for interchangeability and are easily scalable and ready to deploy to our multi-site studios and teams. Using Microservices running on our internal cloud computing infrastructure and their language-neutrality, we can leverage existing in-house and new technologies developed on multiple platforms by our teams worldwide.
Related A Deep Dive into Universal Scene Description and Hydra · Sony Imageworks Animation Layout Workflow with Unreal Engine and OpenUSD · USD and Scene Interoperability: Demystifying the State of the Art · USD in Production
how to read this
- Category
- Pipeline / systems paper: scalable distributed USD architecture
- Contributions
-
- Scalable, evolutive USD pipelines built on a distributed architecture for multi-site studios
- BPMN used as a nodal representation so supervisors can build or modify workflows directly
- Language-neutral microservices on internal cloud infrastructure to integrate in-house and new technologies across worldwide teams
- Context
- Builds on the open USD interchange format (Pixar 2016), applying it as the backbone for cross-studio asset and character exchange in a games-production setting. Builds on: Universal Scene Description: Open Source Release
- Correctness
- Described as deployed at Ubisoft scale; as an industrial systems paper the evidence is architectural and experiential rather than benchmarked, so read it for design rationale and keep in mind it reflects one organization's infrastructure and constraints.
- Clarity
- Accessible at the architecture level; a first pass conveys the BPMN-plus-microservices-over-USD design without heavy formalism.
- How to read it
- First pass for the architecture (BPMN authoring, microservices, USD interchange) and the scaling rationale; revisit only the parts relevant to your own multi-site or USD-adoption decisions.
Rigging
-
Unit Image CFX supervisors trace the evolution of their cloth and hair simulation department, covering Houdini-based character FX workflows across multiple cinematic projects.
abstract
Unit Image CFX supervisors trace their department's move from a Maya plus Marvelous Designer and Ornatrix pipeline to a unified Houdini and Vellum workflow, sharing a common input/solver/output setup hierarchy, resource manager and exporter between cloth and hair. For cloth, they show a custom group maker and group split that color-code and order garment meshes from body-out, wrap high-res render meshes onto simulated low-res, and expose tweak nodes for per-shot edits. For hair, they build grooms with perimeter-based groups and low/high guide variations, then drive them by three methods: full Vellum simulation with per-group bend stiffness, mass, glue and pin settings against cleaned VDB collision meshes; noise-based false drag and turbulence/curl for short hair and fur; and a wrap method that deforms hair onto animation-provided proxy geometry such as braids. Examples are drawn from Love Death and Robots Season 2 episode Snow in the Desert, plus God of War and other game-cinematic projects, with recurring root-recovery ramps to preserve scalp UVs.
Related Designing Feathers Using Houdini at FOLKS | Amelie Goursat | Paris HIVE 2023 · Framestore Creatures & Houdini | Framestore | Character FX & Crowds Production Talks · Automation of Creature FX in a Small Studio Pipeline · Grooming and Simulation Methods for Different Hair Types | Andriy Bilichenko | Paris HIVE 2023
how to read this
- Category
- Production talk / cloth and hair CFX department breakdown
- Contributions
-
- Department evolution from a Maya plus Marvelous Designer and Ornatrix pipeline to a unified Houdini/Vellum workflow with a shared input/solver/output hierarchy, resource manager and exporter
- A cloth workflow with a custom group maker and group split that color-code and order garments body-out, wrap high-res render meshes onto simulated low-res, and expose per-shot tweak nodes
- Three hair-driving methods (full Vellum sim with per-group settings against cleaned VDB collision meshes, noise-based false drag/turbulence/curl, and a wrap method onto proxy geometry), with root-recovery ramps to preserve scalp UVs
- Context
- A character-FX cloth-and-hair pipeline grounded in position-based / compliant-constraint dynamics (XPBD, Macklin et al. 2016) as realized through Houdini's Vellum solver. Builds on: XPBD: Position-Based Simulation of Compliant Constrained Dynamics
- Correctness
- Studio practice, not peer-reviewed; results are production-proven on cinematics including Love Death and Robots Season 2 and game projects, so take it as transferable workflow structure rather than validated technique.
- Clarity
- Concrete and accessible; one pass conveys the shared setup and the three hair methods, with node-level details useful for reference.
- How to read it
- Read once for the unified input/solver/output structure and the three hair-driving strategies; revisit the group/wrap and root-recovery details if you are standardizing a cloth-and-hair department.
CFX
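The compliant-constraint dynamics cited in the Context note above can be illustrated with a single XPBD distance constraint between two particles. This is a minimal sketch of the published XPBD update rule, not Vellum's implementation; the function name `xpbd_distance_step` and its parameters are hypothetical, and an inverse mass of zero is used here to stand in for a pinned hair root.

```python
import math

def xpbd_distance_step(p0, p1, w0, w1, rest, compliance, dt, lam=0.0):
    """One XPBD projection of a 2D distance constraint C = |p1 - p0| - rest.

    w0, w1 are inverse masses (0 pins a particle); lam is the accumulated
    Lagrange multiplier for this constraint within the current time step.
    """
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]
    length = math.hypot(dx, dy)
    if length == 0.0 or (w0 + w1) == 0.0:
        return p0, p1, lam  # degenerate or fully pinned: nothing to do
    c = length - rest
    alpha = compliance / (dt * dt)  # time-step-scaled compliance
    dlam = (-c - alpha * lam) / (w0 + w1 + alpha)
    nx, ny = dx / length, dy / length  # gradient of C w.r.t. p1 (negated for p0)
    new_p0 = (p0[0] - w0 * dlam * nx, p0[1] - w0 * dlam * ny)
    new_p1 = (p1[0] + w1 * dlam * nx, p1[1] + w1 * dlam * ny)
    return new_p0, new_p1, lam + dlam

# Pinned root at the origin, free tip stretched to twice the rest length.
root, tip = (0.0, 0.0), (2.0, 0.0)
_, stiff_tip, _ = xpbd_distance_step(root, tip, 0.0, 1.0, 1.0, 0.0, 1.0 / 60.0)
_, soft_tip, _ = xpbd_distance_step(root, tip, 0.0, 1.0, 1.0, 1e-3, 1.0 / 60.0)
```

With zero compliance the tip snaps straight back to the rest length; a non-zero compliance leaves part of the stretch in place, which is the kind of per-group stiffness control the talk describes tuning.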
-
Spline-based rigging system where artists draw sparse curvenets that profile the surface, producing detail-preserving deformations used in Turning Red.
abstract
Computer animation relies heavily on rigging setups that articulate character surfaces through a broad range of poses. Although many deformation strategies have been proposed over the years, constructing character rigs is still a cumbersome process that involves repetitive authoring of point weights and corrective sculpts with limited and indirect shaping controls. This paper presents a new approach for character articulation that produces detail-preserving deformations fully controlled by 3D curves that profile the deforming surface. Our method starts with a spline-based rigging system in which artists can draw and articulate sparse curvenets that describe surface profiles. By analyzing the layout of the rigged curvenets, we quantify the deformation along each curve side independent of the mesh connectivity, thus separating the articulation controllers from the underlying surface representation. To propagate the curvenet articulation over the character surface, we formulate a deformation optimization that reconstructs surface details while conforming to the rigged curvenets. In this process, we introduce a cut-cell algorithm that binds the curvenet to the surface mesh by cutting mesh elements into smaller polygons possibly with cracks, and then derive a cut-aware numerical discretization that provides harmonic interpolations with curve discontinuities.
Related Harmonic Coordinates for Character Articulation · Mesh-Based Inverse Kinematics · Avatar Reshaping and Automatic Rigging Using a Deformable Model · Shaping the Elements: Curvenet Animation Controls in Pixar's Elemental
how to read this
- Category
- Method: a curve-based character articulation and deformation system
- Contributions
-
- Detail-preserving deformations fully controlled by 3D curves that profile the deforming surface, via a spline-based rigging system of sparse articulated curvenets
- Quantifying deformation along each curve side independent of mesh connectivity, decoupling articulation controllers from the surface representation
- A deformation optimization that reconstructs surface detail while conforming to the rigged curvenets, with a cut-cell algorithm binding the curvenet to the mesh
- Context
- Continues the authors' rigging line including Sculpt Processing for Character Rigging (de Goes et al. 2020), shifting from point-weight and corrective-sculpt authoring toward direct profile-curve control. Builds on: Sculpt Processing for Character Rigging
- Correctness
- The key assumption is that sparse profile curves can capture and propagate desired deformation while staying mesh-independent; demonstrated as a production-grade rig (used on Turning Red), with the usual caveat that authoring effort and behavior on extreme or unanticipated poses warrant scrutiny.
- Clarity
- Motivation and controls are intuitive; the curvenet analysis, deformation optimization and cut-cell binding need a careful second pass.
- How to read it
- First pass for the profile-curve control idea and how it replaces weight painting; second pass on the optimization and cut-cell algorithm if you want the formulation or to compare deformers.
Rigging / Skinning
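The harmonic interpolation mentioned in the curvenet abstract above can be sketched on a plain graph: values are fixed at constrained vertices (standing in for curve samples) and every free vertex relaxes to the average of its neighbours. This is only an illustration of discrete harmonic interpolation with uniform weights; it omits the paper's cut-cell binding and cut-aware discretization, and `harmonic_interpolate` is a hypothetical name.

```python
def harmonic_interpolate(neighbors, fixed, iters=500):
    """Gauss-Seidel relaxation of a uniform-weight graph Laplacian.

    neighbors: dict mapping vertex -> list of adjacent vertices
    fixed:     dict mapping constrained vertex -> prescribed value
    """
    values = {v: fixed.get(v, 0.0) for v in neighbors}
    for _ in range(iters):
        for v, nbrs in neighbors.items():
            if v not in fixed:
                values[v] = sum(values[u] for u in nbrs) / len(nbrs)
    return values

# A 5-vertex path with both ends constrained.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
result = harmonic_interpolate(path, {0: 0.0, 4: 1.0})
```

On this path the free vertices settle to a linear ramp between the two constrained ends, the simplest case of the smooth propagation the method relies on.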
- ControlVAE: Model-Based Learning of Generative Controllers for Physics-Based Characters SIGGRAPH Asia Academic 63 cites
VAE-based framework learns a latent skill space from unstructured mocap and trains a generative physics-based motion controller.
abstract
In this paper, we introduce ControlVAE, a novel model-based framework for learning generative motion control policies based on variational autoencoders (VAE). Our framework can learn a rich and flexible latent representation of skills and a skill-conditioned generative control policy from a diverse set of unorganized motion sequences, which enables the generation of realistic human behaviors by sampling in the latent space and allows high-level control policies to reuse the learned skills to accomplish a variety of downstream tasks. In the training of ControlVAE, we employ a learnable world model to realize direct supervision of the latent space and the control policy. This world model effectively captures the unknown dynamics of the simulation system, enabling efficient model-based learning of high-level downstream tasks. We also learn a state-conditional prior distribution in the VAE-based generative control policy, which generates a skill embedding that outperforms the non-conditional priors in downstream tasks. We demonstrate the effectiveness of ControlVAE using a diverse set of tasks, which allows realistic and interactive control of the simulated characters.
Related MuscleVAE: Model-Based Controllers of Muscle-Actuated Characters · Character Controllers Using Motion VAEs · SFV: Reinforcement Learning of Physical Skills from Video · PADL: Language-Directed Physics-Based Character Control
how to read this
- Category
- Method: model-based VAE framework for physics-based motion control
- Contributions
-
- A VAE framework that learns a rich latent skill representation and a skill-conditioned generative control policy from unorganized motion sequences
- A learnable world model that supervises the latent space and policy, enabling efficient model-based learning of downstream tasks
- A state-conditional prior in the generative policy that yields skill embeddings outperforming non-conditional priors on downstream tasks
- Context
- Builds on latent-skill motion modeling such as Character Controllers Using Motion VAEs (Ling et al. 2020), adding a learned world model for direct latent/policy supervision in a physics-based setting. Builds on: Character Controllers Using Motion VAEs
- Correctness
- The premise is that a learned world model can stand in for unknown simulation dynamics to supervise the latent space; it is demonstrated across a diverse set of simulated tasks, so the caveats are reliance on world-model fidelity and evaluation limited to interactive simulation.
- Clarity
- The high-level VAE-plus-world-model story is followable; the model-based supervision and conditional prior reward a second pass.
- How to read it
- First pass for how the world model supervises skills versus a plain motion VAE; second pass on the training scheme and state-conditional prior if you plan to reuse or extend the controller.
Motion Synthesis
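The state-conditional prior in the ControlVAE entry above can be illustrated through the KL term of a VAE objective. The sketch below uses the standard closed-form KL divergence between diagonal Gaussians; `toy_conditional_prior_mean` is a hypothetical stand-in for a learned prior network, and the paper's actual objective and parameterization may differ.

```python
import math

def kl_diag_gaussians(mu_q, sigma_q, mu_p, sigma_p):
    """KL(q || p) for diagonal Gaussians, given per-dimension means and std devs."""
    total = 0.0
    for mq, sq, mp, sp in zip(mu_q, sigma_q, mu_p, sigma_p):
        total += math.log(sp / sq) + (sq * sq + (mq - mp) ** 2) / (2.0 * sp * sp) - 0.5
    return total

def toy_conditional_prior_mean(state):
    # Hypothetical prior: the mean of the skill latent depends on the character state.
    return [0.5 * s for s in state]

state = [2.0, -1.0]
posterior_mu, sigma = [1.1, -0.4], [1.0, 1.0]

# Fixed standard-normal prior versus a prior whose mean follows the state.
kl_fixed = kl_diag_gaussians(posterior_mu, sigma, [0.0, 0.0], sigma)
kl_conditional = kl_diag_gaussians(
    posterior_mu, sigma, toy_conditional_prior_mean(state), sigma
)
```

When the prior mean tracks the state, the posterior needs to deviate less from it, so sampling from the prior alone yields latents that are already plausible for the current state.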
-
The Mill FX Supervisor walks through grooming, muscle and skin creation, and Vellum-based fur simulation in Houdini to produce Hattie the Hyena for an Amazon holiday commercial.
abstract
The Mill London FX supervisor Tony Atherton details the creation of Hattie the hyena for an Amazon Prime Christmas advert, all done in Houdini and rendered in Arnold via HtoA. He covers grooming with guide curves and the hair generate node, driving length and variation from painted attribute maps and a Labs curvature measure, plus micro grooming controls like clumping and frizz. The muscle pipeline uses the Houdini 19 beta muscle tools with anatomically correct bones and muscles, fiber groom contraction painting, automatic muscle tension lines, and a Vellum-based solver, layered into successive tissue (tetrahedral) and skin simulations. He also shows fur simulation in Vellum with guide-to-form sticking and resampling, fake tips and tricks using lag CHOPs and delta mush to add bounce, wind and collisions, and a TOPs-based scene builder for templated shot lighting and assembly.
Related Simulating the Perfect Groom for a Bovine Biker | Untold Studios | FMX HIVE 2023 · Creatures in Houdini | Ahmed Gharraph | FMX 2019 · Feathers: From Model to Groom to Render | nineteentwenty | Character FX & Crowds Production Talks · Framestore Creatures & Houdini | Framestore | Character FX & Crowds Production Talks
how to read this
- Category
- Production talk / photoreal creature CFX breakdown
- Contributions
-
- A full grooming workflow in Houdini using guide curves and the hair-generate node, driving length and variation from painted maps and a curvature measure, plus micro-grooming controls like clumping and frizz
- A muscle pipeline on the Houdini 19 beta muscle tools (anatomical bones and muscles, fiber-groom contraction painting, automatic tension lines, Vellum solver) layered into successive tissue (tet) and skin simulations
- Vellum fur simulation with guide-to-form sticking and resampling, fake bounce via lag CHOPs and delta mush, wind and collisions, and a TOPs scene builder for templated shot lighting and assembly
- Context
- A photoreal creature pipeline drawing on anatomically based muscle simulation in the tradition of Teran et al. (2005), realized through Houdini's muscle and Vellum tools and rendered in Arnold. Builds on: Creating and Simulating Skeletal Muscle from the Visible Human Data Set
- Correctness
- Studio practice, not peer-reviewed; results are production-proven on a single hyena character (Hattie) for an Amazon commercial, and it relies on beta tooling, so treat it as a concrete recipe rather than a stable or general method.
- Clarity
- Very concrete and accessible; one pass conveys the groom-muscle-tissue-skin-fur stack, with node-level tips useful as reference.
- How to read it
- Read once for the layered anatomical-to-skin-to-fur workflow and the fake-bounce/TOPs tricks; revisit the muscle and Vellum specifics if you are setting up a comparable creature in Houdini.
CFX / Muscles
-
Cinesite CG Supervisor details the concept-driven asset pipeline for The Basilisk and Chernobog creatures, covering shape language, anatomy, and adaptability between concept and final shot.
Rigging
-
Periodic Autoencoder extracts unsupervised spatial-temporal phase features from large unstructured motion datasets, winning SIGGRAPH 2022 Best Paper.
abstract
Learning the spatial-temporal structure of body movements is a fundamental problem for character motion synthesis. In this work, we propose a novel neural network architecture called the Periodic Autoencoder that can learn periodic features from large unstructured motion datasets in an unsupervised manner. The character movements are decomposed into multiple latent channels that capture the non-linear periodicity of different body segments while progressing forward in time. Our method extracts a multi-dimensional phase space from full-body motion data, which effectively clusters animations and produces a manifold in which computed feature distances provide a better similarity measure than in the original motion space to achieve better temporal and spatial alignment. We demonstrate that the learned periodic embedding can significantly help to improve neural motion synthesis in a number of tasks, including diverse locomotion skills, style-based movements, dance motion synthesis from music, synthesis of dribbling motions in football, and motion query for matching poses within large animation databases.
Related Character Controllers Using Motion VAEs · Dog Code: Human to Quadruped Embodiment Using Shared Codebooks · Motion Graphs · Neural Animation Layering for Synthesizing Martial Arts Movements
how to read this
- Category
- Method: neural architecture for unsupervised motion phase manifolds
- Contributions
-
- A Periodic Autoencoder that learns periodic motion features from large unstructured datasets in an unsupervised manner
- Decomposition of movement into multiple latent channels capturing non-linear periodicity of body segments over time, yielding a multi-dimensional phase space
- A learned phase embedding whose feature distances give a better similarity measure than raw motion space, improving temporal/spatial alignment across synthesis tasks
- Context
- Advances the authors' phase-based motion line, from the Neural State Machine (Starke et al. 2019) and Local Motion Phases (Starke et al. 2020), toward fully unsupervised, learned phase extraction. Builds on: Neural State Machine for Character-Scene Interactions · Local Motion Phases for Learning Multi-Contact Character Movements
- Correctness
- The key idea is that an unsupervised periodic embedding aligns and clusters motion better than the original space; it is demonstrated to help several synthesis tasks (locomotion, style, dance-from-music, dribbling, pose matching), with the caveat that gains depend on dataset richness and the chosen number of phase channels.
- Clarity
- Well written and influential (noted as SIGGRAPH 2022 Best Paper); the concept lands on a first pass while the periodic encoder formulation rewards a second.
- How to read it
- First pass for the phase-manifold idea and why phase distance beats motion-space distance; second pass on the Periodic Autoencoder formulation if you intend to use phases as features in your own synthesis pipeline.
Motion Synthesis
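The periodic features described in the Periodic Autoencoder entry above can be illustrated by summarizing one latent curve with an amplitude, frequency, offset and phase. The sketch below uses a plain discrete Fourier transform for all four values; this is an assumed, simplified parameterization for illustration (the paper's own formulation, including how the phase is obtained, may differ), and `periodic_parameters` is a hypothetical name.

```python
import cmath
import math

def periodic_parameters(curve, dt):
    """Summarize a sampled 1D curve as (amplitude, frequency, offset, phase).

    frequency is in cycles per unit of dt; phase is in cycles and is taken
    from the dominant frequency bin (an assumption of this sketch).
    """
    n = len(curve)
    offset = sum(curve) / n
    centered = [x - offset for x in curve]
    ks = range(1, (n - 1) // 2 + 1)  # positive bins strictly below Nyquist
    coeffs = [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(centered))
        for k in ks
    ]
    power = [2.0 * abs(c) ** 2 / n for c in coeffs]
    total = sum(power)
    if total == 0.0:
        return 0.0, 0.0, offset, 0.0  # constant curve: no periodic content
    amplitude = math.sqrt(2.0 * total / n)
    freqs = [k / (n * dt) for k in ks]
    frequency = sum(f * p for f, p in zip(freqs, power)) / total
    dominant = max(range(len(power)), key=power.__getitem__)
    phase = cmath.phase(coeffs[dominant]) / (2.0 * math.pi)
    return amplitude, frequency, offset, phase

# One second sampled at 60 Hz: 0.5 + 2 * cos(2*pi*3*t + pi/2).
samples = [0.5 + 2.0 * math.cos(2.0 * math.pi * 3.0 * t / 60.0 + math.pi / 2.0)
           for t in range(60)]
amp, freq, off, ph = periodic_parameters(samples, 1.0 / 60.0)
```

Stacking such parameters over several latent channels gives the kind of multi-dimensional phase space in which the entry says feature distances align motions better than raw poses.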
-
Incorporates muscle inertia into musculoskeletal simulation via a chain-of-Jacobians formulation enabling gradient-based optimization.
abstract
We propose a simple and practical approach for incorporating the effects of muscle inertia, which has been ignored by previous musculoskeletal simulators in both graphics and biomechanics. We approximate the inertia of the muscle by assuming that muscle mass is distributed along the centerline of the muscle. We express the motion of the musculotendons in terms of the motion of the skeletal joints using a chain of Jacobians, so that at the top level, only the reduced degrees of freedom of the skeleton are used to completely drive both bones and musculotendons. Our approach can handle all commonly used musculotendon path types, including those with multiple path points and wrapping surfaces. For muscle paths involving wrapping surfaces, we use neural networks to model the Jacobians, trained using existing wrapping surface libraries, which allows us to effectively handle the Jacobian discontinuities that occur when musculotendon paths collide with wrapping surfaces. We demonstrate support for higher-order time integrators, complex joints, inverse dynamics, Hill-type muscle models, and differentiability. In the limit, as the muscle mass is reduced to zero, our approach gracefully degrades to traditional simulators without support for muscle inertia. Finally, it is possible to mix and match inertial and non-inertial musculotendons, depending on the application.
Related Generative GaitNet · MuscleVAE: Model-Based Controllers of Muscle-Actuated Characters · Anatomy Transfer · A Neural Network Model for Efficient Musculoskeletal-Driven Skin Deformation
how to read this
- Category
- Method: a differentiable musculoskeletal simulation technique
- Contributions
-
- Incorporates muscle inertia (previously ignored) by distributing muscle mass along the musculotendon centerline
- Expresses musculotendon motion from reduced skeletal degrees of freedom via a chain of Jacobians
- Uses neural networks to model Jacobians for wrapping-surface paths, handling discontinuities, and supports higher-order integrators, inverse dynamics, Hill-type muscles, and differentiability
- Context
- Extends reduced-coordinate musculotendon simulation in the lineage of Sueda et al.'s Musculotendon Simulation for Hand Animation, adding inertial effects and gradient-based optimization. Builds on: Musculotendon Simulation for Hand Animation
- Correctness
- Rests on approximating muscle inertia as mass along the centerline and on neural Jacobians trained from existing wrapping-surface libraries, so accuracy near path-collision discontinuities and the validity of the centerline assumption for bulky muscles are the things to watch; the authors note it degrades gracefully to the massless case.
- Clarity
- Moderately technical; a first pass conveys the inertia-plus-Jacobian idea, but a second pass is needed for the chain-of-Jacobians formulation.
- How to read it
- First pass for why muscle inertia matters and the high-level reduced-coordinate framing; do a second pass on the Jacobian chain and the neural wrapping-surface model if you intend to implement or differentiate through it.
Muscles
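The muscle-inertia idea in the entry above (mass along the centerline, driven by the skeleton's reduced coordinates) can be sketched for one hinge joint and one straight muscle. Lumped masses sit at sample points on the centerline, and each contributes its mass times the squared norm of its Jacobian with respect to the joint angle. The geometry, function names and the finite-difference Jacobian are all assumptions of this sketch; the paper uses its own Jacobian chain and neural models for wrapping surfaces.

```python
import math

def insertion(q, r=0.3):
    """Insertion point on a bone that rotates about the origin by hinge angle q."""
    return (r * math.cos(q), r * math.sin(q))

def muscle_points(q, origin=(0.0, 0.4), n=5):
    """Sample n points on the straight centerline from a fixed origin to the insertion."""
    ix, iy = insertion(q)
    pts = []
    for k in range(n):
        s = (k + 0.5) / n  # normalized position along the centerline
        pts.append(((1 - s) * origin[0] + s * ix, (1 - s) * origin[1] + s * iy))
    return pts

def muscle_inertia(q, muscle_mass, n=5, h=1e-6):
    """Reduced inertia M(q) = sum_i m_i * |dx_i/dq|^2 using central differences."""
    m_i = muscle_mass / n
    plus, minus = muscle_points(q + h, n=n), muscle_points(q - h, n=n)
    total = 0.0
    for (xp, yp), (xm, ym) in zip(plus, minus):
        jx, jy = (xp - xm) / (2 * h), (yp - ym) / (2 * h)
        total += m_i * (jx * jx + jy * jy)
    return total

inertia = muscle_inertia(0.7, 1.0)    # added joint-space inertia from a 1 kg muscle
massless = muscle_inertia(0.7, 0.0)   # zero muscle mass contributes nothing
```

The added term would be summed with the bone's own joint-space inertia, and setting the muscle mass to zero recovers the massless case, mirroring the graceful degradation the abstract describes.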
-
Procedural Houdini workflow modeling individual cloth fibers and binding them to clothing geometry via XGen, supporting embroidery styles for Encanto costumes.
abstract
Walt Disney Animation Studios’ “Encanto” tells the tale of an extraordinary family, the Madrigals, who live in the hidden mountains of Colombia. The garments are an important aspect of the characters’ design and express their individual personalities. Accurate cloth looks have been difficult to achieve with our traditional look development workflow. We present the techniques utilized to create the varied and complex fiber-level cloth features in the film. In order to produce the desired level of geometric detail, we developed a new workflow that procedurally models each cloth fiber in Houdini and then binds the resulting curves to the clothing geometry via Disney’s XGen. We also extended our embroidery workflow to support a wide variety of embroidery types and styles which are exemplified by Mirabel’s outfit and include needle paint, knot, and rope stitching.
Related Creating the Art-Directed Groom for Legend in Disney's Strange World · XGen: Arbitrary Primitive Generator · Hair and Fur in an Evolving Pipeline · Creating Curve-Based Garments with Custom Weave Patterns
how to read this
- Category
- Production talk / workflow breakdown (fiber-level cloth)
- Contributions
- A new workflow that procedurally models each cloth fiber in Houdini and binds the resulting curves to clothing geometry via Disney's XGen
- Extends an embroidery workflow to support varied types and styles including needle paint, knot, and rope stitching
- Demonstrates fiber-level cloth detail on Encanto costumes such as Mirabel's outfit
- Context
- A production look-development pipeline for Walt Disney Animation Studios' Encanto, relating to the studio's cloth and groom tooling lineage (e.g., Tamstorf et al.'s Smoothed Aggregation Multigrid for Cloth Simulation). Builds on: Smoothed Aggregation Multigrid for Cloth Simulation
- Correctness
- Studio practice, not peer-reviewed; results are production-proven on a shipped film, so takeaways are pipeline recipes rather than benchmarked claims, and generality beyond Disney's Houdini/XGen toolset is not asserted.
- Clarity
- Accessible and example-driven; a single read conveys the workflow without heavy formalism.
- How to read it
- Read once for the procedural-fiber and embroidery approach; focus on how curves are bound to garment geometry and which stitch types map to which techniques rather than expecting algorithms.
CFX
-
, ,
Introduces a deep perceptual emotion consistency loss during monocular 3D face reconstruction training, significantly improving the fidelity of reconstructed facial expressions over prior methods.
abstract
As 3D facial avatars become more widely used for communication, it is critical that they faithfully convey emotion. Unfortunately, the best recent methods that regress parametric 3D face models from monocular images are unable to capture the full spectrum of facial expression, such as subtle or extreme emotions. We find the standard reconstruction metrics used for training (landmark reprojection error, photometric error, and face recognition loss) are insufficient to capture high-fidelity expressions. The result is facial geometries that do not match the emotional content of the input image. We address this with EMOCA (EMOtion Capture and Animation), by introducing a novel deep perceptual emotion consistency loss during training, which helps ensure that the reconstructed 3D expression matches the expression depicted in the input image. While EMOCA achieves 3D reconstruction errors that are on par with the current best methods, it significantly outperforms them in terms of the quality of the reconstructed expression and the perceived emotional content. We also directly regress levels of valence and arousal and classify basic expressions from the estimated 3D face parameters. On the task of in-the-wild emotion recognition, our purely geometric approach is on par with the best image-based methods, highlighting the value of 3D geometry in analyzing human behavior.
Related SPARK: Self-supervised Personalized Real-time Monocular Face Capture · Learning an Animatable Detailed 3D Face Model from In-The-Wild Images · I M Avatar: Implicit Morphable Head Avatars from Videos · Towards Metrical Reconstruction of Human Faces
how to read this
- Category
- Method: monocular 3D face reconstruction with an emotion-aware training loss
- Contributions
- Introduces a deep perceptual emotion consistency loss so the reconstructed 3D expression matches the emotion in the input image
- Achieves geometric reconstruction error on par with prior best methods while improving perceived expression fidelity
- Additionally regresses valence and arousal and classifies basic expressions from the estimated 3D face parameters
- Context
- Builds on regression-based parametric face capture, extending the in-the-wild detailed-model line of Feng et al.'s DECA with an emotion-driven supervision signal. Builds on: Learning an Animatable Detailed 3D Face Model from In-The-Wild Images
- Correctness
- The central claim is that standard metrics (landmark, photometric, recognition) under-capture expression and that an emotion-consistency loss helps; this is validated largely through reconstruction error parity plus perceptual/emotion measures, so the gain is in perceived expressiveness rather than lower geometric error, and it depends on the quality of the emotion network used for supervision.
- Clarity
- Accessible; a first pass conveys the motivation and the loss idea, with a second pass for the network and training details.
- How to read it
- First pass for the insight that reconstruction metrics miss emotion and how the perceptual loss fixes it; a second pass pays off mainly if you need the training setup or the valence/arousal regression head.
Facial
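To make the emotion consistency idea concrete, here is a minimal sketch of how such a term could sit next to the standard reconstruction terms. It assumes emotion features are plain vectors produced by some pretrained emotion network (not shown), that the distance is a squared L2 norm, and that the terms are combined by a weighted sum. The function names and the weight `w_emo` are hypothetical and are not taken from the paper.

```python
def emotion_consistency_loss(feat_input, feat_rendered):
    """Squared L2 distance between emotion features of the input image
    and of the rendered 3D reconstruction (assumed distance)."""
    if len(feat_input) != len(feat_rendered):
        raise ValueError("feature vectors must have the same length")
    return sum((a - b) ** 2 for a, b in zip(feat_input, feat_rendered))

def total_loss(landmark, photometric, identity,
               feat_input, feat_rendered, w_emo=1.0):
    """Standard reconstruction terms plus the weighted emotion term."""
    emo = emotion_consistency_loss(feat_input, feat_rendered)
    return landmark + photometric + identity + w_emo * emo
```

The point of the sketch is that the emotion term is zero only when the rendered face "reads" the same as the input to the emotion network, regardless of how small the geometric terms already are.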
-
Guerrilla Games' Gameplay Animation Director detailed supporting animation systems for Horizon Forbidden West covering player mechanics, robotic creatures, and human adversaries while preserving series visual identity.
Motion Synthesis / Rigging
-
, , , ,
Transformer autoregressive model encoding long-term audio context with biased cross-modal attention and self-supervised speech representations for 3D facial animation.
abstract
Speech-driven 3D facial animation is challenging due to the complex geometry of human faces and the limited availability of 3D audio-visual data. Prior works typically focus on learning phoneme-level features of short audio windows with limited context, occasionally resulting in inaccurate lip movements. To tackle this limitation, we propose a Transformer-based autoregressive model, FaceFormer, which encodes the long-term audio context and autoregressively predicts a sequence of animated 3D face meshes. To cope with the data scarcity issue, we integrate the self-supervised pre-trained speech representations. Also, we devise two biased attention mechanisms well suited to this specific task, including the biased cross-modal multi-head (MH) attention and the biased causal MH self-attention with a periodic positional encoding strategy. The former effectively aligns the audio-motion modalities, whereas the latter offers abilities to generalize to longer audio sequences. Extensive experiments and a perceptual user study show that our approach outperforms the existing state-of-the-art methods. The code and the video are available at: https://evelynfan.github.io/audio2face/
Related FaceDiffuser: Speech-Driven 3D Facial Animation Synthesis Using Diffusion · CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior · Capture, Learning, and Synthesis of 3D Speaking Styles · MeshTalk: 3D Face Animation from Speech using Cross-Modality Disentanglement
how to read this
- Category
- Method: speech-driven 3D facial animation (Transformer)
- Contributions
- A Transformer-based autoregressive model that encodes long-term audio context and predicts a sequence of animated 3D face meshes
- Integrates self-supervised pre-trained speech representations to cope with scarce 3D audio-visual data
- Devises two biased attention mechanisms (biased cross-modal multi-head attention and biased causal self-attention with periodic positional encoding) to align audio-motion and generalize to longer audio
- Context
- Advances speech-to-3D-face animation beyond short-window phoneme features, building on the speaking-styles capture line of Cudeiro et al.'s VOCA with a Transformer formulation. Builds on: Capture, Learning, and Synthesis of 3D Speaking Styles
- Correctness
- Assumes long-range audio context and biased attention improve lip accuracy, and leans on pre-trained speech features to offset limited 3D data; claims rest on experiments plus a perceptual user study, so reported superiority is relative to compared baselines and tied to the training corpora used.
- Clarity
- Accessible if you know Transformers; a first pass conveys the architecture, a second pass for the two biased-attention formulations.
- How to read it
- First pass for the autoregressive audio-to-mesh framing and why long context matters; second pass on the biased cross-modal and causal attention if you plan to reimplement or adapt the model.
Facial / Motion Synthesis
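The biased causal self-attention and periodic positional encoding mentioned above can be sketched as follows. This archive entry gives no formulas, so the forms below are assumptions: an additive bias that masks future frames and penalizes past frames by the number of whole periods elapsed, and a sinusoidal encoding applied to the frame index modulo the period. All names are hypothetical.

```python
import math

NEG_INF = float("-inf")

def causal_period_bias(num_frames, period):
    """bias[i][j] is added to the attention logit of query i and key j:
    -inf for future frames (j > i), otherwise minus the number of whole
    periods between frame j and frame i."""
    bias = []
    for i in range(num_frames):
        row = []
        for j in range(num_frames):
            row.append(NEG_INF if j > i else -float((i - j) // period))
        bias.append(row)
    return bias

def periodic_positional_encoding(t, dim, period):
    """Sinusoidal encoding of (t mod period), so it repeats every `period` frames."""
    pos = t % period
    enc = []
    for k in range(dim):
        freq = 1.0 / (10000 ** ((2 * (k // 2)) / dim))
        enc.append(math.sin(pos * freq) if k % 2 == 0 else math.cos(pos * freq))
    return enc

def softmax_with_bias(logits, bias_row):
    """Attention weights for one query after adding its bias row."""
    scores = [l + b for l, b in zip(logits, bias_row)]
    m = max(scores)  # finite, because the diagonal entry is never masked
    exps = [math.exp(s - m) for s in scores]  # exp(-inf) evaluates to 0.0
    z = sum(exps)
    return [e / z for e in exps]
```

Because both the bias and the encoding depend only on relative or wrapped frame positions, nothing in this sketch is tied to a maximum sequence length, which is the intuition behind the abstract's claim about generalizing to longer audio.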
- FaceVerse: A Fine-Grained and Detail-Controllable 3D Face Morphable Model from a Hybrid Dataset CVPR Academic 143 cites
, , , , ,
Coarse-to-fine 3DMM built from 60K RGB-D images and 2K high-fidelity scans, with controllable StyleGAN-based detail generation for both base and fine modules.
abstract
We present FaceVerse, a fine-grained 3D Neural Face Model, which is built from hybrid East Asian face datasets containing 60K fused RGB-D images and 2K high-fidelity 3D head scan models. A novel coarse-to-fine structure is proposed to take better advantage of our hybrid dataset. In the coarse module, we generate a base parametric model from large-scale RGB-D images, which is able to predict accurate rough 3D face models in different genders, ages, etc. Then in the fine module, a conditional StyleGAN architecture trained with high-fidelity scan models is introduced to enrich elaborate facial geometric and texture details. Note that different from previous methods, our base and detailed modules are both changeable, which enables an innovative application of adjusting both the basic attributes and the facial details of 3D face models. Furthermore, we propose a single-image fitting framework based on differentiable rendering. Rich experiments show that our method outperforms the state-of-the-art methods.
Related EMOCA: Emotion Driven Monocular Face Capture and Animation · Codec Avatars: Photorealistic Telepresence at Scale · Deep Appearance Models for Face Rendering · FaceScape: A Large-Scale High Quality 3D Face Dataset and Detailed Riggable 3D Face Prediction
how to read this
- Category
- Method / model: a detail-controllable 3D face morphable model
- Contributions
- A coarse-to-fine 3D neural face model built from a hybrid dataset of 60K fused RGB-D images and 2K high-fidelity 3D head scans
- A coarse base parametric model plus a fine module using a conditional StyleGAN to enrich geometric and texture detail, with both modules independently adjustable
- A single-image fitting framework based on differentiable rendering
- Context
- A 3D morphable model in the data-driven, riggable-face tradition of Yang et al.'s FaceScape, here combining RGB-D capture with high-fidelity scans and a generative detail module. Builds on: FaceScape: A Large-Scale High Quality 3D Face Dataset and Detailed Riggable 3D Face Prediction
- Correctness
- The hybrid dataset is built from East Asian faces, so demographic coverage is a stated scope to keep in mind; reported gains over prior methods come from experiments and the differentiable-rendering fit, and the conditional-StyleGAN detail is generative (plausible) rather than measured ground truth.
- Clarity
- Accessible; a first pass conveys the coarse-to-fine design and the changeable-modules idea, a second pass for the fitting and StyleGAN conditioning.
- How to read it
- First pass for the two-module architecture and what the hybrid dataset buys you; second pass if you need the differentiable single-image fitting or want to reuse the model, noting the dataset demographics.
Facial
- Facial Animation with Disentangled Identity and Motion using Transformers SCA Disney Research 23 cites
, , , ,
Transformer-based model disentangling facial identity and motion for retargeted animation of novel characters from a reference performance.
abstract
We propose a 3D+time framework for modeling dynamic sequences of 3D facial shapes, representing realistic non‐rigid motion during a performance. Our work extends neural 3D morphable models by learning a motion manifold using a transformer architecture. More specifically, we derive a novel transformer‐based autoencoder that can model and synthesize 3D geometry sequences of arbitrary length. This transformer naturally determines frame‐to‐frame correlations required to represent the motion manifold, via the internal self‐attention mechanism. Furthermore, our method disentangles the constant facial identity from the time‐varying facial expressions in a performance, using two separate codes to represent neutral identity and the performance itself within separate latent subspaces. Thus, the model represents identity‐agnostic performances that can be paired with an arbitrary new identity code and fed through our new identity‐modulated performance decoder; the result is a sequence of 3D meshes for the performance with the desired identity and temporal length. We demonstrate how our disentangled motion model has natural applications in performance synthesis, performance retargeting, key‐frame interpolation and completion of missing data, performance denoising and retiming, and other potential applications that include full 3D body modeling.
Related Deep Appearance Models for Face Rendering · CANRIG: Cross-Attention Neural Face Rigging with Variable Local Control · FaceBaker: Baking Character Facial Rigs with Machine Learning · Monocular Facial Performance Capture via Deep Expression Matching
how to read this
- Category
- Method: a Transformer model for 3D facial motion with disentangled identity and motion
- Contributions
- A Transformer-based autoencoder that models and synthesizes 3D facial geometry sequences of arbitrary length, learning a motion manifold via self-attention
- Disentangles constant facial identity from time-varying expression using two separate latent codes
- An identity-modulated performance decoder enabling performance synthesis, retargeting to new identities, key-frame interpolation, completion of missing data, denoising, and retiming
- Context
- Extends neural 3D morphable models into the temporal domain, continuing the identity/expression-separation idea of Vlasic et al.'s Face Transfer with Multilinear Models using a Transformer. Builds on: Face Transfer with Multilinear Models
- Correctness
- Hinges on whether self-attention cleanly separates an identity-agnostic motion manifold from identity, and on retargeting being faithful when an arbitrary identity code is paired with a learned performance; demonstrated through synthesis and retargeting examples, so quality of disentanglement on unseen identities is the reader's main caveat.
- Clarity
- Accessible with Transformer/autoencoder background; a first pass conveys the disentanglement idea, a second pass for the latent-subspace and decoder formulation.
- How to read it
- First pass for the 3D+time disentanglement concept and its retargeting use cases; second pass on the autoencoder structure and identity-modulated decoder if you want to apply it to new characters.
Facial / ML Deformation
- FDLS: A Deep Learning Approach to Production Quality, Controllable, and Retargetable Facial Performances DigiPro Weta FX 0 cites
, , , ,
Deep learning facial performance system combining controllability and retargetability for production-quality digital human faces.
abstract
Visual effects commonly requires both the creation of realistic synthetic humans and the retargeting of actors’ performances to humanoid characters such as aliens and monsters. Achieving the expressive performances demanded in entertainment requires manipulating complex models with hundreds of parameters. Full creative control requires the freedom to make edits at any stage of the production, which prohibits the use of a fully automatic “black box” solution with uninterpretable parameters. On the other hand, producing realistic animation with these sophisticated models is difficult and laborious. This paper describes FDLS (Facial Deep Learning Solver), which is Weta Digital’s solution to these challenges. FDLS adopts a coarse-to-fine and human-in-the-loop strategy, allowing a solved performance to be verified and (if needed) edited at several stages in the solving process. To train FDLS, we first transform the raw motion-captured data into robust graph features. The feature extraction algorithms were devised after carefully observing the artists’ interpretation of the 3D facial landmarks. Secondly, based on the observation that the artists typically finalize the jaw pass animation before proceeding to finer detail, we solve for the jaw motion first and predict fine expressions with region-based networks conditioned on the jaw position.
Related Facial Retargeting with Automatic Range of Motion Alignment · Facial Retargeting Using Neural Networks · A Facial Motion Retargeting Pipeline for Appearance Agnostic 3D Characters · Optimal and Interactive Keyframe Selection for Motion Capture
how to read this
- Category
- Production method: a deep-learning facial performance solver
- Contributions
- FDLS, Weta Digital's deep-learning solver producing production-quality, controllable, and retargetable facial performances
- A coarse-to-fine, human-in-the-loop strategy letting artists verify and edit a solved performance at several stages
- Transforms raw motion-captured data into robust graph features designed around how artists interpret 3D facial landmarks
- Context
- A VFX production system for retargeting actor performances to humanoid characters, building on deep facial-capture work such as Laine et al.'s Production-Level Facial Performance Capture Using Deep Convolutional Neural Networks. Builds on: Production-Level Facial Performance Capture Using Deep Convolutional Neural Networks
- Correctness
- Production-oriented and validated through studio use rather than as a public benchmark; the design explicitly rejects a black-box solver in favor of interpretable, editable stages, so its strength is controllability in a pipeline and generality outside Weta's tools and data is not claimed.
- Clarity
- Accessible and motivated by practical artist needs; a first pass conveys the coarse-to-fine, human-in-the-loop approach.
- How to read it
- First pass for how controllability and retargetability are balanced and where artists intervene; a second pass pays off for the graph-feature extraction and the staged solving pipeline.
Facial / ML Deformation / Retargeting
-
, , , , , ,
Introduces HumanML3D dataset (14,616 clips, 44,970 descriptions) and a temporal VAE text-to-motion pipeline.
abstract
This paper tackles automated generation of diverse and natural 3D human motions from text descriptions using a two-stage approach of text2length sampling and text2motion generation. Text2length samples from a learned distribution of motion lengths conditioned on the input text, after which a temporal variational autoencoder synthesizes a diverse set of human motions of the sampled length, operating on motion snippet codes as an internal representation that captures local semantic motion contexts. A large-scale dataset, HumanML3D, is constructed with 14,616 motion clips and 44,970 text descriptions. Experiments on HumanML3D and KIT-ML demonstrate that the approach generates stochastic motions of variable length that are more faithful to the input text than prior deterministic methods.
Related Executing Your Commands via Motion Diffusion in Latent Space · TEMOS: Generating Diverse Human Motions from Textual Descriptions · CLoSD: Closing the Loop between Simulation and Diffusion for Multi-Task Character Control · Human Motion Diffusion Model
how to read this
- Category
- Dataset and method: text-to-3D-human-motion generation
- Contributions
- Introduces HumanML3D, a large-scale dataset of 14,616 motion clips with 44,970 text descriptions
- A two-stage pipeline: text2length sampling of motion length conditioned on text, then a temporal VAE that synthesizes motions
- Operates on motion-snippet codes capturing local semantic contexts to produce diverse, variable-length motions more faithful to text than deterministic baselines
- Context
- Sits in the text-conditioned human-motion generation area, contributing a new large dataset alongside KIT-ML and a stochastic VAE alternative to deterministic text-to-motion methods.
- Correctness
- Claims of greater diversity and faithfulness are evaluated on HumanML3D and KIT-ML against deterministic baselines, so results are tied to those datasets and metrics; the snippet-code representation and learned length distribution are assumptions whose effect on out-of-distribution prompts is not the focus.
- Clarity
- Accessible; a first pass conveys the dataset and two-stage idea, with a second pass for the VAE and snippet-code details.
- How to read it
- First pass for the dataset scale and the text2length-then-text2motion structure; second pass if you plan to use HumanML3D or build on the temporal VAE, noting the dataset is a contribution in itself.
Motion Synthesis
- Generating Upper-Body Motion for Real-Time Characters Making their Way through Dynamic Environments SCA Academic 14 cites
, ,
Hybrid method coupling keyframes, kinematic constraints, and lightweight physics to generate reactive upper-body motion for characters navigating dynamic obstacles in real time.
abstract
Real‐time character animation in dynamic environments requires the generation of plausible upper‐body movements regardless of the nature of the environment, including non‐rigid obstacles such as vegetation. We propose a flexible model for upper‐body interactions, based on the anticipation of the character's surroundings, and on antagonistic controllers to adapt the amount of muscular stiffness and response time to better deal with obstacles. Our solution relies on a hybrid method for character animation that couples a keyframe sequence with kinematic constraints and lightweight physics. The dynamic response of the character's upper‐limbs leverages antagonistic controllers, allowing us to tune tension/relaxation in the upper‐body without diverging from the reference keyframe motion. A new sight model, controlled by procedural rules, enables high‐level authoring of the way the character generates interactions by adapting its stiffness and reaction time. As results show, our real‐time method offers precise and explicit control over the character's behavior and style, while seamlessly adapting to new situations. Our model is therefore well suited for gaming applications.
Related Joint-Dependent Local Deformations for Hand Animation and Object Grasping · Rig-Space Physics · SKEL-Betweener: a Neural Motion Rig for Interactive Motion Authoring · Dog Code: Human to Quadruped Embodiment Using Shared Codebooks
how to read this
- Category
- Method: real-time reactive upper-body motion for characters in dynamic environments
- Contributions
- A flexible upper-body interaction model based on anticipating the character's surroundings, including non-rigid obstacles such as vegetation
- Antagonistic controllers that tune muscular stiffness and response time to adapt to obstacles without diverging from a reference keyframe motion
- A procedurally controlled sight model enabling high-level authoring of interaction style, stiffness, and reaction time
- Context
- A hybrid keyframe-plus-lightweight-physics approach to secondary character motion for games, relating to data-driven locomotion such as Clavet's Motion Matching and The Road to Next-Gen Animation. Builds on: Motion Matching and The Road to Next-Gen Animation
- Correctness
- Couples a keyframe sequence with kinematic constraints and lightweight physics, so plausibility rather than physical exactness is the goal; the antagonistic-controller and sight-model design are validated by demonstration and aimed at gaming, so behavior under extreme or unanticipated environments is the caveat.
- Clarity
- Accessible; a first pass conveys the anticipation-plus-antagonistic-controller idea without heavy math.
- How to read it
- First pass for how anticipation, stiffness control, and the sight model combine for reactive motion; a second pass for the controller formulation if you want to author or tune the behavior in a real-time system.
Motion Synthesis / Rigging
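The claim that antagonistic controllers can change stiffness without drifting from the reference keyframe can be illustrated with a small sketch. The abstract does not give the controller equations, so this assumes a classic antagonistic formulation in which two opposing springs pull a joint toward a low and a high angle limit. The gain split and all function names are hypothetical.

```python
def antagonistic_gains(theta_ref, theta_low, theta_high, stiffness):
    """Split a total stiffness into two opposing gains whose equilibrium
    sits exactly at the reference (keyframe) angle."""
    if not theta_low < theta_ref < theta_high:
        raise ValueError("reference angle must lie strictly between the limits")
    span = theta_high - theta_low
    k_low = stiffness * (theta_high - theta_ref) / span
    k_high = stiffness * (theta_ref - theta_low) / span
    return k_low, k_high

def antagonistic_torque(theta, theta_dot, theta_low, theta_high,
                        k_low, k_high, damping):
    """Two opposing spring torques plus viscous damping."""
    return (k_low * (theta_low - theta)
            + k_high * (theta_high - theta)
            - damping * theta_dot)

def equilibrium(theta_low, theta_high, k_low, k_high):
    """Joint angle at which the two spring torques cancel."""
    return (k_low * theta_low + k_high * theta_high) / (k_low + k_high)

def static_deflection(external_torque, k_low, k_high):
    """How far a constant external push moves the joint from equilibrium."""
    return external_torque / (k_low + k_high)
```

Scaling `stiffness` up or down changes how far an obstacle can push the limb (`static_deflection`) while `equilibrium` stays at `theta_ref`, which is one way to read "tension/relaxation without diverging from the reference keyframe motion".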
-
, , , , ,
A deep RL network controls 304 Hill-type musculotendons across a 618-dimensional anatomy-condition space, generating healthy and pathological gaits in real-time physics simulation.
abstract
Understanding the relation between anatomy and gait is key to successful predictive gait simulation. In this paper, we present Generative GaitNet, which is a novel network architecture based on deep reinforcement learning for controlling a comprehensive, full-body, musculoskeletal model with 304 Hill-type musculotendons. The Generative GaitNet is a pre-trained, integrated system of artificial neural networks learned in a 618-dimensional continuous domain of anatomy conditions (e.g., mass distribution, body proportion, bone deformity, and muscle deficits) and gait conditions (e.g., stride and cadence). The pre-trained GaitNet takes anatomy and gait conditions as input and generates a series of gait cycles appropriate to the conditions through physics-based simulation. We will demonstrate the efficacy and expressive power of Generative GaitNet to generate a variety of healthy and pathological human gaits in real-time physics-based simulation.
Related SoftCon: Simulation and Control of Soft-Bodied Animals with Biomimetic Actuators · MuscleVAE: Model-Based Controllers of Muscle-Actuated Characters · Physical Based Motion Reconstruction From Videos Using Musculoskeletal Model · Character Controllers Using Motion VAEs
how to read this
- Category
- Method: deep-RL control of a full-body musculoskeletal model for gait
- Contributions
- Generative GaitNet, a deep-RL architecture controlling a full-body musculoskeletal model with 304 Hill-type musculotendons
- A pre-trained, integrated network learned over a 618-dimensional continuous space of anatomy conditions (mass, proportions, bone deformity, muscle deficits) and gait conditions (stride, cadence)
- Generates a range of healthy and pathological gaits in real-time physics-based simulation from anatomy and gait inputs
- Context
- Builds on muscle-actuated character control, extending the scalable muscle simulation line of Lee et al.'s Scalable Muscle-Actuated Human Simulation and Control toward a generative, condition-parameterized network. Builds on: Scalable Muscle-Actuated Human Simulation and Control
- Correctness
- Relies on Hill-type muscle modeling and learning across a high-dimensional anatomy-gait space; efficacy and expressiveness are shown through simulated healthy and pathological gaits, so results are demonstrations within the simulated model rather than clinically validated predictions, and coverage at the edges of the 618-D condition space is the caveat.
- Clarity
- Fairly technical; a first pass conveys the conditioned-generative-control idea, a second pass for the RL setup and the anatomy/gait parameterization.
- How to read it
- First pass for the scope of the condition space and what generative control over gait means; second pass on the network architecture and RL training if you work on musculoskeletal control or predictive gait simulation.
Muscles / Motion Synthesis
- Gravity Preloading for Maintaining Hair Shape Using the Simulator as a Closed-Box Function SIGGRAPH Industrial 1 cites
Optimization algorithm that preloads hair rest shapes to compensate for gravitational sag, preserving the groomed design shape during simulation.
abstract
In animation, hair styles can often be modeled without the consideration of physics. One of the side effects of this workflow is that external forces such as gravity will deform the groom from the designed shape when simulated. We present a simple optimization algorithm that preloads the rest shape to compensate for the external forces to maintain the groom shape during simulation. The algorithm provides artistic control over how much force is compensated for at each vertex.
Related Rest Shape Optimization for Sag-Free Discrete Elastic Rods · Anisotropic Elastoplasticity for Cloth, Knit and Hair Frictional Contact · Scriptable Character FX Solution · Hair Emoting with Style Guides in Turning Red
how to read this
- Category
- Method: a hair rest-shape preloading algorithm
- Contributions
- An optimization that preloads the hair rest shape so simulated gravity does not deform the groomed design
- Per-vertex artistic control over how much external force is compensated
- Context
- Sits in the production hair-simulation lineage (related to The Art and Technology of Hair Simulation in Disney's Moana), addressing the shape-preservation problem where physics pulls a groom away from its artist-designed shape. Builds on: The Art and Technology of Hair Simulation in Disney's Moana
- Correctness
- Framed around treating the simulator as a closed-box function, so it should transfer across solvers, but the abstract reports no numbers; a reader should keep in mind it targets gravitational sag specifically and that compensation is a tuned, artist-controlled choice rather than a guaranteed physical inverse.
- Clarity
- Accessible and short; a first pass conveys the preloading idea, a second pass is needed for the optimization formulation.
- How to read it
- Read the problem framing and the closed-box optimization setup first; do a second pass only if you need the per-vertex compensation math to reimplement it.
CFX
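The closed-box framing in the entry above can be sketched as a fixed-point iteration that only ever calls the simulator and never looks inside it. The abstract says only that the algorithm is "a simple optimization", so the update rule here is an assumption rather than the authors' method. Shapes are reduced to one height value per vertex, and `toy_simulator` is a hypothetical stand-in for a real hair solver.

```python
def preload_rest_shape(simulate, target, weights, iterations=50, step=1.0):
    """Adjust the rest shape until the simulated result matches a per-vertex goal.

    simulate: closed-box function mapping a rest shape to the settled shape.
    target:   artist-designed shape (one height per vertex in this toy).
    weights:  per-vertex values in [0, 1]; 1 compensates fully, 0 not at all.
    """
    rest = list(target)
    sagged = simulate(rest)  # where the uncompensated groom settles
    # Per-vertex goal: blend from the sagged shape (w=0) to the design (w=1).
    goal = [s + w * (t - s) for s, t, w in zip(sagged, target, weights)]
    for _ in range(iterations):
        settled = simulate(rest)
        # Push each rest value by the remaining error at that vertex.
        rest = [r + step * (g - x) for r, g, x in zip(rest, goal, settled)]
    return rest

def toy_simulator(rest, sag=(0.0, 0.1, 0.3, 0.6), droop=0.1):
    """Stand-in solver: each vertex loses a fraction of its height plus a
    fixed sag that grows toward the tip."""
    return [(1.0 - droop) * r - s for r, s in zip(rest, sag)]
```

With all weights at 1 the settled shape returned by `toy_simulator` matches the target, and with all weights at 0 the rest shape is left at the original design. Convergence of this simple iteration depends on the simulator behaving smoothly, which a production solver may not guarantee.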
-
,
Extends Green coordinates to quadrilateral cage faces in 3D, providing quasi-conformal deformations with closed-form expressions for production cages.
abstract
We introduce Green coordinates for triquad cages in 3D. Based on Green’s third identity, Green coordinates allow defining the harmonic deformation of a 3D point inside a cage as a linear combination of its vertices and face normals. Using appropriate Neumann boundary conditions, the resulting deformations are quasi-conformal in 3D, and thus best-preserve the local deformed geometry, in that volumetric conformal 3D deformations do not exist unless rigid. Most coordinate systems use cages made of triangles, yet quads are in general favored by artists as those align naturally onto important geometric features of the 3D shapes, such as the limbs of a character, without introducing arbitrary asymmetric deformations and representation. While triangle cages admit per-face constant normals and result in a single Green normal-coordinate per triangle, the case of quad cages is at the same time more involved (as the normal varies along non-planar quads) and more flexible (as many different mathematical models allow defining the smooth geometry of a quad interpolating its four edges). We consider bilinear quads, and we introduce a new Neumann boundary condition resulting in a simple set of four additional normal-coordinates per quad. Our coordinates remain quasi-conformal in 3D, and we demonstrate their superior behavior under non-trivial deformations of realistic triquad cages.
Related Green Coordinates · Mean Value Coordinates for Closed Triangular Meshes · Biharmonic Coordinates · Somigliana Coordinates: an Elasticity-Derived Approach for Cage Deformation
how to read this
- Category
- Method: a cage-based deformation coordinate system
- Contributions
- Green coordinates extended to triquad (bilinear quad) cages in 3D
- A new Neumann boundary condition giving a simple set of closed-form coordinates per quad
- Quasi-conformal volumetric deformations that best-preserve local deformed geometry
- Context
- Directly extends Lipman et al.'s Green Coordinates (2008) from triangle cages to quad cages, grounded in Green's third identity and harmonic deformation theory. Builds on: Green Coordinates
- Correctness
- Builds on the established result that volumetric conformal 3D deformations do not exist unless rigid, so it targets quasi-conformality; the method models each quad as a bilinear patch, which is generally non-planar, so the normal varies across the face and the derivation is more involved than in the triangle case.
- Clarity
- Mathematically dense; a first pass conveys why quads are favored and what quasi-conformal buys you, but the boundary-condition derivation needs a careful second or third pass.
- How to read it
- First pass for the motivation (quads align to character features) and the quasi-conformal claim; reserve a slow second pass for the Neumann condition and closed-form expressions if you intend to implement.
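For orientation, the triangle-cage Green coordinates of Lipman et al. that this paper extends can be written as below. The notation is an assumption made here for illustration and is not taken from the triquad paper.

```latex
\eta \;\mapsto\; F(\eta)
  \;=\; \sum_{i \in V} \phi_i(\eta)\, v_i'
  \;+\; \sum_{j \in T} \psi_j(\eta)\, s_j\, n(t_j')
```

Here $v_i'$ are the deformed cage vertices, $n(t_j')$ is the unit normal of deformed triangle $t_j'$, $s_j$ is a per-triangle scaling factor that accounts for stretch, and $\phi_i$, $\psi_j$ are the vertex and normal coordinates evaluated on the rest cage. Per the abstract, the triquad formulation replaces the single normal term per triangle with a set of four additional normal-coordinates per bilinear quad.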
Skinning
- Groom Styles Interpolation with Features Preservation for Digital Creatures Effects DigiPro DreamWorks 4 cites
Groom style interpolation technique preserving strand features during blending between different hairstyles for digital creature effects.
abstract
The Visual Effects industry has experienced a strong shift towards the creation of many shots with several digital creatures and complex grooms (i.e. fur and hair). It is common for these grooms to change in shape over the duration of a shot (e.g. from dry to wet), which requires specific techniques to interpolate across different sets of curves. While researchers have extensively focused on algorithms for polygonal mesh deformation, very little can be found for curves, and basic linear interpolation techniques produce unrealistic results that do not emulate the expected behavior of groom filaments changing in shape and style. In this paper we present an iterative algorithm that allows for the interpolation across different curve shapes with the ability to preserve key features like strand’s curvature and segment lengths. We also introduce the concept of partial blendshape linear interpolation to help the system to rapidly converge to the optimal solution in few iterations. We present detailed results and production use-cases in creatures’ Visual Effects.
Related Cloth and Skin Deformation with a Triangle Mesh Based Convolutional Neural Network · Two-Way Coupling of Skinning Transformations and Position Based Dynamics · Pseudo-Collisions: A Method for Preventing Fur-Skin Intersections Without Physical Simulation · Embroidery and Cloth Fiber Workflows on Disney's Encanto
how to read this
- Category
- Method: a curve (groom) interpolation algorithm
- Contributions
- An iterative algorithm to interpolate between different curve shapes while preserving strand curvature and segment lengths
- A partial blendshape linear interpolation concept that helps the system converge to the optimal solution in a few iterations
- Production use-cases for digital-creature grooms changing shape (e.g. dry to wet)
- Context
- Positioned against the extensive literature on polygonal mesh deformation, noting that comparable work for curves is sparse, and against naive linear interpolation, which yields unrealistic strand blends.
- Correctness
- Validated on creature VFX production use-cases rather than a formal benchmark; the key assumption is that preserving curvature and length yields plausible filament motion, so readers should treat the quality evidence as production examples, not quantitative comparison.
- Clarity
- Practitioner-oriented and readable; a first pass conveys the approach and motivation, a second pass clarifies the iterative scheme.
- How to read it
- Focus on why linear interpolation fails for strands and on the feature-preservation constraints; second pass on the iteration and partial-blendshape trick if you handle grooms.
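The failure of linear interpolation can be seen on a toy 2D strand. The sketch below is an illustration only: `restore_lengths` is a simple one-pass stand-in written for this example, not the paper's iterative algorithm, and it ignores curvature preservation entirely.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def segment_lengths(curve: List[Point]) -> List[float]:
    return [math.dist(a, b) for a, b in zip(curve, curve[1:])]


def lerp_curve(a: List[Point], b: List[Point], t: float) -> List[Point]:
    """Naive per-point linear interpolation between two strands."""
    return [((1 - t) * ax + t * bx, (1 - t) * ay + t * by)
            for (ax, ay), (bx, by) in zip(a, b)]


def restore_lengths(curve: List[Point], lengths: List[float]) -> List[Point]:
    """Walk from the root, keeping each segment's direction but forcing its length."""
    out = [curve[0]]
    for (p, q), length in zip(zip(curve, curve[1:]), lengths):
        dx, dy = q[0] - p[0], q[1] - p[1]
        norm = math.hypot(dx, dy) or 1.0  # avoid division by zero on collapsed segments
        x, y = out[-1]
        out.append((x + dx / norm * length, y + dy / norm * length))
    return out


# Two strands whose segments all have unit length: one straight, one stepped.
straight = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
stepped = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 2.0)]

halfway = lerp_curve(straight, stepped, 0.5)
target_lengths = [0.5 * u + 0.5 * v
                  for u, v in zip(segment_lengths(straight), segment_lengths(stepped))]
fixed = restore_lengths(halfway, target_lengths)

naive_total = sum(segment_lengths(halfway))  # shorter than either input strand
fixed_total = sum(segment_lengths(fixed))    # matches the inputs' total length
```

Both inputs have total length 3, yet the naive halfway strand is visibly shorter. Re-walking from the root restores the lengths but changes the shape downstream of each corrected segment, which is one reason a feature-preserving scheme needs iteration.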
CFX
-
Describes Pixar's style-guide system for art-directing millions of hair and fur curves to reflect characters' emotional states in Turning Red.
abstract
For Pixar’s feature film Turning Red, the grooming and simulation teams faced the challenge of handling characters with millions of fur and hair curves, which often needed to behave differently in each shot reflecting the characters’ emotional states. This work describes new tools developed to assist artists in managing and sculpting these large amounts of fur and hair. In particular, we present a novel surface-aware technique for curve deformation that interpolates hair sculpts at varying levels of detail, accompanied by a customized user interface for interactively browsing hair layers.
Related The Art and Technology of Hair Simulation in Disney's Moana · Gravity Preloading for Maintaining Hair Shape Using the Simulator as a Closed-Box Function · Hair Effects in Trolls World Tour · Scriptable Character FX Solution
how to read this
- Category
- Production talk / tooling description
- Contributions
- A style-guide system for art-directing millions of fur and hair curves to reflect characters' emotional states (Turning Red)
- A surface-aware curve-deformation technique that interpolates hair sculpts at varying levels of detail
- A custom UI for interactively browsing hair layers
- Context
- Builds on prior hair shape-control work (related to Holding the Shape in Hair Simulation, Iben 2019), extending it toward emotional, per-shot art direction at production scale. Builds on: Holding the Shape in Hair Simulation
- Correctness
- Studio practice on a single feature film; results are production-proven rather than benchmarked, so the surface-aware interpolation should be read as fit-for-purpose for Pixar's pipeline rather than a generalized, validated algorithm.
- Clarity
- Accessible and tool-focused; a single pass conveys the workflow, with a second pass paying off mainly for the surface-aware deformation idea.
- How to read it
- Read for the artist-workflow and style-guide concept; skim once unless the surface-aware multi-LOD curve deformation is directly relevant to your pipeline.
CFX
-
Transformer-based diffusion model for text-to-motion and action-to-motion synthesis that achieves state-of-the-art results while predicting the sample rather than the noise at each step.
abstract
Natural and expressive human motion generation is the holy grail of computer animation. It is a challenging task, due to the diversity of possible motion, human perceptual sensitivity to it, and the difficulty of accurately describing it. Therefore, current generative solutions are either low-quality or limited in expressiveness. Diffusion models, which have already shown remarkable generative capabilities in other domains, are promising candidates for human motion due to their many-to-many nature, but they tend to be resource hungry and hard to control. In this paper, we introduce Motion Diffusion Model (MDM), a carefully adapted classifier-free diffusion-based generative model for the human motion domain. MDM is transformer-based, combining insights from motion generation literature. A notable design-choice is the prediction of the sample, rather than the noise, in each diffusion step. This facilitates the use of established geometric losses on the locations and velocities of the motion, such as the foot contact loss. As we demonstrate, MDM is a generic approach, enabling different modes of conditioning, and different generation tasks. We show that our model is trained with lightweight resources and yet achieves state-of-the-art results on leading benchmarks for text-to-motion and action-to-motion. https://guytevet.github.io/mdm-page/ .
Related Executing Your Commands via Motion Diffusion in Latent Space · MotionCLIP: Exposing Human Motion Generation to CLIP Space · MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model · CLoSD: Closing the Loop between Simulation and Diffusion for Multi-Task Character Control
how to read this
- Category
- Method: a generative model for human motion
- Contributions
- MDM, a transformer-based classifier-free diffusion model adapted to the human-motion domain
- Predicting the sample rather than the noise each step, enabling geometric losses such as a foot-contact loss
- A generic approach supporting multiple conditioning modes and tasks (text-to-motion, action-to-motion)
- Context
- Combines diffusion-model advances from other domains with motion-generation literature and text-to-motion datasets (related to Guo et al.'s HumanML3D, 2022). Builds on: Generating Diverse and Natural 3D Human Motions from Text
- Correctness
- The central design choice (predict the sample, not the noise) is motivated by enabling losses on joint locations and velocities. The abstract claims state-of-the-art results with lightweight training resources; a reader should withhold judgment on the numbers until the experiments section and watch how human perceptual sensitivity to motion is evaluated.
- Clarity
- Conceptually accessible if you know diffusion basics; a first pass conveys the architecture and key design choice, a second pass for the loss formulation and conditioning.
- How to read it
- First pass on the sample-prediction choice and conditioning modes; do a focused second pass on the geometric losses (especially foot contact) and the diffusion mechanics if you plan to build on it.
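Why sample prediction makes geometric losses convenient can be sketched on a toy 1D motion. This is an illustration under stated assumptions, not MDM's code: the motion is a single foot height per frame instead of joint positions from forward kinematics, and all function names are hypothetical.

```python
import math
import random
from typing import List

Motion = List[float]  # toy motion: one foot height per frame


def add_noise(x0: Motion, alpha_bar: float, eps: Motion) -> Motion:
    """Forward diffusion: x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [a * x + b * e for x, e in zip(x0, eps)]


def sample_from_noise(x_t: Motion, alpha_bar: float, eps_hat: Motion) -> Motion:
    """Recover the sample implied by a noise prediction (the indirect route)."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [(x - b * e) / a for x, e in zip(x_t, eps_hat)]


def foot_contact_loss(x0_hat: Motion, contact: List[int]) -> float:
    """Penalize foot velocity on frames flagged as in contact with the ground."""
    total = 0.0
    for i in range(len(x0_hat) - 1):
        velocity = x0_hat[i + 1] - x0_hat[i]
        total += contact[i] * velocity ** 2
    return total / (len(x0_hat) - 1)


random.seed(0)
x0 = [0.0, 0.0, 0.0, 0.2, 0.5]   # foot planted for three frames, then lifting
contact = [1, 1, 0, 0, 0]        # contact flags for the first two frame transitions
eps = [random.gauss(0.0, 1.0) for _ in x0]
x_t = add_noise(x0, alpha_bar=0.5, eps=eps)

loss_on_clean = foot_contact_loss(x0, contact)   # zero: the planted foot does not move
loss_on_noisy = foot_contact_loss(x_t, contact)  # positive: noise makes the foot slide
recovered = sample_from_noise(x_t, 0.5, eps)     # exact only when eps_hat equals eps
```

A network that outputs the sample can feed its prediction straight into `foot_contact_loss`. A noise-predicting network would first have to pass through `sample_from_noise`, where the division by `sqrt(alpha_bar)` amplifies prediction error at high noise levels.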
Motion Synthesis
-
Implicit head avatar learning from monocular video via neural blendshapes and skinning fields in canonical space with end-to-end analytical gradient training.
abstract
Traditional 3D morphable face models (3DMMs) provide fine-grained control over expression but cannot easily capture geometric and appearance details. Neural volumetric representations approach photorealism but are hard to animate and do not generalize well to unseen expressions. To tackle this problem, we propose IMavatar (Implicit Morphable avatar), a novel method for learning implicit head avatars from monocular videos. Inspired by the fine-grained control mechanisms afforded by conventional 3DMMs, we represent the expression- and pose-related deformations via learned blendshapes and skinning fields. These attributes are pose-independent and can be used to morph the canonical geometry and texture fields given novel expression and pose parameters. We employ ray marching and iterative root-finding to locate the canonical surface intersection for each pixel. A key contribution is our novel analytical gradient formulation that enables end-to-end training of IMavatars from videos. We show quantitatively and qualitatively that our method improves geometry and covers a more complete expression space compared to state-of-the-art methods. Code and data can be found at https://ait.ethz.ch/projects/2022/IMavatar/.
Related PointAvatar: Deformable Point-Based Head Avatars from Videos · SPARK: Self-supervised Personalized Real-time Monocular Face Capture · 3D Gaussian Blendshapes for Head Avatar Animation · Learning a Model of Facial Shape and Expression from 4D Scans
how to read this
- Category
- Method: implicit head-avatar learning from monocular video
- Contributions
- IMavatar, learning an implicit morphable head avatar from a monocular video
- Expression and pose deformations represented via learned blendshapes and skinning fields in a pose-independent canonical space
- A novel analytical gradient formulation (with ray marching and iterative root-finding) enabling end-to-end training
- Context
- Bridges traditional 3DMM control with neural volumetric representations, building on detailed 3D face modeling (related to Feng et al.'s DECA, 2021). Builds on: Learning an Animatable Detailed 3D Face Model from In-The-Wild Images
- Correctness
- Demonstrated quantitatively and qualitatively to improve geometry and expression coverage over prior methods per the abstract; readers should note it is trained per-subject from monocular video and that extrapolating to unseen expressions and finding the canonical surface intersection are the hard, assumption-laden parts.
- Clarity
- Dense (neural fields plus root-finding plus analytical gradients); a first pass conveys the canonical-space blendshape/skinning idea, deeper passes needed for the gradient derivation.
- How to read it
- Focus first on the canonical representation and why blendshapes plus skinning fields give control; reserve a careful second/third pass for the analytical gradient and root-finding if reimplementing.
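The canonical-space idea (query blendshape and skinning fields at a canonical point, then deform it) can be sketched in 2D. This is a toy under stated assumptions: `jaw_blendshape` and `skinning_weights` are hypothetical hand-written fields standing in for learned networks, and only the forward deformation is shown, not the root-finding or analytical gradients.

```python
import math
from typing import List, Tuple

Vec = Tuple[float, float]


def rotate(p: Vec, angle: float, pivot: Vec) -> Vec:
    """Rigid 2D rotation of p about pivot."""
    c, s = math.cos(angle), math.sin(angle)
    x, y = p[0] - pivot[0], p[1] - pivot[1]
    return (pivot[0] + c * x - s * y, pivot[1] + s * x + c * y)


def jaw_blendshape(p: Vec) -> Vec:
    """Hypothetical expression field: offset applied at canonical point p."""
    return (0.0, -0.1 * max(0.0, 1.0 - abs(p[0])))


def skinning_weights(p: Vec) -> List[float]:
    """Hypothetical skinning field: [head, neck] weights blending over y in [0, 1]."""
    w_head = min(1.0, max(0.0, p[1]))
    return [w_head, 1.0 - w_head]


def deform(p: Vec, expression: float,
           bone_angles: List[float], pivots: List[Vec]) -> Vec:
    """Canonical point -> add expression offset -> linear blend skinning."""
    off = jaw_blendshape(p)
    q = (p[0] + expression * off[0], p[1] + expression * off[1])
    weights = skinning_weights(p)  # fields are queried at the canonical point
    posed = [rotate(q, a, c) for a, c in zip(bone_angles, pivots)]
    return (sum(w * r[0] for w, r in zip(weights, posed)),
            sum(w * r[1] for w, r in zip(weights, posed)))


pivots = [(0.0, 0.0), (0.0, 0.0)]
neutral = deform((0.0, 0.5), 0.0, [0.0, 0.0], pivots)
opened = deform((0.0, 0.5), 1.0, [0.0, 0.0], pivots)
turned = deform((0.0, 1.0), 0.0, [math.pi / 2, 0.0], pivots)
```

Because the fields depend only on the canonical position, the same canonical geometry can be re-posed with new expression and pose parameters. Going the other way, from a deformed point back to its canonical source, is the part that needs iterative root-finding.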
Facial
-
Retargets facial performances across diverse rigs using local anatomical constraints that preserve expression fidelity and prevent implausible deformations.
abstract
Generating realistic facial animation for CG characters and digital doubles is one of the hardest tasks in animation. A typical production workflow involves capturing the performance of a real actor using mo-cap technology, and transferring the captured motion to the target digital character. This process, known as retargeting, has been used for over a decade, and typically relies on either large blendshape rigs that are expensive to create, or direct deformation transfer algorithms that operate on individual geometric elements and are prone to artifacts. We present a new method for high-fidelity offline facial performance retargeting that is neither expensive nor artifact-prone. Our two step method first transfers local expression details to the target, and is followed by a global face surface prediction that uses anatomical constraints in order to stay in the feasible shape space of the target character. Our method also offers artists with familiar blendshape controls to perform fine adjustments to the retargeted animation. As such, our method is ideally suited for the complex task of human-to-human 3D facial performance retargeting, where the quality bar is extremely high in order to avoid the uncanny valley, while also being applicable for more common human-to-creature settings.
Related A Facial Motion Retargeting Pipeline for Appearance Agnostic 3D Characters · Transferring Facial Expressions to Different Face Models · Facial Retargeting with Automatic Range of Motion Alignment · Neural Facial Deformation Transfer
how to read this
- Category
- Method: facial performance retargeting
- Contributions
- A two-step high-fidelity offline facial retargeting that is neither expensive nor artifact-prone
- First transfers local expression details, then runs a global anatomically-constrained surface prediction to stay in the target's feasible shape space
- Familiar blendshape controls for artists to make fine adjustments
- Context
- Builds on anatomically constrained face modeling (related to Wu et al.'s An Anatomically Constrained Local Deformation Model for Monocular Face Capture, 2016), positioned against costly large blendshape rigs and artifact-prone deformation-transfer methods. Builds on: An Anatomically Constrained Local Deformation Model for Monocular Face Capture
- Correctness
- Targets the demanding human-to-human 3D retargeting case where the quality bar is high; the key assumption is that anatomical constraints keep predictions plausible, so a reader should remember it is an offline method and that fidelity rests on the local-then-global decomposition behaving well across diverse rigs.
- Clarity
- Clearly structured around the two steps; a first pass conveys the local-then-global idea, a second pass for the constraint formulation.
- How to read it
- Read the two-step pipeline and the anatomical-constraint rationale first; second pass on the local detail transfer and global prediction math if you work on retargeting.
Facial / Retargeting
-
Weta's unified solver where muscles, flesh, cloth, hair, and fluids all couple in one framework instead of chained single-physics passes.
abstract
We introduce Loki, a new framework for robust simulation of fluid, rigid, and deformable objects with non-compromising fidelity on any single element, and capabilities for coupling and representation transitions across multiple elements. Loki adapts multiple best-in-class solvers into a unified framework driven by a declarative state machine where users declare 'what' is simulated but not 'when,' so an automatic scheduling system takes care of mixing any combination of objects. This leads to intuitive setups for coupled simulations such as hair in the wind or objects transitioning from one representation to another, for example bulk water FLIP particles to SPH spray particles to volumetric mist. We also provide a consistent treatment for components used in several domains, such as unified collision and attachment constraints across 1D, 2D, 3D deforming and rigid objects. Distribution over MPI, custom linear equation solvers, and aggressive application of sparse techniques keep performance within production requirements. We demonstrate a variety of solvers within the framework and their interactions, including FLIP-style liquids, spatially adaptive volumetric fluids, SPH, MPM, and mesh-based solids, including but not limited to discrete elastic rods, elastons, and FEM with state-of-the-art constitutive models.
Related Fast Corotated FEM using Operator Splitting · Towards Realtime: A Hybrid Physics-based Method for Hair Animation on GPU · Homogenized Yarn-Level Cloth · Fast Simulation of Deformable Characters with Articulated Skeletons in Projective Dynamics
how to read this
- Category
- Production system: a unified multiphysics simulation framework
- Contributions
- Loki, a unified framework coupling fluid, rigid, and deformable objects with high single-element fidelity
- A declarative state machine where users declare what is simulated, with automatic scheduling of when
- Consistent collision/attachment constraints across 1D/2D/3D deforming and rigid objects, plus representation transitions (e.g. FLIP to SPH to volumetric mist)
- Context
- Adapts multiple best-in-class solvers (FLIP, adaptive volumetric fluids, SPH, MPM, finite-element solids/muscle) into one framework, with roots in musculoskeletal simulation (related to Teran et al.'s Creating and Simulating Skeletal Muscle, 2005). Builds on: Creating and Simulating Skeletal Muscle from the Visible Human Data Set
- Correctness
- Engineering at production scale (MPI distribution, custom linear solvers, sparse techniques) rather than a single new algorithm; results are production-proven, and a reader should view the fidelity and coupling claims as system-integration achievements meeting production requirements rather than head-to-head benchmarks.
- Clarity
- Broad and systems-oriented; a first pass conveys the unified architecture and declarative model, deeper passes for individual solver coupling are optional and selective.
- How to read it
- Read for the architecture: the declarative state machine, automatic scheduling, and unified constraints; dive deeper only into the specific solver couplings relevant to your work.
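The "declare what, not when" idea can be illustrated with a toy grouping pass. This sketch is hypothetical throughout: the scene description, the element names, and the connected-components grouping are assumptions made for illustration and do not reflect Loki's actual interface or scheduler.

```python
from typing import Dict, Iterable, List, Tuple

# Declarative description: what exists and what is coupled.
# No execution order is stated anywhere.
elements: Dict[str, str] = {
    "water": "FLIP",
    "spray": "SPH",
    "hair": "rods",
    "wind": "volumetric fluid",
    "rock": "rigid",
}
couplings: List[Tuple[str, str]] = [("water", "spray"), ("hair", "wind")]


def coupled_groups(names: Iterable[str],
                   pairs: Iterable[Tuple[str, str]]) -> List[List[str]]:
    """Group elements that must be advanced together (connected components)."""
    parent = {n: n for n in names}

    def find(n: str) -> str:
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for a, b in pairs:
        parent[find(a)] = find(b)

    groups: Dict[str, List[str]] = {}
    for n in parent:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


groups = coupled_groups(elements, couplings)
```

Here the user lists only elements and couplings, and the grouping that a scheduler would act on is derived from them. A production scheduler would also have to decide substep sizes, solver ordering within a group, and representation transitions, none of which this toy attempts.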
Muscles / CFX
- talk Machine Learning Summit: 4 Years of Bringing Characters to Life with Computer Brains GDC Academic
Four years of Edinburgh and EA research on deep learning for character control, covering quadruped locomotion, scene interactions, basketball, and martial arts from motion capture training data.
Motion Synthesis / ML Deformation
-
NetEase Games presented a lightweight real-time motion completion network that generates smooth game animation without a large pre-built motion library, outperforming interpolation and standard motion matching.
Motion Synthesis / ML Deformation
- talk Machine Learning Summit: Walk Lizzie, Walk! Emergent Physics-Based Animation through Reinforcement Learning GDC Industrial
Embark Studios shows reinforcement learning where user-created creatures autonomously learn locomotion, treating animation as an emergent property of anatomy derived from robotics principles.
Motion Synthesis / ML Deformation
-
Deep learning pipeline for monocular video facial capture that matches expressions from RGB input to rigged face models at production quality.
abstract
Facial performance capture is the process of automatically animating a digital face according to a captured performance of an actor. Recent developments in this area have focused on high‐quality results using expensive head‐scanning equipment and camera rigs. These methods produce impressive animations that accurately capture subtle details in an actor's performance. However, these methods are accessible only to content creators with relatively large budgets. Current methods using inexpensive recording equipment generally produce lower quality output that is unsuitable for many applications. In this paper, we present a facial performance capture method that does not require facial scans and instead animates an artist‐created model using standard blendshapes. Furthermore, our method gives artists high‐level control over animations through a workflow similar to existing commercial solutions. Given a recording, our approach matches keyframes of the video with corresponding expressions from an animated library of poses. A Gaussian process model then computes the full animation by interpolating from the set of matched keyframes. Our expression‐matching method computes a low‐dimensional latent code from an image that represents a facial expression while factoring out the facial identity.
Related FaceLab: Scalable Facial Performance Capture for Visual Effects · Vdub: Modifying Face Video of Actors for Plausible Visual Alignment to a Dubbed Audio Track · Production-Level Facial Performance Capture Using Deep Convolutional Neural Networks · Reconstruction of Personalized 3D Face Rigs from Monocular Video
how to read this
- Category
- Method: monocular facial performance capture
- Contributions
- A facial capture method that needs no facial scans and animates an artist-created model using standard blendshapes
- Keyframe expression matching from video against an animated library of poses, with Gaussian-process interpolation for the full animation
- High-level artist control through a workflow similar to existing commercial solutions
- Context
- Aims to lower the cost barrier of deep facial capture (related to Laine et al.'s Production-Level Facial Performance Capture Using Deep CNNs, 2017), targeting creators without large head-scanning budgets. Builds on: Production-Level Facial Performance Capture Using Deep Convolutional Neural Networks
- Correctness
- Trades some fidelity for accessibility using inexpensive recording; the assumption is that a pose library plus expression matching plus Gaussian-process interpolation suffices for production-usable output, so readers should expect quality bounded by the library coverage and the low-dimensional matching.
- Clarity
- Readable and pipeline-focused; a first pass conveys the match-and-interpolate idea, a second pass for the expression-matching and GP details.
- How to read it
- Focus on the keyframe-matching plus GP-interpolation pipeline and the cost-vs-quality tradeoff; second pass on the low-dimensional matching if accuracy matters for your use.
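The interpolation half of the pipeline can be sketched as a Gaussian-process posterior mean over matched keyframes. This is a minimal illustration, assuming a zero-mean GP with an RBF kernel on frame time and a single blendshape weight as the output; the kernel, `length_scale`, and `noise` values are assumptions, not the paper's settings.

```python
import math
from typing import List


def rbf(a: float, b: float, length_scale: float) -> float:
    """Squared-exponential kernel on frame times."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(A: List[List[float]], b: List[float]) -> List[float]:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def gp_mean(key_times: List[float], key_values: List[float],
            query_times: List[float],
            length_scale: float = 2.0, noise: float = 1e-8) -> List[float]:
    """Posterior mean of a zero-mean GP conditioned on the keyframes."""
    K = [[rbf(a, b, length_scale) + (noise if i == j else 0.0)
          for j, b in enumerate(key_times)]
         for i, a in enumerate(key_times)]
    alpha = solve(K, key_values)
    return [sum(rbf(t, s, length_scale) * w for s, w in zip(key_times, alpha))
            for t in query_times]


# Matched keyframes: frame index -> weight of one (hypothetical) blendshape.
key_times = [0.0, 4.0, 8.0, 12.0]
key_values = [0.0, 1.0, 0.2, 0.0]
curve = gp_mean(key_times, key_values, [float(f) for f in range(13)])
```

The resulting curve passes through the matched keyframes and fills the frames between them smoothly. Far from any keyframe a zero-mean GP drifts back toward zero, which is one way the quality ends up bounded by how densely the video is matched against the pose library.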
Facial
- MoRF: Morphable Radiance Fields for Multiview Neural Head Modeling SIGGRAPH Disney Research 70 cites
Extends NeRF into a generative morphable model producing multiview-consistent photorealistic head images with controllable identity parameters.
abstract
Recent research work has developed powerful generative models (e.g., StyleGAN2) that can synthesize complete human head images with impressive photorealism, enabling applications such as photorealistically editing real photographs. While these models can be trained on large collections of unposed images, their lack of explicit 3D knowledge makes it difficult to achieve even basic control over 3D viewpoint without unintentionally altering identity. On the other hand, recent Neural Radiance Field (NeRF) methods have already achieved multiview-consistent, photorealistic renderings but they are so far limited to a single facial identity. In this paper, we propose a new Morphable Radiance Field (MoRF) method that extends a NeRF into a generative neural model that can realistically synthesize multiview-consistent images of complete human heads, with variable and controllable identity. MoRF allows for morphing between particular identities, synthesizing arbitrary new identities, or quickly generating a NeRF from few images of a new subject, all while providing realistic and consistent rendering under novel viewpoints. We train MoRF in a supervised fashion by leveraging a high-quality database of multiview portrait images of several people, captured in studio with polarization-based separation of diffuse and specular reflection.
Related NeRSemble: Multi-view Radiance Field Reconstruction of Human Heads · Codec Avatars: Photorealistic Telepresence at Scale · Animatable Neural Radiance Fields for Modeling Dynamic Human Bodies · 3DGS-Avatar: Animatable Avatars via Deformable 3D Gaussian Splatting
how to read this
- Category
- Method: a generative morphable neural radiance field for heads
- Contributions
- MoRF, extending a NeRF into a generative neural model synthesizing multiview-consistent photorealistic complete-head images
- Variable, controllable identity: morphing between identities, synthesizing new ones, or quickly fitting a NeRF from few images
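- Supervised training on a high-quality database of multiview studio portraits of several people, captured with polarization-based separation of diffuse and specular reflection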
- Context
- Positioned between StyleGAN2-style head generators (lacking explicit 3D control) and single-identity NeRFs, drawing on animatable face modeling (related to Feng et al.'s DECA, 2021). Builds on: Learning an Animatable Detailed 3D Face Model from In-The-Wild Images
- Correctness
- Addresses the known failure where 2D generators alter identity when changing viewpoint; the key dependency is the high-quality supervised dataset, so readers should keep in mind that identity control and novel-view consistency are tied to that training data and to the morphable parameterization.
- Clarity
- Moderately dense (NeRF plus generative morphable model); a first pass conveys the goal and capability, a second pass for the conditioning and training setup.
- How to read it
- First pass on what MoRF enables (3D-consistent identity control) versus prior GAN/NeRF tradeoffs; second pass on the morphable parameterization and supervised training if you build head avatars.
Facial
-
Rigs and animates 3D character meshes by encoding motion cues from single-view point cloud streams of a performing subject.
abstract
We present MoRig, a method that automatically rigs character meshes driven by single-view point cloud streams capturing the motion of performing characters. Our method is also able to animate the 3D meshes according to the captured point cloud motion. MoRig’s neural network encodes motion cues from the point clouds into features that are informative about the articulated parts of the performing character. These motion-aware features guide the inference of an appropriate skeletal rig for the input mesh, which is then animated based on the point cloud motion. Our method can rig and animate diverse characters, including humanoids, quadrupeds, and toys with varying articulation. It accounts for occluded regions in the point clouds and mismatches in the part proportions between the input mesh and captured character. Compared to other rigging approaches that ignore motion cues, MoRig produces more accurate rigs, well-suited for re-targeting motion from captured characters.
Related RigNet: Neural Rigging for Articulated Characters · S3: Neural Shape, Skeleton, and Skinning Fields for 3D Human Modeling · Mobilizing Mocap, Motion Blending, and Mayhem: Rig Interoperability for Crowd Simulation on Incredibles 2 · One Model to Rig Them All: Diverse Skeleton Rigging with UniRig
how to read this
- Category
- Method: neural rigging and animation from point clouds
- Contributions
- Encodes motion cues from single-view point cloud streams into motion-aware features informative about articulated parts
- Uses those features to infer a skeletal rig for an input mesh and animate it from the captured motion
- Handles diverse characters (humanoids, quadrupeds, toys) plus occlusions and proportion mismatches between mesh and captured subject
- Context
- Extends neural rigging in the vein of RigNet (Xu et al. 2020) by adding motion cues from point cloud streams rather than relying on the static mesh alone. Builds on: RigNet: Neural Rigging for Articulated Characters
- Correctness
- Evaluated on a range of character types and compared against motion-agnostic rigging baselines, with explicit handling of occlusion and proportion mismatch; a reader should remember it is driven by single-view point clouds, so capture quality and viewpoint coverage bound the result.
- Clarity
- Accessible at a high level; a first pass conveys the motion-guided idea, a second pass is needed for the network design and rig inference.
- How to read it
- First pass to grasp how motion cues inform rig inference; second pass on the feature encoding and the rigging-then-animation pipeline if you plan to compare against RigNet-style static methods.
Rigging
-
Overview of MotionBuilder 2023 updates including Python 3 migration, new animation layer colour hints, custom key sets for character rigs, and improved pose controls for mocap data transfer workflows.
Retargeting / Motion Synthesis
-
Practical walkthrough of MotionBuilder HumanIK characterization and full-body mocap retargeting, covering biped joint mapping, HIK solve settings, and baking animation onto custom character skeletons.
abstract
This MotionBuilder tutorial demonstrates how to characterize a custom biped character (the Volund rig from the Unity Blacksmith demo) so it can receive motion capture data through HumanIK. It covers dragging the character template onto correctly named bones, defining the biped, renaming the definition, and switching the source between none, stance, and an FK/IK control rig with selection, body-part, and full-body IK modes. The workflow then imports a circle-run mocap take, cleans out unneeded marker and solver nodes, characterizes the mocap skeleton, sets it as the character source for retargeting, and bakes (plots) the animation onto the control rig for editing via animation layers. Finally it shows baking the rig animation back down to the skeleton for export to engines like Unity, Unreal, or Maya.
Related MotionBuilder: Essentials Characterization, Retargeting and Baking Animations · How to Retarget Motion Capture in MotionBuilder · Meet MotionMaker: New AI Animation Tool In Maya · Autodesk MotionBuilder 2022
how to read this
- Category
- Production talk / tutorial (MotionBuilder HumanIK retargeting)
- Contributions
- Walks through characterizing a custom biped rig (Volund) so it can receive mocap via HumanIK
- Shows switching the source between none, stance, and an FK/IK control rig (with selection, body-part, and full-body IK modes), then importing and cleaning a mocap take
- Demonstrates baking (plotting) animation onto the control rig and back onto the skeleton for export to Unity, Unreal, or Maya
- Context
- Applies the motion-retargeting principle established by Gleicher (1998) inside the production HumanIK workflow of MotionBuilder. Builds on: Retargeting Motion to New Characters
- Correctness
- Studio/tool practice, not peer-reviewed; the workflow is production-proven but specific to MotionBuilder's HumanIK and assumes correctly named, properly posed bones.
- Clarity
- Accessible hands-on tutorial; a single watch-through conveys the workflow, with replay value at the characterization and baking steps.
- How to read it
- Follow along in MotionBuilder rather than reading passively; focus on the characterization (bone mapping) and the plot/bake steps, which are where mistakes propagate.
Retargeting
-
, ,
Motion-conditioned neural model generalizes garment dynamics to unseen body shapes and motions by disentangling body and garment representations.
abstract
Realistic dynamic garments on animated characters have many AR/VR applications. While authoring such dynamic garment geometry is still a challenging task, data-driven simulation provides an attractive alternative, especially if it can be controlled simply using the motion of the underlying character. In this work, we focus on motion-guided dynamic 3D garments, especially loose garments. In a data-driven setup, we first learn a generative space of plausible garment geometries. Then, we learn a mapping to this space to capture the motion-dependent dynamic deformations, conditioned on the previous state of the garment as well as its relative position with respect to the underlying body. Technically, we model garment dynamics, driven using the input character motion, by predicting per-frame local displacements in a canonical state of the garment that is enriched with frame-dependent skinning weights to bring the garment to the global space. We resolve any remaining per-frame collisions by predicting residual local displacements. The resultant garment geometry is used as history to enable iterative roll-out prediction. We demonstrate plausible generalization to unseen body shapes and motion inputs, and show improvements over multiple state-of-the-art alternatives. Code and data are released at https://geometry.cs.ucl.ac.uk/projects/2022/MotionDeepGarment/
Related PBNS: Physically Based Neural Simulation for Unsupervised Garment Pose Space Deformation · GarMatNet: A Learning-Based Method for Predicting 3D Garment Mesh with Parameterized Materials · SMPLicit: Topology-aware Generative Model for Clothed People · SNUG: Self-Supervised Neural Dynamic Garments
how to read this
- Category
- Method: data-driven dynamic garment model
- Contributions
-
- Learns a generative space of plausible garment geometries, then a motion-dependent mapping into it conditioned on prior garment state and relative body position
- Predicts per-frame local displacements in a canonical garment state enriched with frame-dependent skinning weights, plus residual displacements to resolve collisions
- Enables iterative roll-out prediction that generalizes to unseen body shapes and motions, targeting loose garments
- Context
- Builds on learning-based garment animation such as Santesteban et al. (2019) virtual try-on, extending it toward loose garments with motion-dependent dynamics. Builds on: Learning-Based Animation of Clothing for Virtual Try-On
- Correctness
- Demonstrated to generalize plausibly to unseen body shapes and motions, but it is data-driven and focused on loose garments; collision handling is approximate (residual displacements), so a reader should not expect hard physical guarantees.
- Clarity
- Moderately technical; a first pass gives the disentanglement idea, a second pass is needed for the canonical-state displacement and roll-out formulation.
- How to read it
- First pass for the body/garment disentanglement and roll-out scheme; second pass on the canonical displacement plus skinning-weight enrichment if you care about how dynamics are encoded.
CFX / ML Deformation
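As a rough illustration of the garment entry above, the sketch below displaces a vertex in the canonical garment state and then carries it to global space with weighted bone transforms (ordinary linear blend skinning). It assumes rigid bones given as rotation plus translation; the function names, the two toy bones, and the fixed displacement are hypothetical stand-ins for what the paper's networks would predict per frame, and this is not the authors' implementation.

```python
from typing import List, Sequence, Tuple

Vec3 = List[float]
Bone = Tuple[List[List[float]], List[float]]  # (3x3 rotation rows, translation)

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
ROT_Z90 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]  # 90 degrees about z


def apply_bone(bone: Bone, point: Sequence[float]) -> Vec3:
    """Apply a rigid transform: rotation * point + translation."""
    rotation, translation = bone
    return [
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]


def skin_displaced_vertex(
    canonical: Sequence[float],
    displacement: Sequence[float],
    weights: Sequence[float],
    bones: Sequence[Bone],
) -> Vec3:
    """Add a local displacement in canonical space, then blend bone transforms."""
    if len(weights) != len(bones):
        raise ValueError("need one weight per bone")
    if abs(sum(weights) - 1.0) > 1e-6:
        raise ValueError("skinning weights must sum to 1")
    displaced = [c + d for c, d in zip(canonical, displacement)]
    posed = [0.0, 0.0, 0.0]
    for weight, bone in zip(weights, bones):
        moved = apply_bone(bone, displaced)
        posed = [p + weight * m for p, m in zip(posed, moved)]
    return posed


# Toy frame: one vertex influenced equally by a static bone and a rotated bone.
bones = [(IDENTITY, [0.0, 0.0, 0.0]), (ROT_Z90, [0.0, 0.0, 0.0])]
vertex = skin_displaced_vertex([1.0, 0.0, 0.0], [0.0, 0.0, 0.5], [0.5, 0.5], bones)
print(vertex)
```

Because the weights are passed in per call, the same function can be reused with different weights on every frame, which is the role the frame-dependent skinning weights play in the description above.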
- talk MotionBuilder: Essentials Characterization, Retargeting and Baking Animations MotionBuilder Industrial
Comprehensive guide to MotionBuilder essentials covering biped and quadruped characterization differences, HumanIK retargeting configuration, and animation baking for export-ready mocap data.
abstract
This MotionBuilder essentials tutorial explains what characterization is and how to characterize a character so it can be driven by mocap, a control rig, or retargeted animation. Using a downloaded Mixamo character imported in T-pose via FBX merge, it shows organizing the skeleton and mesh into groups, dropping the character template onto the hips in the schematic view, choosing biped versus quadruped, renaming the definition, and validating the automatic bone mapping. It then imports a tutorial animation, characterizes the source skeleton, sets it as the source on the target character to retarget the walk, enables looping via match, and bakes the result by plotting to the target skeleton.
Related Motion Builder Characterization and Retargeting Tutorial · How to Retarget Motion Capture in MotionBuilder · Meet MotionMaker: New AI Animation Tool In Maya · Autodesk MotionBuilder 2022
how to read this
- Category
- Production talk / tutorial (MotionBuilder essentials)
- Contributions
-
- Explains what characterization is and how to characterize a Mixamo character (biped vs quadruped) so it can be driven by mocap, a control rig, or retargeted animation
- Covers importing in T-pose via FBX merge, grouping skeleton and mesh, dropping the template on the hips, and validating automatic bone mapping
- Shows retargeting a walk by setting the source skeleton as the character source, looping via match, and baking by plotting to the target skeleton
- Context
- Companion essentials walkthrough to other MotionBuilder retargeting tutorials (e.g. How to Retarget Motion Capture in MotionBuilder, 2019), grounded in the same HumanIK characterization workflow. Builds on: How to Retarget Motion Capture in MotionBuilder
- Correctness
- Studio/tool practice, not peer-reviewed; production-proven but tied to MotionBuilder HumanIK and to a clean T-pose with correctly named bones (notably the biped vs quadruped distinction).
- Clarity
- Beginner-friendly and accessible; one pass conveys the essentials, with replay value at the characterization and baking steps.
- How to read it
- Watch hands-on; pay attention to the biped versus quadruped choice and the plot-to-skeleton bake, and treat it as the entry point before the more advanced characterization tutorial.
Retargeting / Motion Synthesis
-
, , , ,
Aligns a 3D human motion auto-encoder latent space with CLIP, enabling out-of-domain text and image-driven motion synthesis.
abstract
We introduce MotionCLIP, a 3D human motion auto-encoder featuring a latent embedding that is disentangled, well behaved, and supports highly semantic textual descriptions. MotionCLIP gains its unique power by aligning its latent space with that of the Contrastive Language-Image Pre-training (CLIP) model. Aligning the human motion manifold to CLIP space implicitly infuses the extremely rich semantic knowledge of CLIP into the manifold. In particular, it helps continuity by placing semantically similar motions close to one another, and disentanglement, which is inherited from the CLIP-space structure. MotionCLIP comprises a transformer-based motion auto-encoder, trained to reconstruct motion while being aligned to its text label's position in CLIP-space. We further leverage CLIP's unique visual understanding and inject an even stronger signal through aligning motion to rendered frames in a self-supervised manner. We show that although CLIP has never seen the motion domain, MotionCLIP offers unprecedented text-to-motion abilities, allowing out-of-domain actions, disentangled editing, and abstract language specification. For example, the text prompt "couch" is decoded into a sitting down motion, due to lingual similarity, and the prompt "Spiderman" results in a web-swinging-like solution that is far from anything seen during training.
Related Human Motion Diffusion Model · TEMOS: Generating Diverse Human Motions from Textual Descriptions · PDP: Physics-Based Character Animation via Diffusion Policy · CLoSD: Closing the Loop between Simulation and Diffusion for Multi-Task Character Control
how to read this
- Category
- Method: text/image-driven human motion generation via CLIP alignment
- Contributions
-
- A transformer-based 3D human motion auto-encoder whose latent space is aligned with CLIP space, infusing CLIP's semantic structure into the motion manifold
- Adds a self-supervised signal by aligning motion to rendered frames, leveraging CLIP's visual understanding
- Enables out-of-domain text-to-motion, disentangled editing, and abstract language specification
- Context
- Builds on transformer VAE motion synthesis such as Petrovich et al. ACTOR (2021) and bridges it to CLIP's language-image embedding space. Builds on: Action-Conditioned 3D Human Motion Synthesis with Transformer VAE
- Correctness
- Shows that aligning to CLIP yields a well-behaved, disentangled latent enabling out-of-domain prompts; results inherit CLIP's semantic structure, so quality and biases are bounded by CLIP and by the motion training data, and abstract prompts give plausible rather than ground-truth motions.
- Clarity
- Conceptually elegant and accessible; a first pass conveys the alignment idea, a second pass covers the auto-encoder and the alignment losses.
- How to read it
- First pass for why aligning a motion manifold to CLIP buys semantic editing; second pass on the dual alignment (text label plus rendered-frame) losses if you want to reproduce or extend it.
Motion Synthesis
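To make the dual alignment described in the MotionCLIP entry concrete, here is a minimal sketch of an alignment term built from cosine distance between a motion latent and a text embedding plus an image embedding. The vectors are toy numbers rather than real CLIP outputs, the weights `w_text` and `w_image` are hypothetical, and the reconstruction loss is left out, so treat it as an assumption-laden illustration rather than the paper's training objective.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def alignment_loss(
    motion_latent: Sequence[float],
    text_embedding: Sequence[float],
    image_embedding: Sequence[float],
    w_text: float = 1.0,
    w_image: float = 1.0,
) -> float:
    """Weighted sum of cosine distances to the text and image embeddings."""
    text_term = 1.0 - cosine_similarity(motion_latent, text_embedding)
    image_term = 1.0 - cosine_similarity(motion_latent, image_embedding)
    return w_text * text_term + w_image * image_term


# A latent pointing the same way as both embeddings incurs no alignment penalty.
print(alignment_loss([1.0, 0.0], [2.0, 0.0], [3.0, 0.0]))
```

Cosine distance ignores vector length, so only the direction of the motion latent relative to the two embeddings matters in this sketch.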
-
, , , , , ,
First diffusion-based text-driven motion framework enabling fine-grained body-part control and arbitrary-length synthesis.
abstract
Human motion modeling is important for many modern graphics applications, which typically require professional skills. In order to remove the skill barriers for laymen, recent motion generation methods can directly generate human motions conditioned on natural languages. However, it remains challenging to achieve diverse and fine-grained motion generation with various text inputs. To address this problem, we propose MotionDiffuse, one of the first diffusion model-based text-driven motion generation frameworks, which demonstrates several desired properties over existing methods. 1) Probabilistic Mapping. Instead of a deterministic language-motion mapping, MotionDiffuse generates motions through a series of denoising steps in which variations are injected. 2) Realistic Synthesis. MotionDiffuse excels at modeling complicated data distribution and generating vivid motion sequences. 3) Multi-Level Manipulation. MotionDiffuse responds to fine-grained instructions on body parts, and arbitrary-length motion synthesis with time-varied text prompts. Our experiments show MotionDiffuse outperforms existing SoTA methods by convincing margins on text-driven motion generation and action-conditioned motion generation. A qualitative analysis further demonstrates MotionDiffuse's controllability for comprehensive motion generation.
Related ReMoDiffuse: Retrieval-Augmented Motion Diffusion Model · T2M-GPT: Generating Human Motion from Textual Descriptions with Discrete Representations · Human Motion Diffusion Model · Executing Your Commands via Motion Diffusion in Latent Space
how to read this
- Category
- Method: diffusion-based text-driven motion generation
- Contributions
-
- One of the first diffusion-model frameworks for text-driven human motion, giving a probabilistic (non-deterministic) language-to-motion mapping
- Models complex motion distributions for realistic synthesis
- Supports multi-level manipulation: fine-grained body-part instructions and arbitrary-length synthesis with time-varied text prompts
- Context
- Builds on text-to-motion learned from paired datasets such as Guo et al. HumanML3D (2022), bringing the denoising-diffusion paradigm to the motion domain. Builds on: Generating Diverse and Natural 3D Human Motions from Text
- Correctness
- Reports outperforming prior state of the art on text-driven and action-conditioned generation by the paper's account; as a generative diffusion model it produces diverse plausible motions rather than unique ground truth, and quality depends on the text-motion training data.
- Clarity
- Accessible if you know diffusion models; a first pass conveys the properties, a second pass covers the denoising formulation and the body-part/time-varied conditioning.
- How to read it
- First pass for the three claimed properties (probabilistic, realistic, multi-level); second pass on the conditioned denoising and how body-part and time-varied text control are injected.
Motion Synthesis
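For readers new to the denoising idea behind the MotionDiffuse entry, the sketch below shows the closed-form forward noising used by DDPM-style models and how a clean sample is recovered once the noise is known. It is a generic illustration under stated assumptions: the linear beta schedule and its defaults are common choices, not values taken from the paper, the "pose" is a toy three-number vector, and the true noise stands in for what a trained, text-conditioned denoiser would predict.

```python
import math
from typing import List, Sequence


def alpha_bar_schedule(
    num_steps: int, beta_start: float = 1e-4, beta_end: float = 0.02
) -> List[float]:
    """Cumulative products of (1 - beta_t) for a linear beta schedule."""
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0: Sequence[float], noise: Sequence[float], alpha_bar: float) -> List[float]:
    """Forward process: x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * noise."""
    keep = math.sqrt(alpha_bar)
    spread = math.sqrt(1.0 - alpha_bar)
    return [keep * x + spread * n for x, n in zip(x0, noise)]


def predict_x0(xt: Sequence[float], predicted_noise: Sequence[float], alpha_bar: float) -> List[float]:
    """Invert the forward process given an estimate of the injected noise."""
    keep = math.sqrt(alpha_bar)
    spread = math.sqrt(1.0 - alpha_bar)
    return [(x - spread * n) / keep for x, n in zip(xt, predicted_noise)]


alpha_bars = alpha_bar_schedule(1000)
pose = [0.3, -1.2, 0.8]      # toy joint angles
noise = [0.5, -0.25, 1.0]    # fixed here; sampled from a Gaussian in practice
noisy = add_noise(pose, noise, alpha_bars[499])
print(predict_x0(noisy, noise, alpha_bars[499]))
```

In a real sampler the noise estimate is imperfect, which is why generation proceeds through many small denoising steps instead of a single inversion.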
-
, , , , , , ,
Graph convolution on arbitrary-topology cloth and obstacle meshes predicts plausible 3D cloth deformation at 30-45 fps for up to 100K triangles.
abstract
We present a novel mesh-based learning approach (N-Cloth) for plausible 3D cloth deformation prediction. Our approach is general and can handle cloth or obstacles represented by triangle meshes with arbitrary topologies. We use graph convolution to transform the cloth and object meshes into a latent space to reduce the non-linearity in the mesh space. Our network can predict the target 3D cloth mesh deformation based on the initial state of the cloth mesh template and the target obstacle mesh. Our approach can handle complex cloth meshes with up to 100K triangles and scenes with various objects corresponding to SMPL humans, non-SMPL humans or rigid bodies. In practice, our approach can be used to generate plausible cloth simulation at 30 to 45 fps on an NVIDIA GeForce RTX 3090 GPU. We highlight its benefits over prior learning-based methods and physically-based cloth simulators.
Related GarMatNet: A Learning-Based Method for Predicting 3D Garment Mesh with Parameterized Materials · SMPLicit: Topology-aware Generative Model for Clothed People · A Pixel-Based Framework for Data-Driven Clothing · 3D Hair Synthesis Using Volumetric Variational Autoencoders
how to read this
- Category
- Method: mesh-based neural cloth deformation predictor
- Contributions
-
- Graph-convolution network that maps arbitrary-topology cloth and obstacle meshes into a latent space to reduce mesh-space non-linearity
- Predicts target 3D cloth deformation from the cloth template's initial state and the target obstacle mesh
- Handles complex cloth (up to 100K triangles) and SMPL/non-SMPL humans and rigid bodies at interactive rates (30 to 45 fps on an RTX 3090)
- Context
- Sits among learning-based cloth methods such as Bertiche et al. Neural Cloth Simulation (2022), differentiating by handling arbitrary mesh topologies via graph convolution. Builds on: Neural Cloth Simulation
- Correctness
- Claims plausible deformation and benefits over prior learning-based methods and physically-based simulators on the tested cloth and obstacle meshes; results are plausible predictions, not guaranteed physically accurate, and the reported fps is hardware-specific.
- Clarity
- Moderately technical; a first pass gives the graph-convolution latent idea, a second pass covers the network and topology handling.
- How to read it
- First pass for the arbitrary-topology graph-convolution framing; second pass on the latent-space mapping and the speed/accuracy tradeoff versus PBS if performance matters to you.
CFX / ML Deformation
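The reason graph convolution copes with arbitrary topology, as noted in the N-Cloth entry, is that it only needs each vertex's neighbor list. The sketch below builds that list from a triangle list and applies one mean-aggregation layer. It is a generic textbook-style layer with two scalar weights (`w_self`, `w_neigh`) chosen for readability; a learned layer would use weight matrices and a non-linearity, and nothing here reproduces the paper's architecture.

```python
from typing import List, Sequence, Tuple


def build_adjacency(num_vertices: int, triangles: Sequence[Tuple[int, int, int]]) -> List[List[int]]:
    """Collect, for every vertex, the vertices it shares a triangle edge with."""
    neighbors = [set() for _ in range(num_vertices)]
    for a, b, c in triangles:
        neighbors[a].update((b, c))
        neighbors[b].update((a, c))
        neighbors[c].update((a, b))
    return [sorted(n) for n in neighbors]


def graph_conv(
    features: Sequence[Sequence[float]],
    neighbors: Sequence[Sequence[int]],
    w_self: float,
    w_neigh: float,
) -> List[List[float]]:
    """One layer: mix each vertex's features with the mean of its neighbors'."""
    out = []
    for i, own in enumerate(features):
        mean = [0.0] * len(own)
        if neighbors[i]:  # isolated vertices keep a zero neighbor mean
            for j in neighbors[i]:
                mean = [m + f for m, f in zip(mean, features[j])]
            mean = [m / len(neighbors[i]) for m in mean]
        out.append([w_self * o + w_neigh * m for o, m in zip(own, mean)])
    return out


# Two triangles sharing the edge (1, 2); one scalar feature per vertex.
adjacency = build_adjacency(4, [(0, 1, 2), (1, 2, 3)])
print(graph_conv([[0.0], [1.0], [2.0], [3.0]], adjacency, 0.5, 0.5))
```

The same two functions run unchanged on any triangle mesh, since vertex count and connectivity enter only through the adjacency list.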
-
, ,
First unsupervised deep learning framework for garment dynamics using physics-inspired losses without ground-truth simulation data.
abstract
We present a general framework for the garment animation problem through unsupervised deep learning inspired by physically based simulation. Existing trends in the literature already explore this possibility. Nonetheless, these approaches do not handle cloth dynamics. Here, we propose the first methodology able to learn realistic cloth dynamics in an unsupervised manner, and hence a general formulation for neural cloth simulation. The key to achieving this is to adapt an existing optimization scheme for motion from simulation-based methodologies to deep learning. Then, analyzing the nature of the problem, we devise an architecture able to automatically disentangle static and dynamic cloth subspaces by design. We will show how this improves model performance. Additionally, this opens the possibility of a novel motion augmentation technique that greatly improves generalization. Finally, we show it also allows control over the level of motion in the predictions. This is a useful, never-before-seen tool for artists. We provide a detailed analysis of the problem to establish the bases of neural cloth simulation and guide future research into the specifics of this domain.
Related PBNS: Physically Based Neural Simulation for Unsupervised Garment Pose Space Deformation · SNUG: Self-Supervised Neural Dynamic Garments · Learning-Based Animation of Clothing for Virtual Try-On · Hair Modeling and Simulation by Style
how to read this
- Category
- Method: unsupervised neural garment dynamics
- Contributions
-
- First unsupervised deep-learning framework for realistic cloth dynamics, using physics-inspired losses with no ground-truth simulation data
- An architecture that by design disentangles static and dynamic cloth subspaces, with a motion-augmentation technique that improves generalization
- Provides artist control over the level of motion in predictions
- Context
- Extends the unsupervised physically-based-neural-simulation line, notably the authors' PBNS (2021), from static pose-space deformation to true cloth dynamics. Builds on: PBNS: Physically Based Neural Simulation for Unsupervised Garment Pose Space Deformation
- Correctness
- Trained without ground-truth data via physics-inspired losses, so realism depends on the loss formulation rather than reference simulation; the paper positions itself as establishing foundations, so treat results as a general formulation more than an exhaustively benchmarked solver.
- Clarity
- Analytical and reasonably accessible; a first pass conveys the unsupervised dynamics idea, a second pass covers the static/dynamic disentanglement and the optimization-from-simulation adaptation.
- How to read it
- First pass for the unsupervised dynamics premise and the static/dynamic split; second pass on the physics-inspired losses and motion augmentation if you intend to build on PBNS-style training.
CFX
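A physics-inspired loss of the kind the Neural Cloth Simulation entry refers to can be pictured as an energy evaluated on the predicted cloth, with no reference simulation involved. The sketch below sums a mass-spring stretch energy, gravitational potential, and an inertia term that penalizes deviation from constant-velocity motion, in the spirit of optimization-based time integration. The specific energies, the y-up convention, and every parameter name are assumptions made for illustration; the paper's actual loss terms are not reproduced here.

```python
import math
from typing import Sequence, Tuple

Points = Sequence[Sequence[float]]


def stretch_energy(positions: Points, edges: Sequence[Tuple[int, int]],
                   rest_lengths: Sequence[float], stiffness: float = 1.0) -> float:
    """Spring energy: 0.5 * k * (current edge length - rest length)^2 per edge."""
    total = 0.0
    for (i, j), rest in zip(edges, rest_lengths):
        length = math.sqrt(sum((a - b) ** 2 for a, b in zip(positions[i], positions[j])))
        total += 0.5 * stiffness * (length - rest) ** 2
    return total


def gravity_energy(positions: Points, masses: Sequence[float], g: float = 9.81) -> float:
    """Gravitational potential m * g * height, with y as the up axis."""
    return sum(m * g * p[1] for m, p in zip(masses, positions))


def inertia_energy(positions: Points, prev: Points, prev_prev: Points,
                   masses: Sequence[float], dt: float) -> float:
    """Penalize deviation from the constant-velocity prediction 2*prev - prev_prev."""
    total = 0.0
    for x, x1, x2, m in zip(positions, prev, prev_prev, masses):
        deviation = sum((a - (2.0 * b - c)) ** 2 for a, b, c in zip(x, x1, x2))
        total += m / (2.0 * dt * dt) * deviation
    return total


def unsupervised_loss(positions, prev, prev_prev, edges, rest_lengths, masses, dt):
    """Total energy a network output could be trained to minimize."""
    return (stretch_energy(positions, edges, rest_lengths)
            + gravity_energy(positions, masses)
            + inertia_energy(positions, prev, prev_prev, masses, dt))


# One edge stretched from rest length 1 to length 2, lying at height zero, at rest.
cloth = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]]
print(unsupervised_loss(cloth, cloth, cloth, [(0, 1)], [1.0], [1.0, 1.0], 1.0 / 30.0))
```

Dropping the inertia term leaves a purely static objective, which is one way to see the difference between pose-dependent draping and true dynamics.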
-
, , , , ,
Hybrid avatar representation combining a coarse morphable face model with two networks predicting mesh vertex offsets and view- and expression-dependent textures.
abstract
We present Neural Head Avatars, a novel neural representation that explicitly models the surface geometry and appearance of an animatable human avatar that can be used for teleconferencing in AR/VR or other applications in the movie or games industry that rely on a digital human (philgras.github.io/neural_head_avatars/neural_head_avatars.html). Our representation can be learned from a monocular RGB portrait video that features a range of different expressions and views. Specifically, we propose a hybrid representation consisting of a morphable model for the coarse shape and expressions of the face, and two feed-forward networks, predicting vertex offsets of the underlying mesh as well as a view- and expression-dependent texture. We demonstrate that this representation is able to accurately extrapolate to unseen poses and view points, and generates natural expressions while providing sharp texture details. Compared to previous works on head avatars, our method provides a disentangled shape and appearance model of the complete human head (including hair) that is compatible with the standard graphics pipeline. Moreover, it quantitatively and qualitatively outperforms current state of the art in terms of reconstruction quality and novel-view synthesis.
Related PointAvatar: Deformable Point-Based Head Avatars from Videos · Learning an Animatable Detailed 3D Face Model from In-The-Wild Images · Reconstruction of Personalized 3D Face Rigs from Monocular Video · FLARE: Fast Learning of Animatable and Relightable Mesh Avatars
how to read this
- Category
- Method: animatable neural head avatar from monocular video
- Contributions
-
- A hybrid head representation combining a coarse morphable face model with two feed-forward networks for mesh vertex offsets and a view- and expression-dependent texture
- Learns a complete head avatar (including hair) from a single monocular RGB portrait video
- Produces a disentangled shape/appearance model compatible with the standard graphics pipeline that extrapolates to unseen poses and viewpoints
- Context
- Builds on monocular morphable-model face reconstruction such as Feng et al. DECA (2021), adding learned geometry offsets and neural texture for a full animatable head. Builds on: Learning an Animatable Detailed 3D Face Model from In-The-Wild Images
- Correctness
- Reported to quantitatively and qualitatively outperform prior head-avatar work and extrapolate to unseen poses/views; it relies on a morphable-model prior and a single subject's monocular video, so coverage of expressions and views in that video bounds quality.
- Clarity
- Accessible; a first pass conveys the hybrid (model + two networks) design, a second pass covers the offset and texture network formulations.
- How to read it
- First pass for the morphable-model-plus-networks hybrid and why it stays graphics-pipeline compatible; second pass on the vertex-offset and view/expression texture networks if reconstructing avatars.
Facial
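The geometry half of the hybrid design in the Neural Head Avatars entry can be sketched as a linear morphable model for the coarse shape plus a per-vertex offset on top. In the code below the offsets are fixed numbers standing in for what a trained network would output, the single-vertex "mesh" and one-expression basis are toy data, and the function name is hypothetical; the texture network and the paper's actual parameterization are not covered.

```python
from typing import List, Sequence

Vec3 = Sequence[float]


def evaluate_head(
    mean_shape: Sequence[Vec3],
    expression_basis: Sequence[Sequence[Vec3]],
    expression_coeffs: Sequence[float],
    vertex_offsets: Sequence[Vec3],
) -> List[List[float]]:
    """Coarse morphable-model vertices plus fine per-vertex offsets."""
    if len(expression_basis) != len(expression_coeffs):
        raise ValueError("need one coefficient per expression basis")
    vertices = []
    for index, mean_vertex in enumerate(mean_shape):
        coarse = list(mean_vertex)
        # Linear blend of expression directions, as in a blendshape model.
        for basis, coeff in zip(expression_basis, expression_coeffs):
            coarse = [x + coeff * b for x, b in zip(coarse, basis[index])]
        # Fine detail the coarse model cannot express (network output in practice).
        vertices.append([x + o for x, o in zip(coarse, vertex_offsets[index])])
    return vertices


mean = [[0.0, 0.0, 0.0]]
smile = [[1.0, 0.0, 0.0]]      # one expression basis, one vertex
offsets = [[0.0, 0.1, 0.0]]
print(evaluate_head(mean, [smile], [0.5], offsets))
```

Because the result is still an ordinary vertex list on the original topology, it can be rasterized like any other mesh, which is what keeps such a representation compatible with a standard graphics pipeline.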
-
, , , , ,
Joint learning of explicit hair geometry and view-dependent appearance using a neural scalp texture encoding individual strand properties at each texel.
abstract
We present Neural Strands, a novel learning framework for modeling accurate hair geometry and appearance from multi-view image inputs. The learned hair model can be rendered in real-time from any viewpoint with high-fidelity view-dependent effects. Our model achieves intuitive shape and style control unlike volumetric counterparts. To enable these properties, we propose a novel hair representation based on a neural scalp texture that encodes the geometry and appearance of individual strands at each texel location. Furthermore, we introduce a novel neural rendering framework based on rasterization of the learned hair strands. Our neural rendering is strand-accurate and anti-aliased, making the rendering view-consistent and photorealistic. Combining appearance with a multi-view geometric prior, we enable, for the first time, the joint learning of appearance and explicit hair geometry from a multi-view setup. We demonstrate the efficacy of our approach in terms of fidelity and efficiency for various hairstyles.
Related Neural Haircut: Prior-Guided Strand-Based Hair Reconstruction · FLARE: Fast Learning of Animatable and Relightable Mesh Avatars · 3D Hair Synthesis Using Volumetric Variational Autoencoders · HAAR: Text-Conditioned Generative Model of 3D Strand-Based Human Hairstyles
how to read this
- Category
- Method: neural hair geometry and appearance from multi-view images
- Contributions
-
- A neural scalp texture representation encoding the geometry and appearance of individual strands at each texel
- A strand-accurate, anti-aliased neural rendering framework based on rasterizing the learned strands, giving view-consistent photorealistic results in real time
- Enables, for the first time, joint learning of explicit hair geometry and appearance from a multi-view setup, with intuitive shape and style control
- Context
- Builds on multi-view strand-level hair capture such as Nam et al. Strand-Accurate Multi-View Hair Capture (2019), adding learned appearance and neural rendering atop explicit geometry. Builds on: Strand-Accurate Multi-View Hair Capture
- Correctness
- Demonstrated across various hairstyles for fidelity and efficiency and contrasted with volumetric approaches for controllability; it depends on a multi-view capture setup and a geometric prior, so results are bounded by capture coverage and the prior's accuracy.
- Clarity
- Fairly technical; a first pass conveys the neural-scalp-texture plus strand-rasterization idea, a second pass covers the rendering and joint-learning formulation.
- How to read it
- First pass for the explicit-strand-plus-neural-appearance framing and why it beats volumetric for control; second pass on the scalp-texture encoding and strand-accurate rasterization if implementing hair capture.
CFX / ML Deformation
- NeuralHDHair: Automatic High-Fidelity Hair Modeling from a Single Image Using Implicit Neural Representations CVPR Academic 43 cites
, , , , ,
Implicit neural representation (IRHairNet) hierarchically infers 3D orientation and occupancy; GrowingNet generates high-fidelity strands from a single image.
abstract
Undoubtedly, high-fidelity 3D hair plays an indispensable role in digital humans. However, existing monocular hair modeling methods are either tricky to deploy in digital systems (e.g., due to their dependence on complex user interactions or large databases) or can produce only a coarse geometry. In this paper, we introduce NeuralHDHair, a flexible, fully automatic system for modeling high-fidelity hair from a single image. The key enablers of our system are two carefully designed neural networks: an IRHairNet (Implicit representation for hair using neural network) for inferring high-fidelity 3D hair geometric features (3D orientation field and 3D occupancy field) hierarchically and a GrowingNet (Growing hair strands using neural network) to efficiently generate 3D hair strands in parallel. Specifically, we work in a coarse-to-fine manner and propose a novel voxel-aligned implicit function (VIFu) to represent the global hair feature, which is further enhanced by the local details extracted from a hair luminance map. To improve the efficiency of a traditional hair growth algorithm, we adopt a local neural implicit function to grow strands based on the estimated 3D hair geometric features.
Related Single-View Hair Modeling Using a Hairstyle Database · Neural Haircut: Prior-Guided Strand-Based Hair Reconstruction · Structure-Aware Hair Capture · SCANimate: Weakly Supervised Learning of Skinned Clothed Avatar Networks
how to read this
- Category
- Method: single-image hair modeling via implicit neural representations
- Contributions
-
- IRHairNet, which hierarchically infers a 3D orientation field and 3D occupancy field from one image in a coarse-to-fine manner
- A voxel-aligned implicit function (VIFu) for the global hair feature, enhanced by local details from a hair luminance map
- GrowingNet, a local neural implicit function that grows 3D strands in parallel for efficiency
- Context
- Builds on learned single-view hair reconstruction such as HairNet (Zhou et al. 2018), moving from convolutional regression toward implicit-function representations of hair geometry. Builds on: HairNet: Single-View Hair Reconstruction Using Convolutional Neural Networks
- Correctness
- Demonstrated as a fully automatic single-image system aiming for high-fidelity strands without heavy user interaction or large databases, but single-view inference of occluded interior hair remains an inherently ill-posed reconstruction problem to keep in mind.
- Clarity
- Reasonably accessible; a first pass conveys the coarse-to-fine pipeline, a second pass is needed for the VIFu and GrowingNet formulations.
- How to read it
- Focus first on how the orientation and occupancy fields are defined and on the VIFu idea; do a second pass on GrowingNet if you care about strand-synthesis efficiency.
CFX / ML Deformation
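To see how an orientation field and an occupancy field together define strands, as in the NeuralHDHair entry, the sketch below traces one strand the traditional way: step along the local direction until the point leaves the occupied region. This is the sequential baseline the entry says GrowingNet replaces with a neural implicit function, not GrowingNet itself. The two field callables, the step size, and the toy fields (hair pointing straight down, occupied wherever y is non-negative) are hypothetical.

```python
import math
from typing import Callable, List, Sequence

Vec3 = Sequence[float]


def grow_strand(
    root: Vec3,
    orientation_at: Callable[[Vec3], Vec3],
    occupied_at: Callable[[Vec3], bool],
    step: float = 0.25,
    max_points: int = 50,
) -> List[List[float]]:
    """Trace a polyline from the root along the orientation field."""
    point = list(root)
    strand = [point]
    while len(strand) < max_points:
        direction = orientation_at(point)
        norm = math.sqrt(sum(d * d for d in direction))
        if norm == 0.0:
            break  # no defined growth direction here
        candidate = [p + step * d / norm for p, d in zip(point, direction)]
        if not occupied_at(candidate):
            break  # the next sample would leave the hair volume
        strand.append(candidate)
        point = candidate
    return strand


def downward(_point: Vec3) -> Vec3:
    """Toy orientation field: hair hangs straight down everywhere."""
    return [0.0, -1.0, 0.0]


def above_ground(point: Vec3) -> bool:
    """Toy occupancy field: occupied wherever y is non-negative."""
    return point[1] >= 0.0


print(len(grow_strand([0.0, 1.0, 0.0], downward, above_ground)))
```

Each strand here depends only on its own root and the two fields, so strands are independent of one another and can in principle be grown in parallel.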
-
, , , , , , , ,
Parametric hand model comprising 20 bone meshes and 7 tetrahedral muscle groups built from annotated MRI data, enabling anatomy-aware hand pose estimation.
abstract
Emerging Metaverse applications demand reliable, accurate, and photorealistic reproductions of human hands to perform sophisticated operations as if in the physical world. While the real human hand involves one of the most intricate coordinations between bones, muscles, tendons, and skin, state-of-the-art techniques unanimously focus on modeling only the skeleton of the hand. In this paper, we present NIMBLE, a novel parametric hand model that includes the missing key components, bringing 3D hand models to a new level of realism. We first annotate muscles, bones, and skin on the recent Magnetic Resonance Imaging hand (MRI-Hand) dataset [Li et al. 2021] and then register a volumetric template hand onto individual poses and subjects within the dataset. NIMBLE consists of 20 bones as triangular meshes, 7 muscle groups as tetrahedral meshes, and a skin mesh. Via iterative shape registration and parameter learning, it further produces shape blend shapes, pose blend shapes, and a joint regressor. We demonstrate applying NIMBLE to modeling, rendering, and visual inference tasks. By enforcing the inner bones and muscles to match anatomic and kinematic rules, NIMBLE can animate 3D hands to new poses at unprecedented realism. To model the appearance of skin, we further construct a photometric HandStage to acquire high-quality textures and normal maps to model wrinkles and palm prints.
Related Data-Driven Physics for Human Soft Tissue Animation · BOSS: Bones, Organs and Skin Shape Model · Steklov-Poincare Skinning · Hand Modeling and Simulation Using Stabilized Magnetic Resonance Imaging
how to read this
- Category
- Method: anatomy-aware parametric hand model
- Contributions
-
- NIMBLE, a parametric hand model adding 20 bone meshes and 7 tetrahedral muscle groups beneath a skin mesh, beyond skeleton-only models
- Annotation of muscles, bones, and skin on an MRI-Hand dataset plus registration of a volumetric template across poses and subjects
- Learned shape and pose blend shapes and a joint regressor that enforce anatomic and kinematic rules for modeling, rendering, and inference
- Context
- Extends prior MRI-based hand modeling (e.g., Wang et al. 2019) by going from skeleton-only representations to a full bones-muscles-skin parametric model. Builds on: Hand Modeling and Simulation Using Stabilized Magnetic Resonance Imaging
- Correctness
- Built and registered from a specific MRI hand dataset, so realism is tied to that captured population and annotation quality, and animation plausibility depends on the enforced anatomic and kinematic constraints rather than full soft-tissue simulation.
- Clarity
- Accessible motivation; a first pass conveys the anatomy-aware idea, a second pass is needed for the registration and blend-shape learning details.
- How to read it
- Read first for what the model contains and how it differs from skeleton-only hands; do a second pass on the iterative registration and parameter learning if you intend to use or rebuild it.
Muscles / Skinning
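A joint regressor of the kind listed in the NIMBLE entry is commonly a matrix that places each joint at a weighted average of mesh vertices. The sketch below shows that convention on a three-vertex toy mesh; the weights are made up, the function name is hypothetical, and the way NIMBLE learns or structures its regressor is not something this example claims to reproduce.

```python
from typing import List, Sequence


def regress_joints(
    vertices: Sequence[Sequence[float]],
    regressor: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Each joint is a weighted combination of vertices: joints = regressor @ vertices."""
    joints = []
    for row in regressor:
        if len(row) != len(vertices):
            raise ValueError("each regressor row needs one weight per vertex")
        joint = [0.0, 0.0, 0.0]
        for weight, vertex in zip(row, vertices):
            joint = [j + weight * c for j, c in zip(joint, vertex)]
        joints.append(joint)
    return joints


# Toy mesh: the first joint sits midway between vertices 0 and 1,
# the second joint coincides with vertex 2.
mesh = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 4.0, 0.0]]
weights = [[0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]
print(regress_joints(mesh, weights))
```

Because joints are derived from the shaped mesh, changing the shape blend-shape coefficients moves the joints with it, so the skeleton stays consistent with the surface.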
-
, , ,
Learns to predict internal skeletal bone geometry from an external SMPL body surface, trained on 2,000 DXA scans pairing body shape with skeleton.
abstract
We present OSSO, the first method to learn the mapping from the 3D body surface to the internal skeletal anatomy from real data. Using 2,000 dual-energy X-ray absorptiometry (DXA) scans, a parametric 3D body shape model (STAR) captures the body surface and a novel part-based 3D skeleton model captures the bones. OSSO can predict a realistic skeleton for arbitrary body shapes and poses, satisfying physical plausibility constraints. Code and the paired skin/bone mesh dataset are publicly available.
Related Data-driven Modeling of Skin and Muscle Deformation · Capturing and Animating Skin Deformation in Human Motion · BOSS: Bones, Organs and Skin Shape Model · TailorMe: Self-Supervised Learning of an Anatomically Constrained Volumetric Human Shape Model
how to read this
- Category
- Method: predicting internal skeleton from body surface
- Contributions
-
- OSSO, presented as the first method to learn the mapping from a 3D body surface to internal skeletal anatomy from real data
- A novel part-based 3D skeleton model paired with the STAR body model, learned from 2,000 DXA scans
- Public release of code and a paired skin/bone mesh dataset
- Context
- Builds on the STAR parametric body model (Osman et al. 2020), pairing its surface shape with a learned skeleton to recover inside-the-body anatomy. Builds on: STAR: Sparse Trained Articulated Human Body Regressor
- Correctness
- Trained on 2,000 DXA scans and constrained for physical plausibility, so predictions reflect that scan population and the imposed constraints; bone geometry is inferred from the outside rather than directly measured for arbitrary new subjects.
- Clarity
- Accessible and well-scoped; a first pass conveys the surface-to-skeleton goal, a second pass covers the part-based skeleton model.
- How to read it
- Focus on the data pairing and the part-based skeleton representation; a single careful pass suffices unless you plan to use the released dataset, then do a second pass on the training setup.
Muscles / Skinning
-
Combines NLP with physics-based character control so natural-language commands specify both high-level tasks and low-level skills.
abstract
Developing systems that can synthesize natural and life-like motions for simulated characters has long been a focus for computer animation. But for these systems to be useful for downstream applications, they must not only produce high-quality motions but also provide an accessible and versatile interface through which users can direct a character’s behaviors. Natural language provides a simple-to-use and expressive medium for specifying a user’s intent. Recent breakthroughs in natural language processing (NLP) have demonstrated effective use of language-based interfaces for applications such as image generation and program synthesis. In this work, we present PADL, which leverages recent innovations in NLP in order to take steps towards developing language-directed controllers for physics-based character animation. PADL allows users to issue natural language commands for specifying both high-level tasks and low-level skills that a character should perform. We present an adversarial imitation learning approach for training policies to map high-level language commands to low-level controls that enable a character to perform the desired task and skill specified by a user’s commands.
Related SuperPADL: Scaling Language-Directed Physics-Based Control with Progressive Supervised Distillation · CALM: Conditional Adversarial Latent Models for Directable Virtual Characters · AMP: Adversarial Motion Priors for Stylized Physics-Based Character Control · UniCon: Universal Neural Controller for Physics-Based Character Motion
how to read this
- Category
- Method: language-directed physics-based character control
- Contributions
-
- PADL, a system that lets users issue natural-language commands specifying both high-level tasks and low-level skills for a simulated character
- An adversarial imitation learning approach that maps language commands to low-level controls
- Steps toward an accessible language-based interface for physics-based animation, drawing on recent NLP advances
- Context
- Combines natural-language interfaces with adversarial skill embeddings for physically simulated characters in the lineage of ASE (Peng et al. 2022). Builds on: ASE: Large-Scale Reusable Adversarial Skill Embeddings for Physically Simulated Characters
- Correctness
- Validated within simulated physics-based control, so behavior coverage is bounded by the imitation data and the skills the policy learned, and language grounding is only as broad as the trained command set.
- Clarity
- Accessible framing; a first pass conveys the language-to-control idea, a second pass is needed for the adversarial imitation training.
- How to read it
- Read first for how language maps to high-level tasks versus low-level skills; second pass on the imitation-learning objective if you want to reproduce or extend the controller.
Motion Synthesis
-
Conditional VAEs learned from motion capture data drive physics-based controllers that generate minutes of task-agnostic motion and make downstream tasks efficient to solve.
abstract
High-quality motion capture datasets are now publicly available, and researchers have used them to create kinematics-based controllers that can generate plausible and diverse human motions without conditioning on specific goals (i.e., a task-agnostic generative model). In this paper, we present an algorithm to build such controllers for physically simulated characters having many degrees of freedom. Our physics-based controllers are learned by using conditional VAEs, which can perform a variety of behaviors that are similar to motions in the training dataset. The controllers are robust enough to generate more than a few minutes of motion without conditioning on specific goals and to allow many complex downstream tasks to be solved efficiently. To show the effectiveness of our method, we demonstrate controllers learned from several different motion capture databases and use them to solve a number of downstream tasks for which it is challenging to learn, from scratch, controllers that generate natural-looking motions. We also perform ablation studies to demonstrate the importance of the elements of the algorithm. Code and data for this paper are available at: https://github.com/facebookresearch/PhysicsVAE
Related Physical Based Motion Reconstruction From Videos Using Musculoskeletal Model · C·ASE: Learning Conditional Adversarial Skill Embeddings for Physics-based Characters · DReCon: Data-Driven Responsive Control of Physics-Based Characters · DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills
how to read this
- Category
- Method: physics-based character controller from a conditional VAE
- Contributions
-
- An algorithm to build physics-based controllers for high-DOF simulated characters using conditional VAEs over motion capture data
- A task-agnostic generative controller robust enough to produce minutes of motion without goal conditioning and to support diverse downstream tasks
- Ablation studies isolating the importance of each algorithmic element, with code and data released
- Context
- Extends latent-variable motion modeling such as Motion VAEs (Ling et al. 2020) into the physically simulated, high-DOF control setting. Builds on: Character Controllers Using Motion VAEs
- Correctness
- Demonstrated on several motion capture databases and downstream tasks, so the behaviors it can express are inherited from the training motions; the ablations support the design choices but generalization beyond captured styles is not assumed.
- Clarity
- Accessible; a first pass conveys how the cVAE latent space drives the controller, a second pass covers the training and ablation details.
- How to read it
- Focus on the conditional VAE structure and how the latent conditions the physics controller; do a second pass on the ablations if you are designing a similar system.
Motion Synthesis
-
Forward simulation method for efficient preview of cloth quasistatics on very coarse triangle meshes, with consistent and progressive refinement over a hierarchy of higher-resolution models.
abstract
Progressive Cloth Simulation (PCS) is a forward simulation method for efficient preview of cloth quasistatics on very coarse triangle meshes, with consistent and progressive refinement over a hierarchy of higher-resolution models. Coarse previews remain close to the converged high-resolution drape, so designers can interactively inspect results that reliably predict the final folds, wrinkles, and draping. The method progressively improves the solution across the mesh hierarchy while preserving consistent behavior throughout upsampling.
Related Subspace Clothing Simulation Using Adaptive Bases · Multi-Resolution Isotropic Strain Limiting · Directing Cloth Draping through Blended UVs · Adaptive Anisotropic Remeshing for Cloth Simulation
how to read this
- Category
- Method: progressive cloth quasistatics simulation
- Contributions
-
- Progressive Cloth Simulation (PCS), a forward method for efficient preview of cloth quasistatics on very coarse triangle meshes
- Consistent, progressive refinement across a hierarchy of higher-resolution models so coarse previews reliably predict the converged drape
- Interactive inspection of folds, wrinkles, and draping that remains close to the final high-resolution result
- Context
- Sits in the cloth simulation lineage going back to implicit-integration foundations like Large Steps in Cloth Simulation (Baraff and Witkin 1998), here targeting progressive quasistatic preview. Builds on: Large Steps in Cloth Simulation
- Correctness
- Targets cloth quasistatics (static drape) rather than full dynamics, and the claim is that coarse previews stay close to the converged high-resolution drape; readers should note it is a preview-to-refinement workflow, not a substitute for dynamic simulation.
- Clarity
- Accessible problem statement; a first pass conveys the progressive-preview idea, a second pass is needed for the hierarchy and convergence formulation.
- How to read it
- Read first for the quasistatics scope and the coarse-to-fine consistency guarantee; second pass on the optimization over the mesh hierarchy if you implement it.
CFX
- QuestSim: Human Motion Tracking from Sparse Sensors with Simulated Avatars · SIGGRAPH Asia · Academic · 157 cites
Deep RL drives a full-body simulated avatar in real time using only VR headset and controller signals as input.
abstract
Real-time tracking of human body motion is crucial for interactive and immersive experiences in AR/VR. However, very limited sensor data about the body is available from standalone wearable devices such as HMDs (Head Mounted Devices) or AR glasses. In this work, we present a reinforcement learning framework that takes in sparse signals from an HMD and two controllers, and simulates plausible and physically valid full body motions. Using high quality full body motion as dense supervision during training, a simple policy network can learn to output appropriate torques for the character to balance, walk, and jog, while closely following the input signals. Our results demonstrate surprisingly similar leg motions to ground truth without any observations of the lower body, even when the input is only the 6D transformations of the HMD. We also show that a single policy can be robust to diverse locomotion styles, different body sizes, and novel environments.
Related UniCon: Universal Neural Controller for Physics-Based Character Motion · Physics-Based Motion Retargeting from Sparse Inputs · DReCon: Data-Driven Responsive Control of Physics-Based Characters · SuperTrack: Motion Tracking for Physically Simulated Characters Using Supervisory Signals
how to read this
- Category
- Method: full-body motion tracking from sparse VR sensors
- Contributions
-
- A reinforcement learning framework that drives a physically valid full-body simulated avatar from only HMD and two controller signals in real time
- Use of high-quality full-body motion as dense supervision so a simple policy learns torques to balance, walk, and jog while following the input
- A single policy shown to be robust across diverse locomotion styles, body sizes, and novel environments, including plausible leg motion from upper-body input alone
- Context
- Builds on physics-based motion tracking with supervisory signals such as SuperTrack (Fussell et al. 2021), specialized to the sparse three-point VR input case. Builds on: SuperTrack: Motion Tracking for Physically Simulated Characters Using Supervisory Signals
- Correctness
- Lower-body motion is inferred without lower-body observations, so plausible leg motion is a learned prior conditioned on sparse upper-body signals rather than a measurement; accuracy on motions far from the training distribution is not guaranteed.
- Clarity
- Accessible; a first pass conveys the sparse-input-to-full-body idea, a second pass covers the RL training and supervision.
- How to read it
- Focus on what signals are available versus inferred and how dense supervision is used; second pass on the policy and reward design if you build sparse-input avatars.
Motion Synthesis / Retargeting
-
Two-stage CNN framework that predicts dynamic guide-hair interpolation weights and then adds fine-scale displacements, simulating a variety of hairstyles at interactive speed.
abstract
Traditionally, reduced hair simulation methods are either restricted to heuristic approximations or bound to specific hairstyles. We introduce the first CNN-integrated framework for simulating various hairstyles. The approach produces visually realistic hair at interactive speed. To address the technical challenges, our hair simulation pipeline is designed as a two-stage process. First, we present a fully-convolutional neural interpolator as the backbone generator to compute dynamic weights for guide hair interpolation. Then, we adopt a second generator to produce fine-scale displacements to enhance the hair details. We train the neural interpolator with a dedicated loss function and the displacement generator with an adversarial discriminator. Experimental results demonstrate that our method is effective, efficient, and superior to the state-of-the-art on a wide variety of hairstyles. We further propose a performance-driven digital avatar system and an interactive hairstyle editing tool to illustrate the practical applications.
Related A Pixel-Based Framework for Data-Driven Clothing · Stable Spaces for Real-time Clothing · PBNS: Physically Based Neural Simulation for Unsupervised Garment Pose Space Deformation · Cloth and Skin Deformation with a Triangle Mesh Based Convolutional Neural Network
how to read this
- Category
- Method: real-time hair simulation via neural interpolation
- Contributions
-
- The first CNN-integrated framework presented for simulating a variety of hairstyles at interactive speed
- A two-stage pipeline with a fully-convolutional neural interpolator for guide-hair interpolation weights and a second generator for fine-scale displacements
- A performance-driven digital avatar system and an interactive hairstyle editing tool as applications
- Context
- Builds on reduced-model and interactive hair simulation such as A Reduced Model for Interactive Hairs (Chai et al. 2014), replacing heuristic reduction with learned interpolation. Builds on: A Reduced Model for Interactive Hairs
- Correctness
- Trained with a dedicated loss plus an adversarial discriminator and reported superior across many hairstyles, but as a learned interpolator its fidelity is bounded by the training simulations and chosen guide hairs rather than a from-scratch physical solve.
- Clarity
- Accessible; a first pass conveys the two-stage interpolate-then-detail design, a second pass covers the loss and discriminator.
- How to read it
- Read first for the two-stage architecture and what each generator produces; second pass on the training losses if you want to reproduce the real-time quality.
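The two-stage design described above can be pictured as "blend the guide hairs, then add detail". The sketch below shows only that final assembly step for one strand. The weights and displacements are passed in as plain lists, standing in for what the two generators would predict per frame; the function name and data layout are assumptions made for illustration, not the paper's interface.

```python
def interpolate_strand(guide_strands, weights, displacements=None):
    """Blend guide strands point by point, then add optional fine-scale offsets.

    guide_strands: list of strands, each a list of [x, y, z] points (same length)
    weights:       one non-negative weight per guide strand (normalized here)
    displacements: optional per-point [dx, dy, dz] detail offsets
    """
    total = sum(weights)
    normalized = [w / total for w in weights]
    n_points = len(guide_strands[0])
    strand = []
    for p in range(n_points):
        point = [0.0, 0.0, 0.0]
        for w, guide in zip(normalized, guide_strands):
            for axis in range(3):
                point[axis] += w * guide[p][axis]
        if displacements is not None:
            for axis in range(3):
                point[axis] += displacements[p][axis]
        strand.append(point)
    return strand


# Hypothetical data: two straight guide hairs two units apart.
guide_a = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
guide_b = [[2.0, 0.0, 0.0], [2.0, 1.0, 0.0]]
coarse = interpolate_strand([guide_a, guide_b], weights=[1.0, 3.0])
detailed = interpolate_strand(
    [guide_a, guide_b],
    weights=[1.0, 3.0],
    displacements=[[0.0, 0.0, 0.0], [0.0, 0.0, 0.1]],
)
```

In the paper the weights change over time (they are "dynamic"), which is what separates this from a fixed, precomputed guide-hair binding.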
CFX / ML Deformation
-
Feather modeling and rendering framework that abstracts hierarchical fiber geometry and structural color into a microfacet-like BSDF with efficient importance sampling, validated against measurements of rock dove neck feathers.
abstract
Bird feathers exhibit fascinating reflectance governed by fiber-like structures whose hierarchical patterns span many orders of magnitude in scale, with non-cylindrical fiber cross-sections and regular nanostructures that produce rich structural color. This work introduces a feather modeling and rendering framework that abstracts the microscopic geometry and reflectance into a microfacet-like BSDF requiring no precomputation or storage and supporting efficient importance sampling. The model is validated against a BSDF-capturing setup for small biological structures and calibrated photographs of rock dove neck feathers.
Related A Surface-based Appearance Model for Pennaceous Feathers · Appearance Modeling of Iridescent Feathers with Diverse Nanostructures · A Practical Extension to Microfacet Theory for the Modeling of Varying Iridescence · Modeling and Rendering of Realistic Feathers
how to read this
- Category
- Method: appearance model for iridescent feathers
- Contributions
-
- A feather modeling and rendering framework that abstracts hierarchical fiber geometry and structural color into a microfacet-like BSDF
- A model requiring no precomputation or storage and supporting efficient importance sampling
- Validation against a BSDF-capturing setup for small biological structures and calibrated photographs of rock dove neck feathers
- Context
- Extends microfacet and structural-color theory, in the lineage of practical iridescence models such as Belcour and Barla (2017), to non-cylindrical feather fibers with nanostructures. Builds on: A Practical Extension to Microfacet Theory for the Modeling of Varying Iridescence
- Correctness
- Validated against captured BSDF measurements and calibrated photographs of rock dove neck feathers, so it is grounded in real data for that species; generalization to other feather types and scales is an abstraction of the underlying micro-geometry rather than a full first-principles simulation.
- Clarity
- Specialized; a first pass conveys the BSDF abstraction goal, a second pass is needed for the microfacet derivation and sampling.
- How to read it
- Focus first on how multi-scale fiber structure is collapsed into a sampleable BSDF; do a second pass on the importance-sampling and validation if you are implementing the shader.
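The entry notes that the feather BSDF supports efficient importance sampling but does not spell out the routine. As a generic illustration of what that means for a microfacet-like lobe, the sketch below uses the standard isotropic GGX distribution as a stand-in; it is not the paper's feather lobe, and the function names are hypothetical. The sampler inverts the CDF of the density D(h)·cos(θ), so uniform random numbers map directly to half-vector angles with no rejection and no stored tables.

```python
import math


def ggx_ndf(cos_theta, alpha):
    """Isotropic GGX normal distribution D(h) for roughness alpha."""
    a2 = alpha * alpha
    denom = cos_theta * cos_theta * (a2 - 1.0) + 1.0
    return a2 / (math.pi * denom * denom)


def sample_ggx_cos_theta(u, alpha):
    """Map a uniform u in [0, 1) to cos(theta) of a half vector
    distributed with density D(h) * cos(theta) over the hemisphere."""
    a2 = alpha * alpha
    return math.sqrt((1.0 - u) / (1.0 + (a2 - 1.0) * u))


def ggx_cdf(cos_theta, alpha):
    """Closed-form CDF of theta under the same density (used to check the sampler)."""
    a2 = alpha * alpha
    c = cos_theta * cos_theta
    return (1.0 - c) / (c * (a2 - 1.0) + 1.0)


# Round trip: sampling with u and evaluating the CDF should return u.
alpha = 0.3
round_trip = [ggx_cdf(sample_ggx_cos_theta(u, alpha), alpha) for u in (0.1, 0.5, 0.9)]
```

A lobe with an analytic inverse like this is what makes "no precomputation or storage" compatible with low-variance rendering.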
CFX
-
Reviews the evolution of Pixar's cloth authoring tools since 2001 and presents the updated tailoring pipeline deployed on Turning Red and Lightyear.
abstract
This work presents the most recent updates to the cloth tailoring pipeline at Pixar. We start by reviewing the evolution of cloth authoring tools used at Pixar from 2001 to the present day. Motivated by previous approaches, we introduce a structured workflow for cloth tailoring that manages multiple mesh versions concurrently. In our implementation, artists interact primarily with a low-resolution quad-dominant mesh, which defines the garment look as well as setups for rigging and simulation. Our system then converts this coarse input model into a triangulated mesh for simulation and a quadrangulated subdivision surface for rendering. To this end, we developed a new remeshing tool that outputs surface triangulations with adaptive resolution and conforming to edge constraints. We also devised procedural routines to generate render meshes by applying fold-over thickness, refining the mesh, and inserting seams. In addition, we introduced a suite of algorithms for transferring input attributes onto the derived meshes, including UV shells, face colors, crease edges, and vertex weights. Our revamped pipeline was deployed on Pixar’s feature films Turning Red and Lightyear, producing hundreds of high-quality garment meshes.
Related Art-Directed Costumes at Pixar: Design, Tailoring, and Simulation in Production · Simulating Wind Effects on Cloth and Hair in Disney's Frozen · USD in Production · Zero to USD in 80 Days
how to read this
- Category
- Production talk: Pixar cloth tailoring pipeline
- Contributions
-
- Demonstrates a structured cloth-tailoring workflow where artists drive a low-resolution quad-dominant mesh that defines look, rigging, and simulation setup
- Shows automatic conversion to a simulation triangulation and a quadrangulated subdivision render surface, including a new adaptive remeshing tool conforming to edge constraints
- Presents procedural render-mesh generation (fold-over thickness, refinement, seam insertion) and attribute transfer for UVs, face colors, creases, and vertex weights
- Context
- Continues Pixar's costume and cloth production lineage (e.g., Art-Directed Costumes at Pixar, de Goes et al. 2018), reviewing tool evolution from 2001 to the present. Builds on: Art-Directed Costumes at Pixar: Design, Tailoring, and Simulation in Production
- Correctness
- Studio practice rather than peer-reviewed method; results are production-proven, deployed on Turning Red and Lightyear, so trade-offs reflect Pixar's specific pipeline and tooling rather than a general benchmark.
- Clarity
- Accessible and practitioner-oriented; a first pass conveys the multi-mesh workflow, a second pass surfaces the remeshing and attribute-transfer specifics.
- How to read it
- Read for the concurrent multi-mesh-version concept and how one coarse mesh feeds sim and render; revisit the remeshing and attribute-transfer sections if you build authoring tools.
CFX
-
Procedural length-preserving rope rig handling over 5000 ropes on the hero tall ship in The Sea Beast with intuitive animator controls.
abstract
This talk presents an animation friendly, procedural solution for animating ropes in the movie The Sea Beast. With over 5000 ropes on our hero tall ship, we embarked on development of a better rope solution for our animators. The resulting rope rig would allow our animators to interact with ropes using intuitive controls, while producing complex shapes and preserving length. Character interactions with ropes were easy to produce, and stretchy, rubbery ropes avoided. This new rope rig was critical to the believability of our world, considering the massive number of ropes, in so many shots. With this new rig, we were even able to final some shots in animation, without having to simulate the rope dynamics at all.
Related Skunk: DreamWorks Fur Motion System · Improv: A System for Scripting Interactive Actors in Virtual Worlds · Real-Time Motion Retargeting to Highly Varied User-Created Morphologies
how to read this
- Category
- Production talk / rigging breakdown
- Contributions
-
- A procedural, length-preserving rope rig deployed at scale (over 5000 ropes on one hero ship)
- Intuitive animator controls that produce complex rope shapes and easy character interactions
- Enough animation control that some shots could be finaled without simulating rope dynamics
- Context
- A production rigging effort for The Sea Beast, sitting in the lineage of character-friendly procedural rig tools that aim to keep dynamic elements controllable by animators rather than left to simulation.
- Correctness
- Studio practice, not peer-reviewed; results are production-proven on one film's tall-ship shots, so the approach's generality beyond ropes and that show's needs is not established here.
- Clarity
- Accessible; a single first pass conveys the workflow and intent, with little formal math to revisit.
- How to read it
- Read once for the workflow and the animator-control philosophy; focus on how length preservation and shape control are exposed to animators, and watch the accompanying video for the believability claims.
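The talk does not describe how the rig preserves length, so the following is only a generic sketch of one common way to keep a rope from stretching: walk along the animator-shaped polyline and re-project each point so every segment keeps its rest length. The function names are hypothetical and this is not presented as the studio's method.

```python
import math


def preserve_segment_lengths(points, rest_lengths):
    """Return a copy of the polyline whose i-th segment has length rest_lengths[i].

    The first point stays pinned; each later point is moved along the
    direction from the already-corrected previous point.
    """
    fixed = [list(points[0])]
    for i in range(1, len(points)):
        prev = fixed[-1]
        direction = [points[i][axis] - prev[axis] for axis in range(3)]
        norm = math.sqrt(sum(d * d for d in direction))
        if norm < 1e-12:
            # Degenerate segment: fall back to an arbitrary axis.
            direction, norm = [1.0, 0.0, 0.0], 1.0
        scale = rest_lengths[i - 1] / norm
        fixed.append([prev[axis] + direction[axis] * scale for axis in range(3)])
    return fixed


def polyline_length(points):
    """Sum of segment lengths along the polyline."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


# An animator pose that stretches a 2-unit rope to 5 units.
posed = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [2.0, 3.0, 0.0]]
corrected = preserve_segment_lengths(posed, rest_lengths=[1.0, 1.0])
```

The corrected rope keeps the overall shape the animator drew while avoiding the "stretchy, rubbery" look the abstract mentions.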
Rigging / CFX
-
Builds a volumetric musculoskeletal hand model from MRI data that matches scanned geometry across the full range of motion.
abstract
Precision modeling of the hand's internal musculoskeletal anatomy has been largely limited to individual poses, and has not been connected into continuous volumetric motion of the hand anatomy actuating across the hand's entire range of motion. This is for a good reason, as hand anatomy and its motion are extremely complex and cannot be predicted merely from the anatomy in a single pose. We give a method to simulate the volumetric shape of the hand's musculoskeletal organs to any pose in the hand's range of motion, producing external hand shapes and internal organ shapes that match ground truth optical scans and medical images (MRI) in multiple scanned poses. We achieve this by combining MRI images in multiple hand poses with FEM multibody nonlinear elastoplastic simulation. Our system models bones, muscles, tendons, joint ligaments and fat as separate volumetric organs that mechanically interact through contact and attachments, and whose shape matches medical images (MRI) in the MRI-scanned hand poses. The match to MRI is achieved by incorporating pose-space deformation and plastic strains into the simulation. We show how to do this in a non-intrusive manner that still retains all the simulation benefits, namely the ability to prescribe realistic material properties, generalize to arbitrary poses, preserve volume and obey contacts and attachments.
Related EMU: Efficient Muscle Simulation in Deformation Space · Hand Modeling and Simulation Using Stabilized Magnetic Resonance Imaging · Pose-Space Subspace Dynamics · Anatomically Detailed Simulation of Human Torso
how to read this
- Category
- Method: anatomical hand simulation from medical imaging
- Contributions
-
- A method to simulate the volumetric musculoskeletal hand across its full range of motion, not just isolated poses
- Separate volumetric organs (bones, muscles, tendons, ligaments, fat) that interact via contact and attachments in FEM multibody nonlinear elastoplastic simulation
- Matches optical scans and MRI in scanned poses by incorporating pose-space deformation and plastic strains
- Context
- Extends MRI-driven hand modeling (Wang et al., Hand Modeling and Simulation Using Stabilized MRI, 2019) from single-pose anatomy toward continuous volumetric motion across the range of motion. Builds on: Hand Modeling and Simulation Using Stabilized Magnetic Resonance Imaging
- Correctness
- Validation is against MRI and optical scans in a set of captured poses; the match to ground truth is demonstrated for those scanned poses, so accuracy for the in-between poses rests on the simulation and pose-space deformation generalizing, which a reader should keep in mind.
- Clarity
- Dense; a first pass gives the organ-based modeling idea, but the FEM, plasticity, and pose-space formulation need a careful second pass.
- How to read it
- First pass for the modeling decomposition and data pipeline; do a second pass on the FEM elastoplastic formulation and how plastic strains plus pose-space deformation force the MRI match if you intend to reimplement.
Muscles
-
Composes new rigged and skinned characters by mixing body parts from production-ready animated models.
abstract
We propose a novel technique to compose new 3D animated models, such as videogame characters, by combining pieces from existing ones. Our method works on production-ready rigged, skinned, and animated 3D models to reassemble new ones. We exploit mix-and-match operations on the skeletons to trigger the automatic creation of a new mesh, linked to the new skeleton by a set of skinning weights and complete with a set of animations. The resulting model preserves the quality of the input meshings (which can be quad-dominant and semi-regular), skinning weights (inducing believable deformation), and animations, featuring coherent movements of the new skeleton. Our method enables content creators to reuse valuable, carefully designed assets by assembling new ready-to-use characters while preserving most of the hand-crafted subtleties of models authored by digital artists. As shown in the accompanying video, it allows for drastically cutting the time needed to obtain the final result.
Related Geodesic Voxel Binding for Production Character Meshes · Animation Setup Transfer for 3D Characters · Real-Time Skeletal Skinning with Optimized Centers of Rotation · NeuroSkinning: Automatic Skin Binding for Production Characters with Deep Graph Networks
how to read this
- Category
- Method: mix-and-match composition of rigged characters
- Contributions
-
- A technique to compose new animated 3D models by combining parts of existing production-ready rigged, skinned, animated assets
- Mix-and-match operations on skeletons that automatically trigger creation of a new mesh, skinning weights, and animations
- Preserves input meshing quality (quad-dominant, semi-regular), believable skinning deformation, and coherent skeleton motion
- Context
- Builds on the tradition of automatic rigging and skinning of 3D characters (Baran and Popovic, Automatic Rigging and Animation of 3D Characters, 2007), reframing it as reuse and recombination of already-authored assets. Builds on: Automatic Rigging and Animation of 3D Characters
- Correctness
- Demonstrated on videogame-style rigged characters in the accompanying video; results depend on the inputs being well-authored and structurally compatible, so mesh seams, skinning weights, and animation coherence at the joins are the things to scrutinize.
- Clarity
- Fairly accessible; a first pass conveys the operations and intent, with a second pass for the mesh-stitching and weight-transfer details.
- How to read it
- Read once for the skeleton-driven workflow and what is automated; second pass on the mesh blending and skinning-weight transfer at boundaries, and watch the video to judge deformation quality.
Rigging / Skinning
-
Physics-based self-supervised loss recasts implicit integration as optimization, training garment networks without labeled data and two orders of magnitude faster than supervised methods.
abstract
We present a self-supervised method to learn dynamic 3D deformations of garments worn by parametric human bodies. State-of-the-art data-driven approaches to model 3D garment deformations are trained using supervised strategies that require large datasets, usually obtained by expensive physics-based simulation methods or professional multi-camera capture setups. In contrast, we propose a new training scheme that removes the need for ground-truth samples, enabling self-supervised training of dynamic 3D garment deformations. Our key contribution is to realize that physics-based deformation models, traditionally solved on a frame-by-frame basis by implicit integrators, can be recast as an optimization problem. We leverage this optimization-based scheme to formulate a set of physics-based loss terms that can be used to train neural networks without precomputing ground-truth data. This allows us to learn models for interactive garments, including dynamic deformations and fine wrinkles, with a two-orders-of-magnitude speedup in training time compared to state-of-the-art supervised methods.
Related PBNS: Physically Based Neural Simulation for Unsupervised Garment Pose Space Deformation · Neural Cloth Simulation · SwinGar: Spectrum-Inspired Neural Dynamic Deformation for Free-Swinging Garments · A Pixel-Based Framework for Data-Driven Clothing
how to read this
- Category
- Method: self-supervised neural garment deformation
- Contributions
-
- A self-supervised scheme that learns dynamic 3D garment deformation on parametric bodies without ground-truth data
- Recasting implicit-integrator physics as an optimization, yielding physics-based loss terms to train the network directly
- Interactive garments with dynamics and fine wrinkles, with a reported two-orders-of-magnitude training speedup over supervised methods
- Context
- Departs from supervised, simulation- or capture-trained garment models such as Santesteban et al. (Learning-Based Animation of Clothing for Virtual Try-On, 2019) by replacing labeled data with a physics-derived loss. Builds on: Learning-Based Animation of Clothing for Virtual Try-On
- Correctness
- The core assumption is that frame-by-frame implicit integration can be written as an optimization whose terms serve as a training loss; results are demonstrated on garments over parametric human bodies, so behavior outside the trained body and garment space is the main caution.
- Clarity
- Moderately dense; the idea is graspable on a first pass, but the loss derivation from the implicit integrator rewards a second pass.
- How to read it
- Read once for the self-supervised insight (physics as loss); do a second pass on the energy terms and how implicit integration becomes the optimization objective if you want to apply it to other deformables.
CFX / ML Deformation
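The "physics as loss" idea in the entry above can be illustrated on the smallest possible case. The sketch below is not the paper's formulation: it assumes a single particle on a linear spring (the names `m`, `k`, `h`, and `backward_euler_objective` are illustrative only) and checks that minimizing an inertia-plus-potential objective reproduces one backward Euler step.

```python
import numpy as np

# Toy assumptions (not from the paper): one particle, linear spring to the origin.
m, k, h = 1.0, 100.0, 0.1   # mass, spring stiffness, time step
x_t, v_t = 1.0, 0.0         # current position and velocity

# Inertial prediction: where the particle would go with no forces.
y = x_t + h * v_t

def backward_euler_objective(x):
    """Inertia term plus elastic potential; its minimizer is the implicit step."""
    return 0.5 * m / h**2 * (x - y) ** 2 + 0.5 * k * x**2

def objective_grad(x):
    """Derivative of the objective with respect to the new position x."""
    return m / h**2 * (x - y) + k * x

# Minimize by plain gradient descent, as a stand-in for network training.
x = x_t
for _ in range(200):
    x -= 0.004 * objective_grad(x)

# Closed-form backward Euler step for the same linear system.
x_implicit = y / (1.0 + h**2 * k / m)
print(x, x_implicit)
```

In a learned setting the same objective would be evaluated on network-predicted garment vertices and averaged over sampled poses, so no simulated ground truth is needed; the energy terms used in the paper are richer than this single spring.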
- Space Rangers with Cornrows: Methods for Modeling Braids and Curls in Pixar's Groom Pipeline SIGGRAPH Pixar 4 cites
Procedural groom tools for generating braids, curls, and edge hairs along hand-sculpted source curves, creating Lightyear characters' hairstyles.
abstract
This presentation is a debrief of the processes and methods added to Pixar’s groom pipeline to create the hairstyles of Lightyear characters Alisha and Izzy Hawthorne. The processes include novel ways of generating braids, curls, braid partitioning hairs (edge hairs), and graphic shapes populated with hair.
Related Wig Refitting in Pixar's Inside Out 2 · Build Your Own Procedural Grooming Pipeline · Hair and Fur in an Evolving Pipeline · Simulating Rapunzel's Hair in Disney's Tangled
how to read this
- Category
- Production talk / groom pipeline breakdown
- Contributions
- New procedural groom methods for generating braids and curls along hand-sculpted source curves
- Techniques for braid-partitioning (edge) hairs and for populating graphic shapes with hair
- Application to creating the hairstyles of the Lightyear characters Alisha and Izzy Hawthorne
- Context
- Extends Pixar's existing groom pipeline and curly-hair work (Iben et al., Artistic Simulation of Curly Hair, 2013), adding authoring tools for braided and curled styles built on artist-drawn guide curves. Builds on: Artistic Simulation of Curly Hair
- Correctness
- Studio practice, not peer-reviewed; results are production-proven on specific Lightyear characters, so the methods are validated by shipped shots rather than by formal evaluation or comparison.
- Clarity
- Accessible and concrete; a first pass conveys the tools and the artist workflow they support.
- How to read it
- Read once for the procedural-along-curves approach to braids and curls; focus on how artist-sculpted source curves drive generation and on the edge-hair partitioning trick, which is the most transferable idea.
CFX
-
SUPR jointly trains a full-body model and part-specific models from 1.2 million scans, introducing a novel kinematic foot model with contact-aware deformations.
abstract
Statistical 3D shape models of the head, hands, and full body are widely used in computer vision and graphics. Despite their wide use, we show that existing models of the head and hands fail to capture the full range of motion for these parts. Moreover, existing work largely ignores the feet, which are crucial for modeling human movement and have applications in biomechanics, animation, and the footwear industry. The problem is that previous body part models are trained using 3D scans that are isolated to the individual parts. Such data does not capture the full range of motion for such parts, e.g. the motion of the head relative to the neck. Our observation is that full-body scans provide important information about the motion of the body parts. Consequently, we propose a new learning scheme that jointly trains a full-body model and specific part models using a federated dataset of full-body and body-part scans. Specifically, we train an expressive human body model called SUPR (Sparse Unified Part-Based Human Representation), where each joint strictly influences a sparse set of model vertices. The factorized representation enables separating SUPR into an entire suite of body part models. Note that the feet have received little attention and existing 3D body models have highly under-actuated feet.
Related STAR: Sparse Trained Articulated Human Body Regressor · Expressive Body Capture: 3D Hands, Face, and Body from a Single Image · ATLAS: Decoupling Skeletal and Shape Parameters for Expressive Parametric Human Modeling · NIMBLE: A Non-rigid Hand Model with Bones and Muscles
how to read this
- Category
- Method / model: a unified part-based human body model
- Contributions
- SUPR, an expressive body model where each joint strictly influences a sparse set of vertices
- A learning scheme that jointly trains a full-body model and separable part models from a federated dataset of full-body and body-part scans
- A novel kinematic foot model with contact-aware deformations, and the ability to separate SUPR into a suite of body-part models
- Context
- Builds directly on sparse articulated body modeling (Osman et al., STAR, 2020) and addresses the limitation that part-only scans miss the full range of motion that full-body scans reveal. Builds on: STAR: Sparse Trained Articulated Human Body Regressor
- Correctness
- The key claim is that jointly training on full-body plus part scans captures motion (e.g. head relative to neck, foot contact) that isolated-part training misses; it is trained on a large scan corpus, but expressiveness for unusual anatomies or poses outside the captured distribution remains a fair caution.
- Clarity
- Reasonably accessible for readers familiar with parametric body models; a first pass conveys the federated-training idea, and a second pass is needed for the factorization and foot model.
- How to read it
- First pass for the motivation and the federated joint-training scheme; second pass on the sparse joint-to-vertex factorization and the kinematic contact-aware foot model if you use or extend body models like SMPL or STAR.
Skinning / ML Deformation
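To make "each joint strictly influences a sparse set of vertices" concrete, here is a minimal sketch of linear blend skinning with a sparse weight matrix. It is a toy under stated assumptions, not SUPR itself: the rig, the weights, and the translation-only transforms are hypothetical, and the real model learns its weights and adds pose correctives.

```python
import numpy as np

# Hypothetical toy rig: 4 vertices on a line, 2 joints, translation-only transforms.
rest = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [3.0, 0.0, 0.0]])

# Sparse skinning weights (rows: vertices, columns: joints); each row sums to 1.
# Joint 1 influences only vertices 2 and 3.
weights = np.array([
    [1.0, 0.0],
    [1.0, 0.0],
    [0.5, 0.5],
    [0.0, 1.0],
])

def skin(rest, weights, joint_translations):
    """Linear blend skinning restricted to translations for brevity."""
    return rest + weights @ joint_translations

# Move joint 1 upward; joint 0 stays put.
translations = np.array([[0.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
posed = skin(rest, weights, translations)

# Only vertices with a nonzero weight for joint 1 are displaced.
moved = np.any(posed != rest, axis=1)
print(moved)
```

Because the zero entries confine each joint to its own region, the columns for one body part can be cut out together with their vertices, which is the intuition behind separating a full-body model into part models.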
-
Variational transformer encodes text and motion into a joint latent space, generating diverse motion sequences from a single description.
abstract
We address the problem of generating diverse 3D human motions from textual descriptions. This challenging task requires joint modeling of both modalities: understanding and extracting useful human-centric information from the text, and then generating plausible and realistic sequences of human poses. In contrast to most previous work which focuses on generating a single, deterministic motion from a textual description, we design a variational approach that can produce multiple diverse human motions. We propose TEMOS, a text-conditioned generative model leveraging variational autoencoder (VAE) training with human motion data, in combination with a text encoder that produces distribution parameters compatible with the VAE latent space. We show the TEMOS framework can produce both skeleton-based animations as in prior work, as well as more expressive SMPL body motions. We evaluate our approach on the KIT Motion-Language benchmark and, despite being relatively straightforward, demonstrate significant improvements over the state of the art. Code and models are available on our webpage.
Related Action-Conditioned 3D Human Motion Synthesis with Transformer VAE · Executing Your Commands via Motion Diffusion in Latent Space · Generating Diverse and Natural 3D Human Motions from Text · MotionCLIP: Exposing Human Motion Generation to CLIP Space
how to read this
- Category
- Method: text-conditioned human motion generation
- Contributions
- TEMOS, a variational (VAE) generative model that produces multiple diverse 3D human motions from a single textual description
- A text encoder that outputs distribution parameters compatible with the motion VAE latent space, jointly modeling text and motion
- Generation of both skeleton-based animations and more expressive SMPL body motions
- Context
- Extends transformer-VAE motion synthesis (Petrovich et al., ACTOR, 2021) from action-conditioned generation toward free-text conditioning with diverse, non-deterministic output. Builds on: Action-Conditioned 3D Human Motion Synthesis with Transformer VAE
- Correctness
- Evaluated on the KIT Motion-Language benchmark with reported improvements over prior work, using what the authors describe as a relatively straightforward design; generalization beyond that benchmark's vocabulary and motion distribution is the main caveat.
- Clarity
- Accessible to readers with VAE and transformer background; a first pass conveys the joint-latent-space idea, and a second pass is needed for the training objective and diversity mechanism.
- How to read it
- Read once for the text-and-motion shared-latent-space design; second pass on the VAE training and how the text encoder is aligned to the latent space, plus the KIT evaluation protocol.
Motion Synthesis
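The shared-latent-space design in the entry above rests on two standard VAE ingredients: a divergence that pulls the text-encoder distribution toward the motion-encoder distribution, and reparameterized sampling that yields diverse outputs. The sketch below assumes diagonal Gaussians and a KL term as the alignment measure; the encoder outputs are made-up numbers, and the exact loss used by TEMOS is not stated in this entry.

```python
import numpy as np

def kl_diag_gaussians(mu_p, sigma_p, mu_q, sigma_q):
    """KL(p || q) for diagonal Gaussians, summed over latent dimensions."""
    return np.sum(
        np.log(sigma_q / sigma_p)
        + (sigma_p**2 + (mu_p - mu_q) ** 2) / (2.0 * sigma_q**2)
        - 0.5
    )

def sample_latent(mu, sigma, rng):
    """Reparameterized sample z = mu + sigma * eps with eps ~ N(0, I)."""
    return mu + sigma * rng.standard_normal(mu.shape)

rng = np.random.default_rng(0)

# Hypothetical encoder outputs for one text/motion pair (4-D latent).
mu_text, sigma_text = np.array([0.1, -0.2, 0.0, 0.3]), np.full(4, 0.9)
mu_motion, sigma_motion = np.array([0.0, -0.1, 0.1, 0.2]), np.full(4, 1.0)

# Alignment term: small when the two encoders agree on the distribution.
alignment_loss = kl_diag_gaussians(mu_text, sigma_text, mu_motion, sigma_motion)

# Two samples from the same text distribution give two different latents,
# which a shared decoder would turn into two different motions.
z_a = sample_latent(mu_text, sigma_text, rng)
z_b = sample_latent(mu_text, sigma_text, rng)
print(alignment_loss, z_a, z_b)
```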
-
GAMMA decomposes long-term scene-aware human motion into generative body-marker primitives with a policy for perpetual goal-directed navigation.
abstract
Our goal is to populate digital environments, in which digital humans have diverse body shapes, move perpetually, and have plausible body-scene contact. The core challenge is to generate realistic, controllable, and infinitely long motions for diverse 3D bodies. To this end, we propose generative motion primitives via body surface markers, or GAMMA in short. In our solution, we decompose the long-term motion into a time sequence of motion primitives. We exploit body surface markers and conditional variational autoencoder to model each motion primitive, and generate long-term motion by implementing the generative model recursively. To control the motion to reach a goal, we apply a policy network to explore the generative model's latent space and use a tree-based search to preserve the motion quality during testing. Experiments show that our method can produce more realistic and controllable motion than state-of-the-art data-driven methods. With conventional path-finding algorithms, the generated human bodies can realistically move long distances for a long period of time in the scene. Code is released for research purposes at: https://yz-cnsdqz.github.io/eigenmotion/GAMMA/
Related Interactive Character Control with Auto-Regressive Motion Diffusion Models · PFPN: Continuous Control of Physically Simulated Characters using Particle Filtering Policy Network · Composite Motion Learning with Task Control · PDP: Physics-Based Character Animation via Diffusion Policy
how to read this
- Category
- Method: long-term scene-aware human motion generation
- Contributions
- GAMMA, which decomposes long-term motion into a sequence of generative motion primitives modeled with body surface markers and a conditional VAE
- Recursive application of the generative model to produce perpetual, arbitrarily long motion for diverse body shapes
- A policy network plus tree-based search over the latent space for goal-directed control while preserving motion quality
- Context
- Tackles character-scene interaction and navigation in the spirit of learned controllers like Starke et al. (Neural State Machine for Character-Scene Interactions, 2019), using marker-based generative primitives instead of a state machine. Builds on: Neural State Machine for Character-Scene Interactions
- Correctness
- The approach assumes recursively chained motion primitives stay stable and plausible over long horizons, with the tree search guarding quality; it reports more realistic and controllable motion than data-driven baselines, but plausibility of body-scene contact over very long durations is the thing to watch.
- Clarity
- Moderately dense; a first pass conveys the primitive-plus-policy structure, and a second pass is needed for the CVAE and the search procedure.
- How to read it
- Read once for the decomposition into recursive marker-based primitives and the goal-reaching policy; second pass on the tree-search control and how contact and quality are maintained over long sequences.
Motion Synthesis
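The recursive-primitive structure with tree-based pruning described in the entry above can be sketched in a few lines. Everything here is a hypothetical stand-in: `decode_primitive` replaces the marker-based CVAE decoder with a bounded 2D step, random latents replace the policy network, and distance to the goal replaces the paper's quality and goal criteria.

```python
import numpy as np

def decode_primitive(position, z):
    """Stand-in for a CVAE decoder: one short primitive as a bounded 2D step."""
    return position + 0.5 * np.tanh(z)

def rollout_with_tree_search(start, goal, n_primitives=20, branch=8, keep=3, seed=0):
    """Chain primitives recursively, expanding and pruning a small search tree."""
    rng = np.random.default_rng(seed)
    # Each candidate is a list of positions, one per primitive boundary.
    candidates = [[np.asarray(start, dtype=float)]]
    for _ in range(n_primitives):
        expanded = []
        for path in candidates:
            for _ in range(branch):
                z = rng.standard_normal(2)  # latent sample for the next primitive
                expanded.append(path + [decode_primitive(path[-1], z)])
        # Prune: keep only the branches that end closest to the goal.
        expanded.sort(key=lambda p: np.linalg.norm(p[-1] - goal))
        candidates = expanded[:keep]
    return candidates[0]

goal = np.array([5.0, 3.0])
path = rollout_with_tree_search(start=[0.0, 0.0], goal=goal)
print(len(path), np.linalg.norm(path[-1] - goal))
```

Each new primitive is conditioned only on where the previous one ended, which is what lets the rollout continue for as long as needed; the pruning step is where a real system would also score motion quality and body-scene contact.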
-
MICA predicts metric-accurate FLAME face shapes from single images using supervised learning on 2,000+ identities with face recognition features.
abstract
Face reconstruction and tracking is a building block of numerous applications in AR/VR, human-machine interaction, as well as medical applications. Most of these applications rely on a metrically correct prediction of the shape, especially, when the reconstructed subject is put into a metrical context (i.e., when there is a reference object of known size). A metrical reconstruction is also needed for any application that measures distances and dimensions of the subject (e.g., to virtually fit a glasses frame). State-of-the-art methods for face reconstruction from a single image are trained on large 2D image datasets in a self-supervised fashion. However, due to the nature of a perspective projection they are not able to reconstruct the actual face dimensions, and even predicting the average human face outperforms some of these methods in a metrical sense. To learn the actual shape of a face, we argue for a supervised training scheme. Since there exists no large-scale 3D dataset for this task, we annotated and unified small- and medium-scale databases. The resulting unified dataset is still a medium-scale dataset with more than 2k identities and training purely on it would lead to overfitting.
Related EMOCA: Emotion Driven Monocular Face Capture and Animation · I M Avatar: Implicit Morphable Head Avatars from Videos · Neural Head Avatars from Monocular RGB Videos · Learning Neural Parametric Head Models
how to read this
- Category
- Method: metrically accurate single-image face reconstruction
- Contributions
- MICA, which predicts metric-accurate FLAME face shape from a single image
- A supervised training scheme arguing that self-supervised 2D-trained methods cannot recover true face dimensions under perspective projection
- A unified, annotated 3D dataset assembled from small and medium databases (over 2k identities), leveraging face-recognition features
- Context
- Builds on the FLAME face model (Li et al., Learning a Model of Facial Shape and Expression from 4D Scans, 2017) and contrasts itself with self-supervised single-image methods that lack metric scale. Builds on: Learning a Model of Facial Shape and Expression from 4D Scans
- Correctness
- The central premise is that metric shape requires supervised 3D training because perspective projection makes 2D self-supervision scale-ambiguous; trained on a unified medium-scale dataset of 2k+ identities, so coverage of identities and conditions outside that data is the natural limitation, and the authors themselves note overfitting risk from training purely on it.
- Clarity
- Accessible; a first pass conveys the metric-accuracy argument and the dataset strategy, and a second pass is needed for the network and loss details.
- How to read it
- Read once for the metric-vs-self-supervised argument and the dataset unification; second pass on how face-recognition features and the regularization counter overfitting if metric face shape matters to your application.
Facial
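The scale ambiguity behind the metric-accuracy argument above is easy to verify numerically. The sketch below assumes an ideal pinhole camera and three made-up landmark positions: a face scaled by some factor and moved proportionally farther away projects to exactly the same image, so 2D supervision alone cannot tell the two apart.

```python
import numpy as np

def project(points, focal=1.0):
    """Pinhole projection of Nx3 camera-space points to Nx2 image coordinates."""
    return focal * points[:, :2] / points[:, 2:3]

# Hypothetical face landmarks in metres, centred on the origin.
face = np.array([[-0.03, 0.02, 0.00], [0.03, 0.02, 0.00], [0.00, -0.04, 0.02]])

def place(face, scale, depth):
    """Scale the face about its origin and push it to the given camera depth."""
    return face * scale + np.array([0.0, 0.0, depth])

# A normal-sized face at 0.5 m and a 20% larger face at 0.6 m.
small_near = project(place(face, scale=1.0, depth=0.5))
large_far = project(place(face, scale=1.2, depth=0.6))

print(np.allclose(small_near, large_far))
```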
-
Presents Pixar's profile mover and curve-net system deployed on Panda Mei, achieving surface-detail-preserving deformations controlled by sparse Bézier or Catmull-Rom curve networks.
Rigging
-
Animal Logic integrates USD into a large legacy pipeline for simultaneous feature film production, documenting architecture decisions that improved toolchain productivity across character and shot work.
abstract
This paper describes the key steps by which the animation and VFX studio Animal Logic integrated Pixar's Universal Scene Description into a large existing legacy pipeline. It discusses architectural choices, including an entity-fragment composition pattern inspired by the Entity-Component-System pattern, entity domains that map to departmental workflows, technical variants for heavy payload data, and a USD-based packaging and versioning scheme. The authors detail the software systems developed to support these patterns, namely the LEAF, AssetWorkshop, and VirtualBreakdown toolkits, and the phased rollout across productions such as Peter Rabbit and DC League of Super-Pets. The successful migration consolidated many workflows and technology stacks, enabling the simultaneous development of multiple feature films.
Related Forging a New Animation Pipeline with USD · Universal Scene Description: Open Source Release · Combining the Benefits of Nodes and Layers in a USD World · A Deep Dive into Universal Scene Description and Hydra
how to read this
- Category
- Production talk / pipeline architecture breakdown
- Contributions
- Describes integrating Pixar's USD into Animal Logic's large existing legacy pipeline
- Architectural patterns: an entity-fragment composition pattern (inspired by Entity-Component-System), entity domains mapped to departments, technical variants for heavy payloads, and a USD-based packaging and versioning scheme
- Documents the supporting toolkits (LEAF, AssetWorkshop, VirtualBreakdown) and a phased rollout across productions, enabling simultaneous feature-film development
- Context
- Builds on USD's open-source release (Pixar, 2016) and prior pipeline-with-USD experience (Baillet et al., Forging a New Animation Pipeline with USD, 2018), focused on retrofitting USD into a pre-existing studio stack. Builds on: Forging a New Animation Pipeline with USD · Universal Scene Description: Open Source Release
- Correctness
- Studio practice, not peer-reviewed; the architecture is production-proven across named films, so the lessons are concrete but specific to Animal Logic's legacy context and may not transfer directly to a greenfield pipeline.
- Clarity
- Accessible to pipeline and TD readers; a first pass conveys the patterns and rollout, with terminology assuming USD familiarity.
- How to read it
- Read once for the architectural patterns and the migration strategy; focus on the entity-fragment/ECS-inspired composition and the variant-and-versioning scheme as the transferable design lessons rather than the studio-specific toolkit names.
Rigging
-
Zero-shot speech-driven full-body gesture generation using example style references, trained on a diverse high-quality motion dataset.
abstract
We present ZeroEGGS, a neural network framework for speech-driven gesture generation with zero-shot style control by example. This means style can be controlled via only a short example motion clip, even for motion styles unseen during training. Our model uses a variational framework to learn a style embedding, making it easy to modify style through latent space manipulation or blending and scaling of style embeddings. The probabilistic nature of our framework further enables the generation of a variety of outputs given the input, addressing the stochastic nature of gesture motion. In a series of experiments, we first demonstrate the flexibility and generalizability of our model to new speakers and styles. In a user study, we then show that our model outperforms previous state-of-the-art techniques in naturalness of motion, appropriateness for speech, and style portrayal. Finally, we release a high-quality dataset of full-body gesture motion including fingers, with speech, spanning across 19 different styles. Our code and data are publicly available at https://github.com/ubisoft/ubisoft-laforge-ZeroEGGS.
Related Multi-Objective Adversarial Gesture Generation · FaceFormer: Speech-Driven 3D Facial Animation with Transformers · Audio-Driven Facial Animation by Joint End-to-End Learning of Pose and Emotion · MeshTalk: 3D Face Animation from Speech using Cross-Modality Disentanglement
how to read this
- Category
- Method: a speech-driven gesture generation model
- Contributions
- Zero-shot style control of full-body gesture via a single short example motion clip, including styles unseen in training
- A variational style embedding allowing latent manipulation, blending, and scaling, with probabilistic sampling for varied outputs
- Release of a high-quality full-body (with fingers) gesture-plus-speech dataset spanning 19 styles
- Context
- Builds on prior speech-to-gesture synthesis such as Ferstl's Multi-Objective Adversarial Gesture Generation, replacing adversarial training with a variational example-based style framework. Builds on: Multi-Objective Adversarial Gesture Generation
- Correctness
- Naturalness, appropriateness for speech, and style portrayal are assessed via a user study claiming to outperform prior state of the art, so claims rest on perceptual ratings; readers should remember user-study results are subjective and tied to this dataset.
- Clarity
- Accessible; a first pass conveys the example-based style idea, and a second pass is needed for the variational formulation and latent operations.
- How to read it
- Focus first on how the style example is encoded into the latent embedding and mixed with speech; do a second pass on the variational loss if you intend to reproduce or extend the style control.
Motion Synthesis / Facial
2021
61-
Comprehensive survey of 3D morphable face models from the original Basel Face Model to neural variants, covering fitting and applications.
abstract
In this article, we provide a detailed survey of 3D Morphable Face Models over the 20 years since they were first proposed. The challenges in building and applying these models, namely, capture, modeling, image formation, and image analysis, are still active research topics, and we review the state-of-the-art in each of these areas. We also look ahead, identifying unsolved challenges, proposing directions for future research, and highlighting the broad range of current and future applications.
Related Reconstruction of Personalized 3D Face Rigs from Monocular Video · Facial Performance Synthesis using Deformation-Driven Polynomial Displacement Maps · A Morphable Model for the Synthesis of 3D Faces · Face2Face: Real-Time Face Capture and Reenactment of RGB Videos
how to read this
- Category
- Survey: 3D morphable face models, past, present and future
- Contributions
- A detailed survey of 3D Morphable Face Models across the 20 years since they were first proposed
- A state-of-the-art review organized around capture, modeling, image formation, and image analysis
- Identification of unsolved challenges, future research directions, and the broad range of current and future applications
- Context
- Traces the lineage from Blanz and Vetter's original Morphable Model for the Synthesis of 3D Faces through the Basel Face Model to neural variants. Builds on: A Morphable Model for the Synthesis of 3D Faces
- Correctness
- As a survey it synthesizes and frames prior work rather than presenting new results; a reader should treat its taxonomy and open-problem list as the authors' considered viewpoint, useful for orientation rather than as a benchmark.
- Clarity
- Highly accessible as an entry point; a first pass maps the field, and later passes drill into the cited primary works.
- How to read it
- Read it as a map: first pass for the four-stage structure (capture, modeling, image formation, analysis) to locate where a topic sits; revisit specific sections and follow citations for depth.
Facial
-
Graph-network ODE emulator adds vivid secondary dynamics to skinned characters over 30x faster than full FEM simulation with topology-independent inference.
abstract
Fast and light-weight methods for animating 3D characters are desirable in various applications such as computer games. We present a learning-based approach to enhance skinning-based animations of 3D characters with vivid secondary motion effects. We design a neural network that encodes each local patch of a character simulation mesh where the edges implicitly encode the internal forces between the neighboring vertices. The network emulates the ordinary differential equations of the character dynamics, predicting new vertex positions from the current accelerations, velocities and positions. Being a local method, our network is independent of the mesh topology and generalizes to arbitrarily shaped 3D character meshes at test time. We further represent per-vertex constraints and material properties such as stiffness, enabling us to easily adjust the dynamics in different parts of the mesh. We evaluate our method on various character meshes and complex motion sequences. Our method can be over 30 times more efficient than ground-truth physically based simulation, and outperforms alternative solutions that provide fast approximations.
Related Finding Hank · Data-Driven Physics for Human Soft Tissue Animation · Pose-Space Subspace Dynamics · Strain Based Dynamics
how to read this
- Category
- Method: learned emulator for secondary motion of 3D characters
- Contributions
- A neural network that adds vivid secondary-motion dynamics to skinning-based character animation by emulating the character-dynamics ODEs
- A local patch encoding where mesh edges implicitly encode internal forces, predicting new vertex positions from current accelerations, velocities, and positions, making it topology-independent and generalizable to arbitrary meshes
- Per-vertex constraints and material properties (e.g. stiffness) that let the dynamics be tuned across different mesh regions
- Context
- Sits in the lineage of learned deformation approximators for animation (related to Fast and Deep Deformation Approximations by Bailey et al.), here targeting physically inspired secondary dynamics rather than pose-space deformation. Builds on: Fast and Deep Deformation Approximations
- Correctness
- Reported as over 30x more efficient than ground-truth physically based simulation and outperforming fast-approximation alternatives; a reader should keep in mind it emulates rather than solves the dynamics, so it approximates the reference simulation it was trained against.
- Clarity
- Accessible at the local-patch / graph idea; a first pass conveys the emulation concept, and a second pass covers the ODE-style prediction and constraint handling.
- How to read it
- First pass on why a local, topology-independent encoding generalizes across meshes; second pass on the acceleration/velocity/position prediction and material parameters if you plan to integrate it.
ML Deformation / CFX
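Why a local rule is independent of mesh topology can be shown with a small stand-in. In the sketch below a hand-written spring update plays the role of the learned per-patch network (an assumption made purely for illustration; the paper trains a neural network instead), and the helper names `local_step`, `step_mesh`, and `chain` are hypothetical. The same function runs unchanged on chains of different sizes, with per-vertex stiffness and a pinned vertex as the constraint.

```python
import numpy as np

def local_step(i, pos, vel, edges, rest_len, stiffness, h=0.01):
    """Stand-in for the learned patch network: a spring update that only
    looks at vertex i and its direct neighbours (unit mass assumed)."""
    force = np.zeros(3)
    for j in edges[i]:
        d = pos[j] - pos[i]
        length = np.linalg.norm(d)
        force += stiffness[i] * (length - rest_len) * d / length
    new_vel = vel[i] + h * force
    return pos[i] + h * new_vel, new_vel

def step_mesh(pos, vel, edges, rest_len, stiffness, pinned):
    """Apply the local rule to every unconstrained vertex."""
    new_pos, new_vel = pos.copy(), vel.copy()
    for i in range(len(pos)):
        if i not in pinned:  # per-vertex constraint
            new_pos[i], new_vel[i] = local_step(i, pos, vel, edges, rest_len, stiffness)
    return new_pos, new_vel

def chain(n, stretch=1.5):
    """A uniformly stretched chain of n vertices whose rest length is 1."""
    pos = np.array([[stretch * i, 0.0, 0.0] for i in range(n)])
    edges = {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}
    return pos, np.zeros((n, 3)), edges

# The same local rule runs on meshes with different vertex counts.
results = {}
for n in (3, 6):
    pos, vel, edges = chain(n)
    stiffness = np.full(n, 50.0)  # per-vertex material parameter
    results[n] = (pos, step_mesh(pos, vel, edges, 1.0, stiffness, pinned={0})[0])
```

In both chains the free end is pulled back toward its neighbour while interior vertices, whose two springs balance, stay in place; nothing in `local_step` depends on the total vertex count.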
-
Blue Sky Studios reviews three years building a USD-centric layer on Conduit, delivering six short films and documenting how artist feedback shaped the USD character workflow.
abstract
Over the past three years, Blue Sky Studios built a USD-centric layer on top of its next generation pipeline framework, Conduit. This transition involved mapping the legacy Blue Sky workflows into USD constructs. In addition, direct artist feedback during the delivery of six short films provided insights that informed the evolution of the Conduit backend to support these modernized workflows.
Related Procedural Block-Based USD Workflows in Conduit · Achieving and Maintaining Real-Time Rigs · USD at Scale · USD and Scene Interoperability: Demystifying the State of the Art
how to read this
- Category
- Production talk / pipeline retrospective: a USD-centric layer on Conduit
- Contributions
- Reviews three years at Blue Sky Studios building a USD-centric layer atop the Conduit next-generation pipeline framework
- Documents mapping legacy Blue Sky workflows into USD constructs
- Shares insights from delivering six short films and how direct artist feedback drove the evolution of the Conduit backend and the USD character workflow
- Context
- A retrospective tying together the studio's Conduit pipeline framework and Pixar's Universal Scene Description open-source release as the basis of the modernized workflow. Builds on: Conduit: A Modern Pipeline for the Open Source World · Universal Scene Description: Open Source Release
- Correctness
- Studio practice rather than peer-reviewed research; the lessons are production-proven across six short films but reflect one studio's context, so transferability to other pipelines is not guaranteed.
- Clarity
- Accessible as a narrative retrospective; a single read conveys the workflow lessons, with no formal derivations to revisit.
- How to read it
- Read once for the migration lessons and the artist-feedback-to-backend loop; note the specific legacy-to-USD mappings if you are planning a similar pipeline transition.
Rigging
-
End-to-end pipeline for 2D-style multiples and smear geometry on 3D characters, inspired by Looney Tunes, deployed on Boss Baby: Family Business.
abstract
Animating with multiples and smears is a technique of 2D animation dating roughly back to the 1940s, most notably on the Looney Tunes cartoons. We have seen an increase in the use of multiples and smear geometry as some 3D animation becomes more stylized and the computed motion blur is not enough to convey a more exaggerated motion or speed. DreamWorks first began the exploration starting with Peabody and Sherman and has since used different methods throughout the years. For The Boss Baby: Family Business, the DreamWorks A.C.M.E. Multilimb System is an end-to-end pipeline solution for achieving the traditional 2D multiples and smears that puts the control back into the hands of Animation.
Related Group Based Rigging of Realistically Feathered Wings · Sony Imageworks Animation Layout Workflow with Unreal Engine and OpenUSD · The Versatile Rigging of Splat in Strange World · Abstracting Rigging Concepts for a Future Proof Framework Design
how to read this
- Category
- Production talk / pipeline tool: 2D-style multiples and smears on 3D characters
- Contributions
- An end-to-end pipeline (DreamWorks A.C.M.E. Multilimb System) for producing traditional 2D-style multiples and smear geometry on 3D characters
- Puts control of the multiples and smear effect back into animators' hands, targeting exaggerated motion that computed motion blur cannot convey
- Deployed in production on The Boss Baby: Family Business, building on prior DreamWorks explorations since Peabody and Sherman
- Context
- Continues DreamWorks' line of stylized-motion tooling (multiples and smears, a 2D animation technique dating to 1940s Looney Tunes) applied to modern stylized 3D animation.
- Correctness
- Studio practice, not peer-reviewed; the system is production-proven on a shipped feature, so claims are about artist workflow and on-screen results rather than measured generalization.
- Clarity
- Accessible to animation/pipeline readers; a single read conveys what the system does and why, with no formal method to revisit.
- How to read it
- Read once for the workflow design and where artist control sits; focus on how multiples vs smears are authored if you need to build stylized-motion tooling.
Rigging
-
ACTOR uses a Transformer VAE to synthesize variable-length SMPL motion sequences conditioned on action category labels.
abstract
We tackle the problem of action-conditioned generation of realistic and diverse human motion sequences. In contrast to methods that complete, or extend, motion sequences, this task does not require an initial pose or sequence. Here we learn an action-aware latent representation for human motions by training a generative variational autoencoder (VAE). By sampling from this latent space and querying a certain duration through a series of positional encodings, we synthesize variable-length motion sequences conditioned on a categorical action. Specifically, we design a Transformer-based architecture, ACTOR, for encoding and decoding a sequence of parametric SMPL human body models estimated from action recognition datasets. We evaluate our approach on the NTU RGB+D, HumanAct12 and UESTC datasets and show improvements over the state of the art. Furthermore, we present two use cases: improving action recognition through adding our synthesized data to training, and motion denoising. Code and models are available on our project page.
Related TEMOS: Generating Diverse Human Motions from Textual Descriptions · MotionCLIP: Exposing Human Motion Generation to CLIP Space · Executing Your Commands via Motion Diffusion in Latent Space · SKEL-Betweener: a Neural Motion Rig for Interactive Motion Authoring
how to read this
- Category
- Method: action-conditioned generative model for human motion
- Contributions
- A Transformer VAE (ACTOR) that learns an action-aware latent space and decodes variable-length SMPL motion sequences from a categorical action label
- Sampling plus positional-encoding queries to control sequence duration without needing an initial pose or seed sequence
- Two demonstrated use cases: augmenting action-recognition training data and motion denoising
- Context
- Generative human-motion synthesis on parametric SMPL bodies, combining VAE latent learning with a Transformer sequence model and evaluated on action-recognition datasets (NTU RGB+D, HumanAct12, UESTC).
- Correctness
- Validated on NTU RGB+D, HumanAct12 and UESTC with reported improvements over prior work; keep in mind it conditions on coarse action categories (not fine-grained control) and operates on SMPL bodies estimated from action-recognition data, so input-estimate quality bounds the result.
- Clarity
- Accessible at a first pass for the idea; a second pass pays off for the VAE objective and the positional-encoding duration mechanism.
- How to read it
- First pass for the action-conditioned VAE concept and the no-seed-pose framing; second pass on the Transformer encoder/decoder design and loss if you intend to reimplement or extend conditioning.
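The duration mechanism in this entry can be illustrated with a toy, dependency-free sketch: one latent vector is drawn with the reparameterization trick, and the decoder is queried with one sinusoidal positional encoding per requested frame, so the sequence length is set only by how many queries are issued. The function names (`sample_latent`, `duration_queries`) and all dimensions are hypothetical, and the Transformer decoder itself is omitted.

```python
import math
import random


def sample_latent(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def positional_encoding(t, dim):
    """Standard sinusoidal encoding of frame index t (dim must be even)."""
    enc = []
    for i in range(0, dim, 2):
        freq = 1.0 / (10000.0 ** (i / dim))
        enc.append(math.sin(t * freq))
        enc.append(math.cos(t * freq))
    return enc


def duration_queries(num_frames, dim):
    """One query per output frame; the number of queries fixes the duration."""
    return [positional_encoding(t, dim) for t in range(num_frames)]


rng = random.Random(0)
# Hypothetical 8-dimensional latent; in the paper the encoder predicts mu and
# log_var from an action label and a motion sequence.
z = sample_latent(mu=[0.0] * 8, log_var=[0.0] * 8, rng=rng)
queries_short = duration_queries(num_frames=30, dim=8)
queries_long = duration_queries(num_frames=90, dim=8)
# A Transformer decoder (not shown) would attend from each query to z and
# emit one SMPL pose per query, giving 30 or 90 frames from the same latent.
```

The same latent `z` can be decoded at either length, which is why no initial pose or seed sequence is needed.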
Motion Synthesis
-
Adversarial motion prior enabling physics-based characters to reproduce diverse motion styles from unstructured motion clip datasets.
abstract
Synthesizing graceful and life-like behaviors for physically simulated characters has been a fundamental challenge in computer animation. Data-driven methods that leverage motion tracking are a prominent class of techniques for producing high fidelity motions for a wide range of behaviors. However, the effectiveness of these tracking-based methods often hinges on carefully designed objective functions, and when applied to large and diverse motion datasets, these methods require significant additional machinery to select the appropriate motion for the character to track in a given scenario. In this work, we propose to obviate the need to manually design imitation objectives and mechanisms for motion selection by utilizing a fully automated approach based on adversarial imitation learning. High-level task objectives that the character should perform can be specified by relatively simple reward functions, while the low-level style of the character's behaviors can be specified by a dataset of unstructured motion clips, without any explicit clip selection or sequencing. For example, a character traversing an obstacle course might utilize a task-reward that only considers forward progress, while the dataset contains clips of relevant behaviors such as running, jumping, and rolling.
Related Synthesizing Physical Character-Scene Interactions · DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills · DReCon: Data-Driven Responsive Control of Physics-Based Characters · ASE: Large-Scale Reusable Adversarial Skill Embeddings for Physically Simulated Characters
how to read this
- Category
- Method: adversarial motion prior for physics-based character control
- Contributions
- An adversarial imitation-learning prior (AMP) that lets simulated characters reproduce motion styles from an unstructured clip dataset without manual clip selection or sequencing
- Separation of high-level task objectives (simple reward functions) from low-level style (the motion dataset), removing hand-designed imitation objectives
- Context
- Builds on tracking-based physics control such as DeepMimic, replacing per-clip tracking objectives with a GAN-style discriminator that scores motion naturalness. Builds on: DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills
- Correctness
- Demonstrated on physically simulated characters performing task-driven behaviors with learned style; note that adversarial training can be unstable and prone to mode collapse, and style fidelity depends on the coverage of the provided clip dataset.
- Clarity
- Readable conceptually; a second pass is needed for the discriminator formulation and the reward-combination details.
- How to read it
- First pass for the task-reward vs. style-prior split (the core idea); second pass on the discriminator objective and training setup if you plan to apply it to your own characters.
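The task-reward versus style-prior split can be sketched as a weighted sum of a hand-written task reward and a style reward derived from a discriminator score. The clipped quadratic mapping below is a least-squares-GAN-style choice and the equal weights are illustrative; both are assumptions for this sketch and are not stated in the entry above.

```python
def style_reward(disc_score):
    """Map a discriminator score to [0, 1].

    Assumes the discriminator is trained to output +1 for dataset motion
    and -1 for policy motion (least-squares GAN convention).
    """
    return max(0.0, 1.0 - 0.25 * (disc_score - 1.0) ** 2)


def total_reward(task_reward, disc_score, w_task=0.5, w_style=0.5):
    """Combine the task objective with the learned style prior."""
    return w_task * task_reward + w_style * style_reward(disc_score)


# A transition the discriminator judges as dataset-like gets full style reward.
r_natural = total_reward(task_reward=1.0, disc_score=1.0)
# A transition judged as clearly fake gets no style reward at all.
r_unnatural = total_reward(task_reward=1.0, disc_score=-1.0)
```

Because the style term needs only a discriminator score per transition, no clip has to be selected or sequenced by hand.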
Motion Synthesis
-
Introduces neural blend weight fields driven by a skeletal deformation to map observation space to a canonical NeRF, enabling free-viewpoint video of dynamic humans.
abstract
This paper addresses the challenge of reconstructing an animatable human model from a multi-view video. Some recent works have proposed to decompose a non-rigidly deforming scene into a canonical neural radiance field and a set of deformation fields that map observation-space points to the canonical space, thereby enabling them to learn the dynamic scene from images. However, they represent the deformation field as a translational vector field or an SE(3) field, which makes the optimization highly under-constrained. Moreover, these representations cannot be explicitly controlled by input motions. Instead, we introduce neural blend weight fields to produce the deformation fields. Based on the skeleton-driven deformation, blend weight fields are used with 3D human skeletons to generate observation-to-canonical and canonical-to-observation correspondences. Since 3D human skeletons are more observable, they can regularize the learning of deformation fields. Moreover, the learned blend weight fields can be combined with input skeletal motions to generate new deformation fields to animate the human model. Experiments show that our approach significantly outperforms recent human synthesis methods. The code and supplementary materials are available at https://zju3dv.github.io/animatable_nerf/.
Related 3DGS-Avatar: Animatable Avatars via Deformable 3D Gaussian Splatting · Neural Body: Implicit Neural Representations with Structured Latent Codes for Novel View Synthesis of Dynamic Humans · S3: Neural Shape, Skeleton, and Skinning Fields for 3D Human Modeling · SNARF: Differentiable Forward Skinning for Animating Non-Rigid Neural Implicit Shapes
how to read this
- Category
- Method: animatable neural radiance field for dynamic human bodies
- Contributions
- Neural blend weight fields that, combined with a 3D human skeleton, generate observation-to-canonical and canonical-to-observation deformation correspondences
- Skeleton-driven regularization that better constrains the deformation field than translational or SE(3) fields
- Learned blend-weight fields recombined with new skeletal motions to animate the reconstructed human and render free-viewpoint video
- Context
- Extends canonical-NeRF-plus-deformation reconstruction by tying deformation to SMPL/skeleton-based blend skinning, building on SMPL and Neural Body. Builds on: SMPL: A Skinned Multi-Person Linear Model · Neural Body: Implicit Neural Representations with Structured Latent Codes for Novel View Synthesis of Dynamic Humans
- Correctness
- Reported to outperform recent human-synthesis methods from multi-view video; remember it requires multi-view input and an observable 3D skeleton, and reconstruction quality is tied to skeleton-driven deformation assumptions.
- Clarity
- Moderately accessible; a second pass is useful to connect the blend-weight field formulation to classic LBS.
- How to read it
- First pass for how blend-weight fields make the NeRF explicitly pose-controllable; second pass on the observation/canonical mapping math if you work on animatable avatars.
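The link to classic linear blend skinning can be made concrete with a small 2D sketch: bone transforms are blended with per-point weights, the blended map takes a canonical point to observation space, and its inverse takes the observed point back. This is only an illustration under simplifying assumptions: the weights are fixed constants here rather than the output of a learned field, the same weights are reused in both directions, and every name is hypothetical.

```python
import math


def rigid(angle, tx, ty):
    """2D bone transform stored as a 2x3 matrix [R | t]."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, tx], [s, c, ty]]


def blend(transforms, weights):
    """Weighted sum of bone transforms, as in linear blend skinning."""
    return [[sum(w * T[r][c] for T, w in zip(transforms, weights))
             for c in range(3)] for r in range(2)]


def apply(M, p):
    """Apply the affine map M to point p."""
    return (M[0][0] * p[0] + M[0][1] * p[1] + M[0][2],
            M[1][0] * p[0] + M[1][1] * p[1] + M[1][2])


def apply_inverse(M, p):
    """Solve A q + t = p for q, where M = [A | t]."""
    a, b, tx = M[0]
    c, d, ty = M[1]
    det = a * d - b * c
    x, y = p[0] - tx, p[1] - ty
    return ((d * x - b * y) / det, (-c * x + a * y) / det)


bones = [rigid(0.0, 0.0, 0.0), rigid(math.pi / 6, 0.5, 0.0)]
weights = [0.3, 0.7]  # stand-in for the learned blend weight field
M = blend(bones, weights)

x_canonical = (1.0, 0.2)
x_observed = apply(M, x_canonical)          # canonical -> observation
x_recovered = apply_inverse(M, x_observed)  # observation -> canonical
```

Swapping `bones` for a new pose while keeping the weights is what makes the reconstructed model animatable.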
ML Deformation / Skinning
-
Remedy Entertainment details building a custom motion-matching animation system for Control from scratch after middleware became unavailable, covering pipeline, debugging tools, and cinematic transitions.
Motion Synthesis / Rigging
-
Official Autodesk overview of MotionBuilder 2022 new features including Python 3 support, expanded Python API, FCurve quaternion visualization, Character Extension improvements, and Story tool stability fixes.
abstract
This Autodesk overview pitches MotionBuilder as a real-time character animation tool that supports affordable consumer-level motion capture devices and ships with a library of pre-built moves for common animations. It positions MotionBuilder as an addition to a Maya or 3ds Max pipeline, opening up real-time, director-driven production for virtual production, previsualization and performance animation. The tool set covers character rigging, non-linear animation editing and motion capture data manipulation within a real-time 3D engine.
Related How to Retarget Motion Capture in MotionBuilder · Meet MotionMaker: New AI Animation Tool In Maya · Setup Live Link Between Motionbuilder and Unreal Engine 5 Tutorial · MotionBuilder: Essentials Characterization, Retargeting and Baking Animations
how to read this
- Category
- Production talk / product overview (MotionBuilder 2022)
- Contributions
- Overview of MotionBuilder as a real-time, director-driven character animation and mocap tool positioned alongside Maya/3ds Max pipelines
- Highlights 2022 features: Python 3 support, expanded Python API, FCurve quaternion visualization, Character Extension improvements, and Story tool stability fixes
- Context
- Relates to real-time character animation, motion-capture cleanup, and non-linear animation editing within a virtual-production and previsualization pipeline.
- Correctness
- Vendor product pitch, not peer-reviewed; feature claims are promotional and the practical value is workflow and pipeline integration rather than a validated technical result.
- Clarity
- Very accessible; a single pass conveys the feature set and where the tool fits in a pipeline.
- How to read it
- Skim once for the new-feature list and pipeline positioning; only revisit specific features (e.g. the Python 3 API changes) if they affect your own tooling or scripts.
Retargeting / Motion Synthesis
-
Naughty Dog technical overview of The Last of Us Part II character pipeline covering artist-driven shader workflows, gore tools, and live-production optimisations without destructive asset changes.
Skinning / Facial
-
Extends IPC to shells, rods, and particles coupled seamlessly, enabling unified interpenetration-free simulation of cloth, hair, and volumetric bodies together.
abstract
We extend the incremental potential contact (IPC) model [Li et al. 2020a] for contacting elastodynamics to resolve systems composed of codimensional degrees of freedom in arbitrary combination. This enables a unified, interpenetration-free, robust, and stable simulation framework that couples codimension-0, 1, 2, and 3 geometries seamlessly with frictional contact. Extending the IPC model to thin structures poses new challenges in computing strain, modeling thickness and determining collisions. To address these challenges we propose three corresponding contributions. First, we introduce a C2 constitutive barrier model that directly enforces strain limiting as an energy potential while preserving rest state. This provides energetically-consistent strain limiting models (both isotropic and anisotropic) for cloth that enable strict satisfaction of strain-limit inequalities with direct coupling to both elastodynamics and contact via minimization of the incremental potential. Second, to capture the geometric thickness of codimensional domains we extend the IPC model to directly enforce distance offsets. Our treatment imposes a strict guarantee that mid-surfaces (respectively mid-lines) of shells (respectively rods) will not move closer than applied thickness values, even as these thicknesses become characteristically small.
Related Nonlinear Cloth Simulation with Isogeometric Analysis · Incremental Potential Contact: Intersection- and Inversion-free Large-Deformation Dynamics · Strain Based Dynamics · Projective Dynamics: Fusing Constraint Projections for Fast Simulation
how to read this
- Category
- Method: unified codimensional contact simulation (C-IPC)
- Contributions
- Extends IPC to couple codimension-0, 1, 2, and 3 geometries (volumes, shells, rods, particles) seamlessly with frictional contact, interpenetration-free
- A C2 constitutive barrier model enforcing strain limiting as an energy potential while preserving rest state (isotropic and anisotropic)
- Distance-offset enforcement so codimensional mid-surfaces and mid-lines keep a strict geometric thickness during collision
- Context
- Directly extends Incremental Potential Contact (IPC) from volumetric elastodynamics to thin and reduced-dimension structures, enabling cloth, hair, and bodies in one solver. Builds on: Incremental Potential Contact: Intersection- and Inversion-free Large-Deformation Dynamics
- Correctness
- Built on the IPC guarantee of intersection- and inversion-free dynamics extended to shells/rods/particles; strict guarantees come at the cost of barrier-based optimization that is computationally heavy, so interactive use is not the target.
- Clarity
- Dense and math-heavy; a first pass gives the unification claim, but the formulation needs careful second and third passes.
- How to read it
- First pass for what codimensions are unified and the three contributions; reserve a second/third pass for the barrier model and strain-limiting derivations only if you implement or modify a contact solver.
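The barrier idea behind the thickness guarantee can be sketched with an IPC-style log barrier evaluated on the gap left after subtracting a thickness offset: the energy is zero beyond an activation distance and grows without bound as the gap closes. Shifting the distance by the offset is a simplification made for this sketch, and the names `d_hat` and `thickness` are illustrative; the paper's exact formulation may differ.

```python
import math


def barrier(d, d_hat, thickness=0.0):
    """Log-barrier contact energy on the gap (d - thickness).

    Zero when the gap is at least d_hat, smoothly increasing below it,
    and unbounded as the gap approaches zero.
    """
    gap = d - thickness
    if gap <= 0.0:
        raise ValueError("configuration violates the thickness offset")
    if gap >= d_hat:
        return 0.0
    return -((gap - d_hat) ** 2) * math.log(gap / d_hat)


far = barrier(1.0, d_hat=0.1, thickness=0.01)     # inactive: exactly zero
near = barrier(0.05, d_hat=0.1, thickness=0.01)   # active: small penalty
touch = barrier(0.011, d_hat=0.1, thickness=0.01)  # gap nearly closed: larger
```

Because the energy diverges before the offset is reached, a minimizer that starts from a valid state cannot step into a configuration thinner than `thickness`.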
CFX
-
A geometry-conditioned recurrent network with encoder-space optimization preserves self-contacts and prevents interpenetration when retargeting motion across different character bodies.
abstract
This paper introduces a motion retargeting method that preserves self-contacts and prevents interpenetration. Self-contacts, such as when hands touch each other or the torso or the head, are important attributes of human body language and dynamics, yet existing methods do not model or preserve these contacts. Likewise, interpenetration, such as a hand passing into the torso, is a typical artifact of motion estimation methods. The input to our method is a human motion sequence and a target skeleton and character geometry. The method identifies self-contacts and ground contacts in the input motion, and optimizes the motion to apply to the output skeleton, while preserving these contacts and reducing interpenetration. We introduce a novel geometry-conditioned recurrent network with an encoder-space optimization strategy that achieves efficient retargeting while satisfying contact constraints. In experiments, our results quantitatively outperform previous methods and we conduct a user study where our retargeted motions are rated as higher-quality than those produced by recent works. We also show our method generalizes to motion estimated from human videos where we improve over previous works that produce noticeable interpenetration.
Related Geometry-Aware Retargeting for Two-Skinned Characters Interaction · Learning Character-Agnostic Motion for Motion Retargeting in 2D · Normalized Euclidean Distance Matrices for Human Motion Retargeting · Dog Code: Human to Quadruped Embodiment Using Shared Codebooks
how to read this
- Category
- Method: contact-aware motion retargeting across skinned characters
- Contributions
- A motion-retargeting method that detects self-contacts and ground contacts in the input and preserves them while reducing interpenetration on the target body
- A geometry-conditioned recurrent network with an encoder-space optimization strategy to satisfy contact constraints efficiently
- Generalization to motion estimated from human video, reducing the noticeable interpenetration that previous methods produce
- Context
- Builds on deep motion retargeting (e.g. Skeleton-Aware Networks), adding character geometry conditioning and explicit self-contact and interpenetration handling. Builds on: Skeleton-Aware Networks for Deep Motion Retargeting
- Correctness
- Reported to quantitatively outperform prior methods with a user study rating its retargeted motion higher; note that results hinge on reliable contact detection in the input, and the method targets skinned-character retargeting rather than full physical plausibility.
- Clarity
- Accessible at a first pass for the problem and approach; a second pass clarifies the encoder-space optimization.
- How to read it
- First pass for the contact-preservation framing and why geometry conditioning matters; second pass on the network plus encoder-space optimization if you build retargeting pipelines.
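The contact-preservation framing can be sketched in two steps: find pairs of body points that touch in the source frame, then measure how far apart those same pairs end up on the target character. The sketch below uses a handful of named joint positions as a stand-in for mesh geometry, and the threshold, point names and loss form are all hypothetical; the paper works on character geometry and optimizes in a learned encoder space, which is not shown.

```python
import math


def detect_contacts(points, threshold):
    """Return name pairs whose distance is below the contact threshold."""
    names = sorted(points)
    contacts = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if math.dist(points[a], points[b]) < threshold:
                contacts.append((a, b))
    return contacts


def contact_loss(points, contacts):
    """Sum of squared distances over pairs that should stay in contact."""
    return sum(math.dist(points[a], points[b]) ** 2 for a, b in contacts)


# Source frame: the hands touch each other (2 cm apart).
source_frame = {
    "left_hand": (0.0, 1.0, 0.1),
    "right_hand": (0.02, 1.0, 0.1),
    "head": (0.0, 1.7, 0.0),
}
contacts = detect_contacts(source_frame, threshold=0.05)

# Naively retargeted frame on a wider character: the hands drift apart.
target_frame = {
    "left_hand": (-0.1, 1.2, 0.1),
    "right_hand": (0.1, 1.2, 0.1),
    "head": (0.0, 2.0, 0.0),
}
loss = contact_loss(target_frame, contacts)
```

A retargeting optimizer would drive `loss` toward zero on the target while an additional penalty (not sketched) discourages interpenetration.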
Retargeting
-
Epic Games VP of Digital Humans Technology presents MetaHuman Creator, detailing the high-fidelity facial rigging system and workflow for generating realistic real-time digital humans.
abstract
Vladimir Mastilovic, VP of Digital Humans Technology at Epic, presents the MetaHuman Creator, a cloud-streamed browser tool that generates real-time digital human faces from a scanned, data-driven database too large to ship client-side. He explains that the tool computes a full rig of about 700 joints, 700 blend shapes and eight levels of detail in roughly a second, using preset characters and per-region blend spaces plus a direct-manipulation face-IK tool whose end effectors are constrained by real scan data to keep appearances plausible. He details separable low- and high-frequency skin detail that affects texture, geometry and dynamic compression and stretching maps driven by facial muscle contractions, procedurally generated iris color and refraction in the eye shader, and attachable simulated grooms and facial hair. The resulting asset, including geometry, layered textures, joints and blend shapes, exports one-to-one into both Unreal Engine and Maya over an FBX workflow with a live link for animating in Maya while previewing in the engine, and is positioned as a customizable starting point rather than a likeness or scan-replication tool, with MoCap support via ARKit, Faceware, Dynamixyz and others.
Related Interactive Sculpting of Digital Faces Using an Anatomical Modeling Paradigm · Framestore Creatures & Houdini | Framestore | Character FX & Crowds Production Talks · Maya 2020 | Proximity Wrap Deformer · DreamWorks Animation Facial Motion and Deformation System
how to read this
- Category
- Production talk / system breakdown (MetaHuman Creator)
- Contributions
- Demonstrates MetaHuman Creator, a cloud-streamed browser tool that generates real-time digital human faces from a data-driven scan database too large to ship client-side
- Shows a fast pipeline computing a full rig (about 700 joints, 700 blend shapes, eight LODs) in roughly a second, with scan-constrained face-IK direct manipulation
- Details separable low/high-frequency skin detail, muscle-driven compression/stretch maps, procedural iris shading, attachable grooms, and one-to-one FBX export to Unreal and Maya with live link
- Context
- Relates to photorealistic digital-actor work such as the Digital Emily Project, repackaging scan-based facial capture into a real-time, preset-plus-blend-space authoring tool. Builds on: The Digital Emily Project: Achieving a Photorealistic Digital Actor
- Correctness
- Studio/vendor practice, not peer-reviewed; the results are production-proven and shipping, but figures are presenter-stated and the tool is positioned as a plausible starting point, not a likeness or scan-replication system.
- Clarity
- Accessible; a single viewing conveys the workflow and component breakdown.
- How to read it
- Watch once for the rig structure, the scan-constrained face-IK idea, and the export/live-link workflow; revisit specific segments (skin detail layers, eye shader) if relevant to your own facial pipeline.
Facial / Rigging
-
Compresses DDM models into a two-layer LBS representation using continuous pose-space example data to minimize approximation error.
abstract
Direct Delta Mush (DDM) is a high-quality, direct skinning method with a low setup cost. However, its storage and run-time computing cost are relatively high for two reasons: its skinning weights are 4×4 matrices instead of scalars like other direct skinning methods, and its computation requires one 3×3 Singular Value Decomposition per vertex. In this paper, we introduce a compression method that takes a DDM model and splits it into two layers: the first layer is a smaller DDM model that computes a set of virtual bone transformations and the second layer is a Linear Blend Skinning model that computes per-vertex transformations from the output of the first layer. The two-layer model can approximate the deformation of the original DDM model with significantly lower costs. Our main contribution is a novel problem formulation for the DDM compression based on a continuous example-based technique, in which we minimize the compression error on an uncountable set of example poses. This formulation provides an elegant metric for the compression error and simplifies the problem to the common linear matrix factorization. Our formulation also takes into account the skeleton hierarchy of the model, the bind pose, and the range of motions.
Related Direct Delta Mush Skinning and Variants · Two-Layer Sparse Compression of Dense-Weight Blend Skinning · Fast Automatic Skinning Transformations · Bodyopt: A Character Deformation Pipeline for Avatar: The Way of Water
how to read this
- Category
- Method: skinning compression for Direct Delta Mush
- Contributions
- A two-layer compression of a DDM model: a smaller DDM layer computing virtual bone transformations, plus a Linear Blend Skinning layer for per-vertex transforms, cutting storage and run-time cost
- A novel continuous example-based problem formulation that minimizes compression error over an uncountable set of example poses
- Reduction of the problem to standard linear matrix factorization while accounting for the skeleton hierarchy, the bind pose, and the range of motion
- Context
- Compresses the Direct Delta Mush skinning model (and variants), replacing its per-vertex 4x4 weights and per-vertex SVD with a cheaper LBS-backed approximation. Builds on: Direct Delta Mush Skinning and Variants
- Correctness
- Aims to approximate the original DDM deformation at significantly lower cost; it is an approximation, so a reader should weigh the accuracy/cost trade-off and check how the virtual-bone count affects fidelity for their meshes.
- Clarity
- Fairly technical; a first pass gives the two-layer idea, a second pass is needed for the continuous-example formulation.
- How to read it
- First pass for the two-layer DDM-to-LBS structure and why it is cheaper; second pass on the continuous example formulation and matrix factorization if you implement or tune the compression.
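Why the two-layer structure is cheaper can be seen with back-of-the-envelope arithmetic. The sketch below counts stored weight floats and per-frame SVDs for a plain DDM model (one 4×4 weight matrix per vertex-bone influence, one SVD per vertex) and for a two-layer model (a small DDM layer over virtual bones plus scalar LBS weights per vertex). All counts are hypothetical, index storage is ignored, and one SVD per virtual bone in the first layer is an assumption made by analogy with plain DDM.

```python
def ddm_cost(num_vertices, bones_per_vertex):
    """Plain DDM: a 4x4 weight matrix per influence, one SVD per vertex."""
    return {
        "weight_floats": num_vertices * bones_per_vertex * 16,
        "svds_per_frame": num_vertices,
    }


def two_layer_cost(num_vertices, num_virtual_bones,
                   bones_per_virtual, virtual_per_vertex):
    """Two-layer model: small DDM layer, then scalar LBS weights."""
    layer1 = num_virtual_bones * bones_per_virtual * 16  # matrix weights
    layer2 = num_vertices * virtual_per_vertex           # scalar weights
    return {
        "weight_floats": layer1 + layer2,
        "svds_per_frame": num_virtual_bones,  # assumed: one per virtual bone
    }


# Hypothetical character: 20,000 vertices, 8 influences each.
full = ddm_cost(num_vertices=20000, bones_per_vertex=8)
# Hypothetical compression: 200 virtual bones, 4 LBS influences per vertex.
compact = two_layer_cost(num_vertices=20000, num_virtual_bones=200,
                         bones_per_virtual=8, virtual_per_vertex=4)
# full["weight_floats"]    -> 2560000
# compact["weight_floats"] -> 105600
```

The virtual-bone count is the main knob: more virtual bones raise both costs but leave the LBS layer more room to approximate the original deformation.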
Skinning
-
Neural network predicts pose-dependent dynamic wrinkle details for garments directly from skeletal motion without explicit cloth simulation.
abstract
A vital task of the wider digital human effort is the creation of realistic garments on digital avatars, both in the form of characteristic fold patterns and wrinkles in static frames as well as richness of garment dynamics under avatars' motion. The existing workflow of modeling, simulation, and rendering closely replicates the physics behind real garments, but is tedious and requires repeating most of the workflow under changes to characters' motion, camera angle, or garment resizing. Although data-driven solutions exist, they either focus on static scenarios or only handle dynamics of tight garments. We present a solution that, at test time, takes in body joint motion to directly produce realistic dynamic garment image sequences. Specifically, given the target joint motion sequence of an avatar, we propose dynamic neural garments to synthesize plausible dynamic garment appearance from a desired viewpoint. Technically, our solution generates a coarse garment proxy sequence, learns deep dynamic features attached to this template, and neurally renders the features to produce appearance changes such as folds, wrinkles, and silhouettes. We demonstrate generalization behavior to both unseen motion and unseen camera views. Further, our network can be fine-tuned to adapt to new body shapes and/or background images.
Related Motion Guided Deep Dynamic 3D Garments · GarMatNet: A Learning-Based Method for Predicting 3D Garment Mesh with Parameterized Materials · A Pixel-Based Framework for Data-Driven Clothing · Dynamic Deformables: Implementation and Production Practicalities
how to read this
- Category
- Method: neural rendering of pose-dependent dynamic garments
- Contributions
- Dynamic Neural Garments: at test time takes body joint motion and directly produces realistic dynamic garment image sequences from a desired viewpoint, without explicit cloth simulation
- A pipeline that generates a coarse garment proxy sequence, learns deep dynamic features attached to the template, and neurally renders them into folds, wrinkles, and silhouettes
- Demonstrated generalization to unseen motions and unseen camera views, with fine-tuning to adapt to new body shapes and backgrounds
- Context
- Builds on learning-based clothing animation (e.g. virtual try-on), extending data-driven garments from static or tight-clothing cases to dynamic, looser garment appearance. Builds on: Learning-Based Animation of Clothing for Virtual Try-On
- Correctness
- Validated on garment image-sequence synthesis with generalization shown; note that it outputs rendered appearance from a viewpoint (image-space) rather than a simulated 3D cloth mesh, so it is best read as neural rendering, not physical simulation.
- Clarity
- Accessible at a first pass for the goal and pipeline stages; a second pass clarifies the feature-learning and neural-rendering steps.
- How to read it
- First pass for the coarse-proxy then learned-feature then neural-render pipeline and what is image-space vs. geometry; second pass on the dynamic feature representation if you work on neural garment appearance.
CFX / ML Deformation
-
Naughty Dog presented a systemic emotion pipeline spanning rigging, animation, dialogue, and design to deliver up to 20 emotions for 65 characters across 13 languages in gameplay.
Facial / Rigging
-
Scalable FEM-quality muscle simulator handling heterogeneous materials including soft muscles, tendons, and bones without geometric coarsening.
abstract
EMU is an efficient and scalable model to simulate bulk musculoskeletal motion with heterogeneous materials. First, EMU requires no model reductions, or geometric coarsening, thereby producing results visually accurate when compared to an FEM simulation. Second, EMU is efficient and scales much better than state-of-the-art FEM with the number of elements in the mesh, and is more easily parallelizable. Third, EMU can handle heterogeneously stiff meshes with an arbitrary constitutive model, thus allowing it to simulate soft muscles, stiff tendons and even stiffer bones all within one unified system. These three key characteristics of EMU enable us to efficiently orchestrate muscle activated skeletal movements. We demonstrate the efficacy of our approach via a number of examples with tendons, muscles, bones and joints.
Related Simulation of Hand Anatomy Using Medical Imaging · How to Build a Human: Practical Physics-Based Character Animation · Shape Targeting: A Versatile Active Elasticity Constitutive Model · Fast Simulation of Deformable Characters with Articulated Skeletons in Projective Dynamics
how to read this
- Category
- Method: efficient heterogeneous muscle simulation (EMU)
- Contributions
- EMU, a deformation-space muscle simulator that produces FEM-accurate results with no model reduction or geometric coarsening
- Better scaling than state-of-the-art FEM with element count, and more easily parallelizable
- Handles heterogeneously stiff meshes with an arbitrary constitutive model, unifying soft muscle, stiff tendon, and bone in one system
- Context
- Builds on quasistatic FEM flesh and muscle simulation (e.g. Teran et al.), reformulating the problem in deformation space for efficiency and heterogeneous-material handling. Builds on: Robust Quasistatic Finite Elements and Flesh Simulation
- Correctness
- Claims visual accuracy comparable to FEM with better scaling, shown on tendon/muscle/bone/joint examples; as the validation is example-based and visual, a reader should not assume it is a clinically accurate biomechanical model.
- Clarity
- Technical; a first pass conveys the efficiency and heterogeneity claims, while the deformation-space formulation needs a second pass.
- How to read it
- First pass for why deformation space buys scalability and heterogeneous stiffness in one solver; second pass on the formulation and constitutive handling if you implement musculoskeletal simulation.
Muscles
-
Portable real-time rig deformation framework providing production-quality character deformation in interactive and game contexts.
abstract
FIRA is a machine learning pipeline that compresses complex VFX deformation rigs into portable neural network representations for realtime use in previs and virtual production. The system trains neural networks on complex Framestore rigs and deploys them as lightweight models in Maya and Unreal Engine without requiring the full rigging toolchain.
Related NeuroSkinning: Automatic Skin Binding for Production Characters with Deep Graph Networks · Geodesic Voxel Binding for Production Character Meshes · A Neural Network Model for Efficient Musculoskeletal-Driven Skin Deformation · Automatic Rigging and Animation of 3D Characters
how to read this
- Category
- Method / production system: ML compression of deformation rigs for realtime
- Contributions
- A machine learning pipeline that compresses complex VFX deformation rigs into portable neural network representations
- Trains neural networks on Framestore rigs and deploys them as lightweight models
- Runs in Maya and Unreal Engine without requiring the full rigging toolchain, targeting previs and virtual production
- Context
- Aimed at realtime/interactive character deformation, it builds on dependency-graph evaluation work for character animation such as LibEE (Watt et al.). Builds on: LibEE: A Multithreaded Dependency Graph for Character Animation
- Correctness
- Demonstrated on production Framestore rigs deployed to Maya and Unreal; as a learned approximation of an authored rig, fidelity outside the training distribution and the cost of retraining per rig are the practical limits to keep in mind.
- Clarity
- Likely accessible and system-oriented; a first pass conveys the pipeline, a second pass clarifies the training and deployment specifics.
- How to read it
- Read for the architecture of the train-then-deploy pipeline and where it fits in previs/virtual-production; a second pass pays off only if you need the model and integration details.
Rigging / Skinning
-
Resolves frictional contact between deformable elastic objects and smooth implicit surfaces, rather than polygonal meshes, using a moving least squares surface augmented with a local contact kernel.
abstract
This paper resolves frictional contact between deformable elastic objects and smooth implicit surface representations instead of polygonal meshes, which lack a consistent inside-outside partition and a smooth near-surface distance field. The authors augment a moving least squares implicit surface with a local contact kernel and develop a parallel transport approximation that transfers frictional impulses across the deforming surface. The resulting formulation produces robust, artifact-free contact and friction response for elastic solids.
Related An Implicit Frictional Contact Solver for Adaptive Cloth Simulation · Projective Dynamics with Dry Frictional Contact · Anisotropic Elastoplasticity for Cloth, Knit and Hair Frictional Contact · Adaptive Nonlinearity for Collisions in Complex Rod Assemblies
how to read this
- Category
- Method: frictional contact for deformable solids against implicit surfaces
- Contributions
-
- Resolves frictional contact between deformable elastic objects and smooth implicit surfaces rather than polygonal meshes
- Augments a moving least squares implicit surface with a local contact kernel
- Develops a parallel transport approximation to transfer frictional impulses across the deforming surface
- Context
- Part of the physics-based contact and friction lineage for deformables, related to implicit frictional contact solvers such as Li et al.'s adaptive cloth work. Builds on: An Implicit Frictional Contact Solver for Adaptive Cloth Simulation
- Correctness
- The approach relies on a smooth implicit surface providing a consistent inside-outside partition and near-surface distance field; results are shown as robust, artifact-free elastic contact, but the MLS surface representation and the parallel-transport approximation are the assumptions to keep in mind.
- Clarity
- Technically dense (TOG); a first pass conveys the motivation, a second and likely third pass are needed for the formulation.
- How to read it
- Focus first on why implicit surfaces are chosen over meshes; do a careful second/third pass on the contact kernel and parallel-transport derivation if you intend to implement or extend it.
CFX
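The inside-outside argument in the entry above can be made concrete with a minimal sketch, assuming a much simpler setting than the paper's: an analytic sphere stands in for the moving least squares surface, and the local contact kernel and parallel transport step are not reproduced. The function names and the per-point, inelastic response are illustrative assumptions. The signed distance gives an unambiguous penetration test, and its gradient gives the contact normal used to split velocity for a Coulomb friction clamp.

```python
import numpy as np

def sphere_sdf(p, center, radius):
    """Signed distance to a sphere and its unit gradient (outward normal).
    Assumes p is not exactly at the sphere center."""
    d = p - center
    dist = np.linalg.norm(d)
    return dist - radius, d / dist

def resolve_contact(p, v, center, radius, mu):
    """Project a penetrating point to the surface and apply Coulomb friction.
    Illustrative per-point response, not the paper's solver."""
    phi, n = sphere_sdf(p, center, radius)
    if phi >= 0.0:
        return p, v                          # outside: no contact
    p = p - phi * n                          # push out along the gradient
    vn = np.dot(v, n)
    if vn >= 0.0:
        return p, v                          # already separating
    vt = v - vn * n                          # tangential velocity
    jn = -vn                                 # normal impulse per unit mass
    vt_norm = np.linalg.norm(vt)
    if vt_norm <= mu * jn:
        vt = np.zeros_like(vt)               # inside the friction cone: stick
    else:
        vt = vt * (1.0 - mu * jn / vt_norm)  # sliding: reduce tangential speed
    return p, vt                             # normal velocity removed
```

A polygonal mesh offers no such globally consistent `phi`, which is the motivation the abstract gives for switching representations.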
-
Transfers musculature from a reference anatomical model to bodies of different proportions while preserving functionality and producing simulation-ready results.
abstract
We present a novel retargeting algorithm that transfers the musculature of a reference anatomical model to new bodies with different sizes, body proportions, muscle capability, and joint range of motion while preserving the functionality of the original musculature as closely as possible. The geometric configuration and physiological parameters of musculotendon units are estimated and optimized to adapt to new bodies. The range of motion around joints is estimated from a motion capture dataset and edited further for individual models. The retargeted model is simulation‐ready, so we can physically simulate muscle‐actuated motor skills with the model. Our system is capable of generating a wide variety of anatomical bodies that can be simulated to walk, run, jump and dance while maintaining balance under gravity. We will also demonstrate the construction of individualized musculoskeletal models from bi‐planar X‐ray images and medical examination.
Related Scalable Muscle-Actuated Human Simulation and Control · Generative GaitNet · Physical Based Motion Reconstruction From Videos Using Musculoskeletal Model · Learning Locomotion Skills Using DeepRL: Does the Choice of Action Space Matter?
how to read this
- Category
- Method: musculature retargeting for simulation-ready anatomical models
- Contributions
-
- A retargeting algorithm that transfers a reference musculature to bodies of different size, proportion, muscle capability and joint range of motion while preserving functionality
- Estimates and optimizes geometric configuration and physiological parameters of musculotendon units, with range of motion estimated from motion capture
- Produces simulation-ready models, including individualized musculoskeletal models built from bi-planar X-ray images and medical examination
- Context
- Extends muscle-actuated human simulation and control, building directly on Lee et al.'s scalable muscle-actuated simulation. Builds on: Scalable Muscle-Actuated Human Simulation and Control
- Correctness
- Validated by physically simulating retargeted models walking, running, jumping and dancing under gravity; functionality is preserved as closely as possible rather than exactly, and quality depends on the reference model and the motion capture used to estimate range of motion.
- Clarity
- Moderately technical (CGF); a first pass conveys the goal and pipeline, a second pass is needed for the parameter optimization.
- How to read it
- Read for the retargeting pipeline and what simulation-ready means here; second pass on the musculotendon parameter estimation if you work with muscle simulation.
Muscles / Retargeting
- GarMatNet: A Learning-Based Method for Predicting 3D Garment Mesh with Parameterized Materials MIG Academic 3 cites
Two-stream network predicts body-fitted garment mesh deformation conditioned on pose and parameterized fabric material properties.
abstract
Recent progress in learning-based methods of garment mesh generation is resulting in increased efficiency and maintenance of reality during the generation process. However, none of the previous works so far have focused on variations in material types based on a parameterized material parameter under static poses. In this work, we propose a learning-based method, GarMatNet, for predicting garment deformation based on the functions of human poses and garment materials while maintaining detailed garment wrinkles. GarMatNet consists of two components: a generally-fitting network for predicting smoothed garment mesh and a locally-detailed network for adding detailed wrinkles based on smoothed garment mesh. We hypothesize that material properties play an essential role in the deformation of garments. Since the influences of material type are relatively smaller than pose or body shape, we employ linear interpolation among different factors to control deformation. More specifically, we apply a parameterized material space based on the mass-spring model to express the difference between materials and construct a suitable network structure with weight adjustment between material properties and poses.
Related N-Cloth: Predicting 3D Cloth Deformation with Mesh-Based Networks · Motion Guided Deep Dynamic 3D Garments · Learning-Based Animation of Clothing for Virtual Try-On · PBNS: Physically Based Neural Simulation for Unsupervised Garment Pose Space Deformation
how to read this
- Category
- Method: learning-based garment deformation conditioned on material
- Contributions
-
- GarMatNet, a two-stream network predicting garment deformation from human pose and parameterized garment material
- A generally-fitting network for smoothed garment mesh plus a locally-detailed network that adds wrinkles
- A parameterized material space based on a mass-spring model, with linear interpolation to control material-driven deformation
- Context
- Sits in the learning-based garment animation line of work, building on virtual-try-on clothing animation such as Santesteban et al. Builds on: Learning-Based Animation of Clothing for Virtual Try-On
- Correctness
- Operates under static poses and assumes material influence is smaller than pose or body shape, justifying linear interpolation across materials; this interpolation assumption and the static-pose setting are the limitations to keep in mind.
- Clarity
- Accessible (MIG); a first pass conveys the two-stream idea, a second pass clarifies the material parameterization.
- How to read it
- Read for how material is parameterized and injected, and the coarse-to-detail two-network split; second pass if you care about the mass-spring material space.
CFX / ML Deformation
-
Regular-grid GPU cloth at millions of vertices resolving submillimeter wrinkles, exploiting structured memory access for high-resolution character garment simulation.
abstract
In this paper, we study physics-based cloth simulation in a very high resolution setting, presumably at submillimeter levels with millions of vertices, to meet perceptual precision of our human eyes. State-of-the-art simulation techniques, mostly developed for unstructured triangular meshes, can hardly meet this demand due to their large computational costs and memory footprints. We argue that in a very high resolution, it is more plausible to use regular meshes with an underlying grid structure, which can be highly compatible with GPU acceleration like high-resolution images. Based on this idea, we formulate and solve the nonlinear optimization problem for simulating high-resolution wrinkles, by a fast block-based descent method with reduced memory accesses. We also investigate the development of the collision handling component in our system, whose performance benefits greatly from the grid structure. Finally, we explore various issues related to the applications of our system, including initialization for fast convergence and temporal coherence, gathering effects, inflation and stuffing models, and mesh simplification. We can treat our system as a quasistatic wrinkle synthesis tool, run it as a standalone dynamic simulator, or integrate it into a multi-resolution solver as an additional component.
Related Vivace: A Practical Gauss-Seidel Method for Stable Soft Body Dynamics · Clean Cloth Inputs: Removing Character Self-Intersections with Volume Simulation · PBNS: Physically Based Neural Simulation for Unsupervised Garment Pose Space Deformation · Continuum-based Strain Limiting
how to read this
- Category
- Method: GPU cloth simulation for submillimeter wrinkles on regular grids
- Contributions
-
- Physics-based cloth simulation at submillimeter resolution with millions of vertices using regular grid meshes for GPU compatibility
- A fast block-based descent method with reduced memory accesses to solve the nonlinear wrinkle optimization
- A grid-structure-aware collision handling component, plus initialization, temporal coherence, inflation/stuffing and simplification techniques
- Context
- Advances GPU-accelerated cloth simulation, related to contact-aware GPU cloth assembly work such as Tang et al.'s CAMA. Builds on: CAMA: Contact-Aware Matrix Assembly with Unified Collision Handling for GPU-based Cloth Simulation
- Correctness
- Usable as a quasistatic wrinkle synthesis tool, as a standalone dynamic simulator, or as a component of a multi-resolution solver; it trades unstructured triangular meshes for regular grids to exploit structured memory access, and that regular-grid assumption is the key design choice and constraint to keep in mind.
- Clarity
- Technical (SIGGRAPH) but motivated clearly; a first pass conveys the grid-on-GPU argument, a second pass is needed for the solver.
- How to read it
- Focus first on why a regular grid enables this scale on GPUs; second/third pass on the block-based descent solver and collision handling if implementing.
CFX
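To illustrate why the regular grid in the entry above suits image-like GPU processing, here is a small sketch, assuming a plain spring energy rather than the paper's actual cloth model or block-based descent solver. The function names and the `(H, W, 3)` layout are illustrative assumptions. The point is that neighbours are found by index arithmetic (array slicing) instead of an explicit edge list, so memory access stays contiguous.

```python
import numpy as np

def flat_grid(h, w, spacing):
    """Vertex positions of a flat h-by-w cloth grid, shape (h, w, 3)."""
    ii, jj = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    z = np.zeros_like(ii, dtype=float)
    return np.stack([jj * spacing, ii * spacing, z], axis=-1)

def grid_stretch_energy(x, rest, k):
    """Spring energy over the horizontal and vertical edges of the grid.
    Neighbours come from slicing, with no adjacency structure."""
    du = np.linalg.norm(x[:, 1:] - x[:, :-1], axis=-1)  # edges along a row
    dv = np.linalg.norm(x[1:, :] - x[:-1, :], axis=-1)  # edges along a column
    return 0.5 * k * (np.sum((du - rest) ** 2) + np.sum((dv - rest) ** 2))
```

On an unstructured triangle mesh the same energy would need an indexed edge list and scattered reads, which is the cost the abstract argues against at millions of vertices.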
-
Builds a heterogeneous graph over mesh vertices and skeletal bones with a HollowDist metric, predicting production-quality skin weights for arbitrary character topologies.
abstract
Character rigging is universally needed in computer graphics but notoriously laborious. We present a new method, HeterSkinNet, aiming to fully automate such processes and significantly boost productivity. Given a character mesh and skeleton as input, our method builds a heterogeneous graph that treats the mesh vertices and the skeletal bones as nodes of different types and uses graph convolutions to learn their relationships. To tackle the graph heterogeneity, we propose a new graph network convolution operator that transfers information between heterogeneous nodes. The convolution is based on a new distance HollowDist that quantifies the relations between mesh vertices and bones. We show that HeterSkinNet is robust for production characters by providing the ability to incorporate meshes and skeletons with arbitrary topologies and morphologies (e.g., out-of-body bones, disconnected mesh components, etc.). Through exhaustive comparisons, we show that HeterSkinNet outperforms state-of-the-art methods by large margins in terms of rigging accuracy and naturalness. HeterSkinNet provides a solution for effective and robust character rigging.
Related NeuroSkinning: Automatic Skin Binding for Production Characters with Deep Graph Networks · Learning Skeletal Articulations with Neural Blend Shapes · Real-Time Skeletal Skinning with Optimized Centers of Rotation · A Statistical Model of Human Pose and Body Shape
how to read this
- Category
- Method: learning-based automatic skin weight prediction
- Contributions
-
- HeterSkinNet, a heterogeneous graph over mesh vertices and skeletal bones as distinct node types, using graph convolutions to learn their relationships
- A new graph convolution operator that transfers information between heterogeneous nodes
- A new HollowDist distance quantifying vertex-to-bone relations, supporting arbitrary mesh and skeleton topologies including out-of-body bones and disconnected components
- Context
- Continues deep-learning automatic skinning for production characters, building on graph-network skin binding such as Liu et al.'s NeuroSkinning. Builds on: NeuroSkinning: Automatic Skin Binding for Production Characters with Deep Graph Networks
- Correctness
- Reported to be robust on production characters and to outperform prior methods on rigging accuracy and naturalness through comparisons; as with learned skinning, generalization depends on the training characters and the chosen distance metric.
- Clarity
- Accessible (I3D); a first pass conveys the heterogeneous-graph idea, a second pass clarifies the HollowDist metric and convolution operator.
- How to read it
- Read for the heterogeneous-graph formulation and HollowDist; second pass on the convolution operator if you work on automatic rigging.
Skinning / ML Deformation
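For orientation on what a vertex-to-bone distance feeds into, the sketch below computes a naive, non-learned skin-weight baseline. It is an assumption-laden stand-in: plain Euclidean point-to-segment distance replaces HollowDist (whose definition is not given in this entry), and inverse-distance normalisation replaces the graph network. The function names and the `sharpness` parameter are hypothetical.

```python
import numpy as np

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the bone segment a-b."""
    ab = b - a
    t = np.clip(np.dot(p - a, ab) / np.dot(ab, ab), 0.0, 1.0)
    return np.linalg.norm(p - (a + t * ab))

def distance_skin_weights(vertices, bones, sharpness=4.0):
    """Per-vertex weights over bones from inverse distance, rows sum to 1.
    A geometric baseline, not the learned HeterSkinNet prediction."""
    d = np.array([[point_segment_distance(v, a, b) for (a, b) in bones]
                  for v in vertices])
    w = 1.0 / (d + 1e-8) ** sharpness
    return w / w.sum(axis=1, keepdims=True)
```

A purely Euclidean measure like this one binds vertices to whichever bone is spatially closest, which is the kind of baseline a learned vertex-bone relation is meant to improve on for production characters.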
-
Analysis of how rig design decisions directly affect animator workflow, productivity, and the quality of resulting performances.
abstract
This paper presents Rumba, animation software designed to modernize the character animation process through improved rig design. The authors introduce manipulators as separate plug-in objects that control multiple rig controllers without adding rig complexity, and grips painted directly on the character surface. They propose mode-less rigs with modern constraint systems that eliminate traditional IK/FK switches, and introduce offset-less constraints that guarantee animation continuity across constraint activation and deactivation.
Related Sketch-based Motion Editing for Articulated Characters · Premo: Powerful Character Rigging, Fast Animation · Pose and Skeleton-aware Neural IK for Pose and Motion Editing · Rig-Space Physics
how to read this
- Category
- Production system / design analysis: how rig design affects animation
- Contributions
-
- Rumba, animation software designed to modernize character animation through improved rig design
- Manipulators as separate plug-in objects controlling multiple rig controllers without adding rig complexity, plus grips painted on the character surface
- Mode-less rigs with modern constraints that eliminate IK/FK switches, and offset-less constraints that guarantee animation continuity across constraint activation and deactivation
- Context
- Relates to character rigging and animator workflow design; no prior works are cited, so it stands on general rigging and constraint-system practice.
- Correctness
- Presented as a software design and workflow argument (DigiPro) rather than a quantitative study; claims about productivity and continuity are design-driven, so treat them as proposed practice rather than measured results.
- Clarity
- Accessible and workflow-oriented; a single first pass largely conveys the ideas.
- How to read it
- Read once for the design concepts (manipulators, grips, mode-less and offset-less constraints) and how they change the animator's workflow; deep re-reading is rarely necessary.
Rigging
-
Neural network approach to enhance keyframed or procedural quadruped animations with learned natural motion characteristics.
abstract
Creating realistic quadruped animations is challenging. Producing realistic animations using methods such as key-framing is time consuming and requires much artistic expertise. Alternatively, motion capture methods have their own challenges (getting the animal into a studio, attaching motion capture markers, and getting the animal to put on the desired performance) and the resulting animation will still most likely require cleaning up. It would be useful if an animator could provide an initial rough animation and in return be given a corresponding high quality realistic one. To this end, we present a deep-learning approach for the automatic enhancement of quadruped animations. Given an initial animation, possibly lacking the subtle details of true quadruped motion and/or containing small errors, our results show that it is possible for a neural network to learn how to add these subtleties and correct errors to produce an enhanced animation while preserving the semantics and context of the initial animation. Our work also has potential uses in other applications, for example, its ability to be used in real-time means it could form part of a quadruped embodiment system.
Related Dog Code: Human to Quadruped Embodiment Using Shared Codebooks · Motion Retargetting based on Dilated Convolutions and Skeleton-Specific Loss Functions · ReGAIL: Toward Agile Character Control From a Single Reference Motion · Pose and Skeleton-aware Neural IK for Pose and Motion Editing
how to read this
- Category
- Method: neural enhancement of quadruped animation
- Contributions
-
- A deep-learning approach that takes a rough quadruped animation and enhances it into a higher-quality, more realistic one
- Learns to add subtle motion details and correct small errors while preserving the semantics and context of the input
- Runs in real time, suggesting use within a quadruped embodiment system
- Context
- Relates to neural motion enhancement and quadruped animation; no prior works are listed, so it sits generally within learning-based animation cleanup and motion synthesis.
- Correctness
- Validated by showing enhanced animations that add detail and fix errors from rough inputs; quality of the output is bounded by the input animation and the training motion, and 'realistic' here is qualitative.
- Clarity
- Accessible (MIG); a first pass conveys the rough-to-refined idea, a second pass clarifies the network and data.
- How to read it
- Read for the problem framing (enhance rather than generate) and the real-time angle; second pass for the network details if you work on motion cleanup.
Motion Synthesis / ML Deformation
-
Technical tour of MotionBuilder's Story tool as a non-linear mocap editor, covering clip blending, pose library, track management, and the workflow advantages for real-time motion editing and cut-scene assembly.
abstract
This technical tour demonstrates MotionBuilder's Story tool as a non-linear mocap editor across several practical use cases. It shows retiming whole takes or fragments via the scale function and razor cuts (including slow-motion on hit reactions and attacks), repositioning clips with the ghost manipulator to align animation to the world axes or to geometry, and using the reverse time-warp to mirror a strafe motion. The presenter then stitches clips into longer sequences across multiple character tracks, blending between a run and a combo swing by keying track weight and using pass-through so lower tracks let content flow from takes above. It finishes by layering a Mixamo rhino hit reaction in response to the samurai strike, plotting both characters to a new take, and exporting Story clips for reuse as an animation library across other FBX files.
Related MotionBuilder: Essentials Characterization, Retargeting and Baking Animations · Autodesk MotionBuilder 2022 · Setup Live Link Between Motionbuilder and Unreal Engine 5 Tutorial · How to Retarget Motion Capture in MotionBuilder
how to read this
- Category
- Production talk: MotionBuilder Story tool as a non-linear mocap editor
- Contributions
-
- Demonstrates retiming whole takes or fragments via the scale function and razor cuts, including slow-motion on hit reactions
- Shows repositioning clips with the ghost manipulator and a reverse time-warp to mirror motion, then stitching clips across multiple character tracks with weight keying and pass-through blending
- Layers a Mixamo rhino hit reaction onto a samurai strike, plots both characters to a new take, and exports Story clips as a reusable animation library across FBX files
- Context
- A practical workflow tour of MotionBuilder's Story tool for non-linear mocap editing and cut-scene assembly; no academic lineage, it builds on standard MotionBuilder practice.
- Correctness
- Studio/tool practice, not peer-reviewed; the techniques are demonstrated workflows shown to work in the tool rather than validated results.
- Clarity
- Highly accessible and hands-on; a single watch conveys the techniques.
- How to read it
- Watch once with MotionBuilder open and follow along on a sample take; revisit specific segments (blending, time-warp, export) only as needed for your own edits.
Motion Synthesis / Retargeting
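The two core operations in the entry above, retiming a clip by a scale factor and blending between clips by keying a weight, can be sketched outside the tool. This is a generic illustration, not MotionBuilder's API: the function names are hypothetical, clips are plain `(frames, channels)` arrays, and channels are interpolated linearly, which is only reasonable for positions or small-angle rotation curves.

```python
import numpy as np

def retime(clip, scale):
    """Resample a (frames, channels) clip; scale 2.0 plays it at half speed."""
    n = clip.shape[0]
    m = int(round((n - 1) * scale)) + 1          # new frame count
    src = np.linspace(0.0, n - 1, m)             # source time per new frame
    frames = np.arange(n)
    return np.stack([np.interp(src, frames, clip[:, c])
                     for c in range(clip.shape[1])], axis=1)

def crossfade(clip_a, clip_b, overlap):
    """Blend the tail of clip_a into the head of clip_b.
    Requires 1 <= overlap <= min(len(clip_a), len(clip_b))."""
    w = np.linspace(0.0, 1.0, overlap)[:, None]  # keyed weight ramping 0 -> 1
    mixed = (1.0 - w) * clip_a[-overlap:] + w * clip_b[:overlap]
    return np.concatenate([clip_a[:-overlap], mixed, clip_b[overlap:]], axis=0)
```

In the tool the weight ramp is an animatable track property rather than a fixed linear ramp, so the blend curve can be shaped by the animator.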
-
First deep implicit 3DMM of full heads including hair, using signed distance functions and disentangled geometry and color latent spaces.
abstract
We present the first deep implicit 3D morphable model (i3DMM) of full heads. Unlike earlier morphable face models it not only captures identity-specific geometry, texture, and expressions of the frontal face, but also models the entire head, including hair. We collect a new dataset consisting of 64 people with different expressions and hairstyles to train i3DMM. Our approach has the following favorable properties: (i) It is the first full head morphable model that includes hair. (ii) In contrast to mesh-based models it can be trained on merely rigidly aligned scans, without requiring difficult non-rigid registration. (iii) We design a novel architecture to decouple the shape model into an implicit reference shape and a deformation of this reference shape. With that, dense correspondences between shapes can be learned implicitly. (iv) This architecture allows us to semantically disentangle the geometry and color components, as color is learned in the reference space. Geometry is further disentangled as identity, expressions, and hairstyle, while color is disentangled as identity and hairstyle components. We show the merits of i3DMM using ablation studies, comparisons to state-of-the-art models, and applications such as semantic head editing and texture transfer. We will make our model publicly available.
Related Learning Neural Parametric Head Models · Learning a Model of Facial Shape and Expression from 4D Scans · 3D Morphable Face Models: Past, Present and Future · EMOCA: Emotion Driven Monocular Face Capture and Animation
how to read this
- Category
- Method / model: deep implicit 3D morphable model of full heads
- Contributions
-
- The first deep implicit 3D morphable model of full heads, including hair, represented with signed distance functions
- An architecture that decouples shape into an implicit reference shape plus a deformation, learning dense correspondences implicitly and training on merely rigidly aligned scans
- Semantic disentanglement of geometry (identity, expression, hairstyle) and color (identity, hairstyle), supported by a new 64-person dataset
- Context
- Extends the morphable-model line to implicit representations, building on the classic mesh-based morphable face model of Blanz and Vetter. Builds on: A Morphable Model for the Synthesis of 3D Faces
- Correctness
- Trained on a new dataset of 64 people with varied expressions and hairstyles and supported by ablations and comparisons; the modest dataset size and the implicit/SDF representation are the scope limits a reader should keep in mind.
- Clarity
- Technical (CVPR) but well-motivated; a first pass conveys the model and disentanglement, a second pass is needed for the architecture.
- How to read it
- Read for the reference-shape-plus-deformation design and the geometry/color disentanglement; second pass on the architecture and dataset if you build face/head models.
Facial
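The reference-shape-plus-deformation design in the entry above can be illustrated with a toy composition, assuming analytic stand-ins for the learned networks: a unit sphere plays the reference shape and an anisotropic scaling plays the per-instance deformation. All names here are hypothetical. Each instance is evaluated by warping the query point into reference space and reading the reference field there, so points on different instances that warp to the same reference location are in correspondence.

```python
import numpy as np

def reference_sdf(x):
    """Stand-in reference shape: signed distance to a unit sphere."""
    return np.linalg.norm(x, axis=-1) - 1.0

def deformation(x, latent):
    """Stand-in per-instance warp: offset taking x to x / latent.
    In the paper this role is played by a learned network."""
    return x / latent - x

def instance_sdf(x, latent):
    """Instance field = reference field read at the warped point.
    The zero level set is exact; other values are not true distances."""
    return reference_sdf(x + deformation(x, latent))
```

With `latent = [2, 1, 1]` the zero level set is an ellipsoid stretched along x, while any attribute stored in reference space (color, in the paper) would be shared across instances through the same warp.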
-
Volumetric segmentation approach assigns every point in a character volume to muscle, fat, or bone, with interactive muscle-curve-driven authoring tools.
abstract
We present a new approach for modelling musculoskeletal anatomy. Unlike previous methods, we do not model individual muscle shapes as geometric primitives (polygonal meshes, NURBS etc.). Instead, we adopt a volumetric segmentation approach where every point in our volume is assigned to a muscle, fat, or bone tissue. We provide an interactive modelling tool where the user controls the segmentation via muscle curves and we visualize the muscle shapes using volumetric rendering. Muscle curves enable intuitive yet powerful control over the muscle shapes. This representation allows us to automatically handle intersections between different tissues (muscle-muscle, muscle-bone, and muscle-skin) during the modelling and automates computation of muscle fiber fields. We further introduce a novel algorithm for converting the volumetric muscle representation into tetrahedral or surface geometry for use in downstream tasks. Additionally, we introduce an interactive skeleton authoring tool that allows the users to create skeletal anatomy starting from only a skin mesh using a library of bone parts.
Related Hand Modeling and Simulation Using Stabilized Magnetic Resonance Imaging · Building and Animating User-Specific Volumetric Face Rigs · Finding Hank · Smeat: ADMM Based Tools for Character Deformation
how to read this
- Category
- Method: interactive volumetric anatomy modelling
- Contributions
-
- A volumetric segmentation representation that assigns every point in the character volume to muscle, fat, or bone rather than modelling muscles as separate geometric primitives
- Interactive muscle-curve authoring with volumetric rendering that automatically handles tissue intersections and computes muscle fiber fields
- An algorithm to convert the volumetric representation into tetrahedral or surface geometry, plus a skeleton authoring tool that builds bones from a skin mesh using a parts library
- Context
- Relates to physics-based anatomical character modelling, building on Kadlecek et al.'s work on reconstructing personalized anatomical models for body animation, but replaces per-muscle geometric primitives with a unified volumetric segmentation. Builds on: Reconstructing Personalized Anatomical Models for Physics-based Body Animation
- Correctness
- The approach is presented as an interactive authoring tool, so its value rests on artist usability and the fidelity of the curve-driven segmentation rather than on a quantitative ground-truth comparison; readers should note results are demonstrated on authored examples, not validated against medical anatomy.
- Clarity
- Accessible at a conceptual level; a first pass conveys the representation and tooling, a second pass is needed for the segmentation and geometry-conversion algorithm.
- How to read it
- Focus on the volumetric-vs-primitive framing and the muscle-curve control in pass one; do a second pass on the conversion-to-tet/surface algorithm and fiber-field computation if you plan to feed downstream simulation.
Muscles / Rigging
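Why a volumetric segmentation removes tissue intersections by construction can be seen in a small sketch, assuming a deliberately crude rule: each sample point takes the label of the nearest muscle curve within a fixed radius, and otherwise counts as fat. The radius rule, the `-1` fat label, and the function names are illustrative assumptions; the paper's actual curve-driven segmentation and its bone handling are not reproduced here.

```python
import numpy as np

def polyline_distance(p, curve):
    """Shortest distance from point p to a polyline given as an (N, 3) array."""
    best = np.inf
    for a, b in zip(curve[:-1], curve[1:]):
        ab = b - a
        t = np.clip(np.dot(p - a, ab) / np.dot(ab, ab), 0.0, 1.0)
        best = min(best, np.linalg.norm(p - (a + t * ab)))
    return best

def segment_points(points, muscle_curves, radius):
    """Label each point with the index of its nearest muscle curve,
    or -1 (treated as fat) when no curve lies within `radius`."""
    labels = np.full(len(points), -1, dtype=int)
    for i, p in enumerate(points):
        d = [polyline_distance(p, c) for c in muscle_curves]
        j = int(np.argmin(d))
        if d[j] <= radius:
            labels[i] = j
    return labels
```

Because every point carries exactly one label, two muscles can never occupy the same location, which is the property that per-muscle meshes or NURBS have to enforce with explicit collision handling.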
-
Unified deformation-based framework preserves spatial relationships among multiple characters and environment during real-time interactive retargeting.
abstract
A motion retargeting process is necessary as the body size and proportion of the actors are generally different from those of the target characters. However, the original spatial relationship between the multiple characters and the environment is easily broken when using previous motion retargeting methods, which are generally performed for each character independently. Therefore, time‐consuming manual adjustments by animators are usually required to obtain satisfactory results. To address these issues, we present a novel multicharacter motion retargeting method that preserves various types of spatial relationships between characters and environments. We establish a unified deformation‐based framework for the motion retargeting of multiple characters (more than two) or nonhuman characters with complex interactions. Also, an interactive motion editing interface with immediate feedback to the user is provided. We experimentally show that our method achieves a speedup when compared with previous motion retargeting methods.
Related Normalized Euclidean Distance Matrices for Human Motion Retargeting · Dog Code: Human to Quadruped Embodiment Using Shared Codebooks · Motion Retargeting for Crowd Simulation · Retargeting Motion to New Characters
how to read this
- Category
- Method: multi-character motion retargeting
- Contributions
-
- A unified deformation-based framework that retargets motion for multiple (more than two) or nonhuman characters while preserving spatial relationships between characters and environment
- An interactive motion editing interface with immediate feedback to the animator
- A reported speedup over previous retargeting methods that operate on each character independently
- Context
- Builds on the classic motion-retargeting lineage, notably Gleicher's Retargeting Motion to New Characters, extending it from single-character constraint solving to joint multi-character interaction preservation. Builds on: Retargeting Motion to New Characters
- Correctness
- The key assumption is that spatial relationships (contacts, relative poses) should be preserved jointly rather than per character; validation appears experimental with a speed comparison, so readers should treat the quality claims as demonstrated on the authors' interaction examples rather than a broad benchmark.
- Clarity
- Accessible problem statement; a first pass conveys the goal and interface, a second pass clarifies how the deformation framework encodes the relationships.
- How to read it
- Read pass one for the spatial-relationship problem and the unified deformation idea; a second pass pays off if you need the constraint formulation behind the interactive solver.
Retargeting
-
Comprehensive KineFX masterclass for Houdini users, covering procedural rigging concepts, skeleton workflows, constraint systems, and animation retargeting in the SOP context.
Rigging / Retargeting / Skinning
-
DECA reconstructs a detailed animatable 3D face from a single image using a detail-consistency loss to disentangle person-specific details from expression-dependent wrinkles.
abstract
While current monocular 3D face reconstruction methods can recover fine geometric details, they suffer several limitations. Some methods produce faces that cannot be realistically animated because they do not model how wrinkles vary with expression. Other methods are trained on high-quality face scans and do not generalize well to in-the-wild images. We present the first approach that regresses 3D face shape and animatable details that are specific to an individual but change with expression. Our model, DECA (Detailed Expression Capture and Animation), is trained to robustly produce a UV displacement map from a low-dimensional latent representation that consists of person-specific detail parameters and generic expression parameters, while a regressor is trained to predict detail, shape, albedo, expression, pose and illumination parameters from a single image. To enable this, we introduce a novel detail-consistency loss that disentangles person-specific details from expression-dependent wrinkles. This disentanglement allows us to synthesize realistic person-specific wrinkles by controlling expression parameters while keeping person-specific details unchanged. DECA is learned from in-the-wild images with no paired 3D supervision and achieves state-of-the-art shape reconstruction accuracy on two benchmarks.
Related Animatomy: An Animator-Centric, Anatomically Inspired System for 3D Facial Modeling, Animation and Transfer · Neural Head Avatars from Monocular RGB Videos · FLARE: Fast Learning of Animatable and Relightable Mesh Avatars · 3D Morphable Face Models: Past, Present and Future
How to read this
- Category
- Method: learned animatable detailed 3D face model
- Contributions
- DECA regresses 3D face shape plus animatable, person-specific details from a single image, with details that change with expression
- A regressor predicting detail, shape, albedo, expression, pose, and illumination parameters and a UV displacement map from a low-dimensional latent
- A novel detail-consistency loss that disentangles person-specific details from expression-dependent wrinkles, learned from in-the-wild images
- Context
- Builds on the FLAME face model (Li et al., Learning a Model of Facial Shape and Expression from 4D Scans) as its statistical shape basis, extending monocular reconstruction toward expression-driven wrinkle synthesis. Builds on: Learning a Model of Facial Shape and Expression from 4D Scans
- Correctness
- Central assumption is that person-specific detail and expression-dependent wrinkles can be separated via the consistency loss; trained on in-the-wild images for generalization, but readers should remember reconstructed detail is a learned regression, not a measured scan, so fidelity on extreme poses or unusual subjects is not guaranteed.
- Clarity
- Reasonably accessible; a first pass conveys the disentanglement idea, a second pass is needed for the loss formulation and parameter regression.
- How to read it
- Focus on the detail-consistency loss and the shape/detail split in pass one; a second pass on the training setup and loss terms pays off if you intend to reproduce or fine-tune the model.
Facial
-
Neural blend shapes learned jointly with skeleton articulations, producing pose-dependent non-linear deformations from a single neutral-pose mesh.
Abstract
Animating a newly designed character using motion capture (mocap) data is a long-standing problem in computer animation. A key consideration is the skeletal structure that should correspond to the available mocap data, and the shape deformation in the joint regions, which often requires a tailored, pose-specific refinement. In this work, we develop a neural technique for articulating 3D characters using enveloping with a pre-defined skeletal structure which produces high quality pose dependent deformations. Our framework learns to rig and skin characters with the same articulation structure (e.g., bipeds or quadrupeds), and builds the desired skeleton hierarchy into the network architecture. Furthermore, we propose neural blend shapes - a set of corrective pose-dependent shapes which improve the deformation quality in the joint regions in order to address the notorious artifacts resulting from standard rigging and skinning. Our system estimates neural blend shapes for input meshes with arbitrary connectivity, as well as weighting coefficients which are conditioned on the input joint rotations. Unlike recent deep learning techniques which supervise the network with ground-truth rigging and skinning parameters, our approach does not assume that the training data has a specific underlying deformation model.
Related NeuroSkinning: Automatic Skin Binding for Production Characters with Deep Graph Networks · Real-Time Deformation with Coupled Cages and Skeletons · HeterSkinNet: A Heterogeneous Network for Skin Weights Prediction · S3: Neural Shape, Skeleton, and Skinning Fields for 3D Human Modeling
How to read this
- Category
- Method: learned rigging and skinning with neural blend shapes
- Contributions
- A neural technique that rigs and skins characters of a shared articulation structure (bipeds, quadrupeds) with a predefined skeleton baked into the network architecture
- Neural blend shapes, a set of corrective pose-dependent shapes that improve joint-region deformation and reduce standard skinning artifacts
- Estimation of blend shapes and weighting coefficients conditioned on joint rotations for meshes with arbitrary connectivity
- Context
- Sits in the learned-rigging and pose-dependent-deformation lineage and relates to parametric body modelling such as SMPL (Loper et al.), which similarly uses pose-corrective shapes on a skinned model. Builds on: SMPL: A Skinned Multi-Person Linear Model
- Correctness
- Assumes a fixed articulation structure and a single neutral-pose input; the method targets joint-region artifact reduction, so readers should note its scope is characters matching the trained skeleton class rather than arbitrary topologies or skeletons.
- Clarity
- Accessible motivation (rigging/skinning artifacts); a first pass conveys the neural-blend-shape idea, a second pass clarifies the architecture and conditioning.
- How to read it
- Read pass one for the rig-and-skin-plus-corrective concept; do a second pass on the network architecture and how blend shapes are conditioned on rotations if you want to implement or compare against it.
Skinning / ML Deformation
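The pose-dependent corrective idea in the entry above can be sketched in a few lines. This is a minimal 2D illustration, not the paper's network: it assumes the corrective is added in the rest pose before linear blend skinning (an SMPL-style convention), and `corrective_weight`, the elbow position, and the offset values are hypothetical stand-ins for what the method would learn.

```python
import math

def rotate(point, angle, pivot):
    """Rotate a 2D point about `pivot` by `angle` radians."""
    x, y = point[0] - pivot[0], point[1] - pivot[1]
    c, s = math.cos(angle), math.sin(angle)
    return (pivot[0] + c * x - s * y, pivot[1] + s * x + c * y)

def corrective_weight(angle):
    """Hypothetical stand-in for a learned coefficient: 0 at rest, 1 at a 90-degree bend."""
    return min(1.0, abs(angle) / (math.pi / 2))

def skin_vertex(rest, w_child, corrective, angle, pivot):
    """Two-bone linear blend skinning (parent fixed, child rotated) plus a corrective."""
    a = corrective_weight(angle)
    # Assumption: the corrective is applied in the rest pose, then the result is skinned.
    shaped = (rest[0] + a * corrective[0], rest[1] + a * corrective[1])
    moved = rotate(shaped, angle, pivot)
    w_parent = 1.0 - w_child
    return (w_parent * shaped[0] + w_child * moved[0],
            w_parent * shaped[1] + w_child * moved[1])

elbow = (1.0, 0.0)
rest_vertex = (1.0, 0.1)   # skin vertex sitting just above the joint
outward = (0.0, 0.05)      # made-up corrective offset pushing the skin outward
bend = math.pi / 2

plain = skin_vertex(rest_vertex, 0.5, (0.0, 0.0), bend, elbow)
fixed = skin_vertex(rest_vertex, 0.5, outward, bend, elbow)
print("plain LBS distance to joint:", math.dist(plain, elbow))
print("with corrective:            ", math.dist(fixed, elbow))
```

With plain blending the vertex collapses toward the joint at the bend, which is the familiar volume-loss artifact; the corrective pushes it back out, and at zero rotation it has no effect.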
-
NetEase presents a deep learning system that generates full-body NPC animation, including lips, facial expression, head rotation, and gestures, from speech in under 500 ms, eliminating manual mocap processing.
Motion Synthesis / Facial
-
Senior Product Owner Will Telford presents the Maya 2022 rigging updates including Component Tags, Deformer Falloffs, and topology-independent procedural workflows for character deformation.
Abstract
Autodesk Senior Product Owner Will Telford walks through the rigging additions in Maya 2022, centered on moving deformation toward procedural, topology-independent workflows. He demonstrates Component Tags as named component collections stored on the shape that drive deformer membership through Boolean expressions and replace group ID, group part, and deformer set nodes, plus their use in proximity wrap bind tags. He covers Deformer Falloffs as reusable node networks for weighting, including primitive (sphere and plane), blend, and paintable component falloffs with layered non-destructive weight painting, and transfer falloffs that remap weights across changed topology. He also introduces the GPU-accelerated solidify and morph deformers, the latter using component match lookup tables and surface-relative spaces to do partial deformation that offloads work the blend shape previously handled.
Related Maya 2017 Update 3: Tension Deformer and Bake Deformer Tool · Maya 2020 | Proximity Wrap Deformer · Speed Up Animation Workflows With Maya's ML Deformer, Powered by Autodesk AI · Empowering rigs using Offset Parent Matrix [MAYA 2020]
How to read this
- Category
- Production talk: Maya 2022 rigging features walkthrough
- Contributions
- Demonstrates Component Tags, named component collections stored on the shape that drive deformer membership via Boolean expressions and replace group ID, group part, and deformer set nodes
- Shows Deformer Falloffs as reusable node networks for weighting (primitive, blend, paintable, and topology-remapping transfer falloffs) with layered non-destructive painting
- Introduces GPU-accelerated solidify and morph deformers, with morph using component match lookup tables and surface-relative spaces for partial deformation that offloads work from blend shapes
- Context
- A product update talk continuing Autodesk's rigging-tool lineage (following New Character Rigging and Animation Tools in Maya), pushing deformation toward procedural, topology-independent workflows. Builds on: New Character Rigging and Animation Tools in Maya
- Correctness
- Vendor product demonstration, not peer-reviewed; the workflows are production-oriented and shown by the product owner, so readers should treat capabilities and benefits as presented rather than independently benchmarked.
- Clarity
- Highly accessible to riggers; a single viewing conveys the features, with hands-on testing in Maya being the real second pass.
- How to read it
- Watch for which legacy nodes each feature replaces (Component Tags vs group ID/deformer sets, morph vs blend shape); revisit specific segments when you actually rebuild a rig to use the procedural, topology-independent path.
Rigging / Skinning
-
This method animates yarn-level cloth geometry on top of an underlying deforming triangle mesh in a mechanics-aware way.
Abstract
This method animates yarn-level cloth geometry on top of an underlying deforming triangle mesh in a mechanics-aware way. Precomputed yarn geometry is tiled over each triangle in material space, and triangle strains drive a database lookup that applies the appropriate material displacement to produce deformed yarn geometry, reproducing effects such as knit loops tightening under stretch. Combined with precomputed or real-time mesh simulation, it animates yarn-level knitted and woven cloth in real time at large scales.
Related Homogenized Yarn-Level Cloth · Mixing Yarns and Triangles in Cloth Simulation · Cloth and Skin Deformation with a Triangle Mesh Based Convolutional Neural Network · Yarn-Level Simulation of Woven Cloth
How to read this
- Category
- Method: mechanics-aware yarn-level cloth geometry
- Contributions
- Animates yarn-level cloth geometry on top of an underlying deforming triangle mesh in a mechanics-aware way
- Tiles precomputed yarn geometry per triangle in material space and uses triangle strains to drive a database lookup that applies the appropriate material displacement
- Reproduces yarn-scale effects such as knit loops tightening under stretch and runs in real time at large scales when combined with mesh simulation
- Context
- Extends yarn-level cloth modelling (e.g., Cirio et al.'s yarn-level simulation of woven cloth) by decoupling expensive yarn mechanics into a precomputed, strain-indexed displacement applied over a cheaper sheet simulation. Builds on: Yarn-Level Simulation of Woven Cloth
- Correctness
- The core assumption is that local yarn response can be captured by a strain-to-displacement database tiled per triangle; this trades full yarn-level dynamics for a data-driven approximation, so readers should expect it to capture characteristic deformations rather than exact contact-resolved yarn physics.
- Clarity
- Conceptually clear (geometry on a deforming mesh); a first pass conveys the tiling-plus-lookup idea, a second pass clarifies the material-space mapping and database construction.
- How to read it
- Focus on the precompute-then-lookup pipeline and the material-space tiling in pass one; a second pass on how strains index the database pays off if you care about reproducing the mechanics-aware detail.
CFX
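The precompute-then-lookup pipeline in the yarn entry above can be illustrated with a toy strain-indexed table. This is a sketch under strong simplifications: strain is reduced to a single stretch ratio along one edge, and the table (`STRETCHES`, `LOOP_HEIGHTS`) holds made-up values rather than anything from the paper's database.

```python
import math
from bisect import bisect_right

# Hypothetical precomputed table: stretch ratio -> loop height of one knit yarn tile.
# Values are invented for illustration; loops flatten as the tile is stretched.
STRETCHES = [0.8, 1.0, 1.2, 1.4]
LOOP_HEIGHTS = [0.30, 0.25, 0.18, 0.10]

def stretch_ratio(rest_a, rest_b, cur_a, cur_b):
    """Current over rest edge length: a 1D stand-in for the full triangle strain."""
    return math.dist(cur_a, cur_b) / math.dist(rest_a, rest_b)

def lookup(stretch):
    """Piecewise-linear interpolation in the table, clamped at both ends."""
    if stretch <= STRETCHES[0]:
        return LOOP_HEIGHTS[0]
    if stretch >= STRETCHES[-1]:
        return LOOP_HEIGHTS[-1]
    i = bisect_right(STRETCHES, stretch) - 1
    t = (stretch - STRETCHES[i]) / (STRETCHES[i + 1] - STRETCHES[i])
    return (1.0 - t) * LOOP_HEIGHTS[i] + t * LOOP_HEIGHTS[i + 1]

# One mesh edge stretched by 10 percent drives the yarn-tile lookup.
s = stretch_ratio((0.0, 0.0), (1.0, 0.0), (0.0, 0.0), (1.1, 0.0))
print("stretch:", s, "loop height:", lookup(s))
```

The point of the design is that the expensive yarn mechanics are paid for once when the table is built; at runtime each triangle only needs a strain measurement and an interpolated lookup.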
- MeshTalk: 3D Face Animation from Speech using Cross-Modality Disentanglement CVPR Academic 278 cites
Disentangles audio-correlated and audio-uncorrelated facial motion via a categorical latent space and cross-modality loss for full-face animation.
Abstract
This paper presents a generic method for generating full facial 3D animation from speech. Existing approaches to audio-driven facial animation exhibit uncanny or static upper face animation, fail to produce accurate and plausible co-articulation or rely on person-specific models that limit their scalability. To improve upon existing models, we propose a generic audio-driven facial animation approach that achieves highly realistic motion synthesis results for the entire face. At the core of our approach is a categorical latent space for facial animation that disentangles audio-correlated and audio-uncorrelated information based on a novel cross-modality loss. Our approach ensures highly accurate lip motion, while also synthesizing plausible animation of the parts of the face that are uncorrelated to the audio signal, such as eye blinks and eyebrow motion. We demonstrate that our approach outperforms several baselines and obtains state-of-the-art quality both qualitatively and quantitatively. A perceptual user study demonstrates that our approach is deemed more realistic than the current state-of-the-art in over 75% of cases. We recommend watching the supplemental video before reading the paper: https://github.com/facebookresearch/meshtalk
Related FaceFormer: Speech-Driven 3D Facial Animation with Transformers · Capture, Learning, and Synthesis of 3D Speaking Styles · SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation · MoGlow: Probabilistic and Controllable Motion Synthesis Using Normalising Flows
How to read this
- Category
- Method: speech-driven full-face 3D animation
- Contributions
- A generic (not person-specific) audio-driven approach that synthesizes realistic motion for the entire face from speech
- A categorical latent space with a novel cross-modality loss that disentangles audio-correlated from audio-uncorrelated facial motion
- Accurate lip motion plus plausible animation of audio-uncorrelated parts such as eye blinks and brow motion, reported to be preferred over prior state of the art in a perceptual study
- Context
- Advances the speech-to-face-animation lineage, notably VOCA (Cudeiro et al., Capture, Learning, and Synthesis of 3D Speaking Styles), by addressing static upper-face motion and person-specific limitations through cross-modality disentanglement. Builds on: Capture, Learning, and Synthesis of 3D Speaking Styles
- Correctness
- Key assumption is that facial motion cleanly splits into audio-correlated and audio-uncorrelated components captured by a categorical latent; results rest on quantitative comparisons and a perceptual user study, so readers should weigh the subjective-preference evidence alongside the modelling assumption rather than as a hard accuracy guarantee.
- Clarity
- Accessible motivation; a first pass conveys the disentanglement idea, a second pass is needed for the categorical latent space and loss.
- How to read it
- Read pass one for the cross-modality disentanglement framing; do a second pass on the categorical latent and loss design, and watch the supplemental video, before judging the synthesis quality.
Facial / Motion Synthesis
-
The appearance of a real feather arises from light interacting with complex patterned structures across multiple scales that earlier simplified curve models do not capture.
Abstract
The appearance of a real feather arises from light interacting with complex patterned structures across multiple scales that earlier simplified curve models do not capture. Using imaging of real feathers, the authors show why prior approaches are insufficient and motivate a dedicated appearance model for feathers. They present a microstructure-based rendering approach that better reproduces feather appearance for computer graphics.
Related A Surface-based Appearance Model for Pennaceous Feathers · Procedurally Generating Biologically Driven Feathers · Biological Modeling of Feathers by Morphogenesis Simulation · Rendering Iridescent Rock Dove Neck Feathers
How to read this
- Category
- Method: microstructure-based appearance model for feathers
- Contributions
- Imaging of real feathers used to show why prior simplified curve models are insufficient for feather appearance
- A microstructure-based rendering approach that reproduces feather appearance across the multiple patterned scales at which light interacts with the structure
- Context
- Builds on the authors' earlier procedural feather work (Procedurally Generating Biologically Driven Feathers) and sits in the appearance-modelling lineage for fiber and fur-like structures, here targeting the multi-scale microstructure of feathers. Builds on: Procedurally Generating Biologically Driven Feathers
- Correctness
- The motivating claim is that single-scale curve models miss real feather appearance; validation is grounded in imaging of real feathers, so readers should treat it as a perceptually and observationally motivated model rather than a fully measured BRDF benchmark.
- Clarity
- Accessible motivation via real-feather imaging; a first pass conveys why prior models fail, a second pass clarifies the microstructure rendering details.
- How to read it
- Focus pass one on the multi-scale-structure argument and the comparison to curve models; a second pass on the rendering construction pays off if you need to implement feather shading.
CFX
-
Naughty Dog technical animators documented the challenges and solutions of shipping motion matching in The Last of Us Part II, from early excitement through production hurdles to final acclaim.
Motion Synthesis
-
Neural animation layering system enabling real-time synthesis of complex martial arts movements by composing motion layers with learned networks.
Abstract
Interactively synthesizing novel combinations and variations of character movements from different motion skills is a key problem in computer animation. In this paper, we propose a deep learning framework to produce a large variety of martial arts movements in a controllable manner from raw motion capture data. Our method imitates animation layering using neural networks with the aim to overcome typical challenges when mixing, blending and editing movements from unaligned motion sources. The framework can synthesize novel movements from given reference motions and simple user controls, and generate unseen sequences of locomotion, punching, kicking, avoiding and combinations thereof, but also reconstruct signature motions of different fighters, as well as close-character interactions such as clinching and carrying by learning the spatial joint relationships. To achieve this goal, we adopt a modular framework which is composed of the motion generator and a set of different control modules. The motion generator functions as a motion manifold that projects novel mixed/edited trajectories to natural full-body motions, and synthesizes realistic transitions between different motions.
Related Dog Code: Human to Quadruped Embodiment Using Shared Codebooks · DeepPhase: Periodic Autoencoders for Learning Motion Phase Manifolds · A Deep Learning Framework for Character Motion Synthesis and Editing · Automated Extraction and Parameterization of Motions in Large Data Sets
How to read this
- Category
- Method: neural animation layering for motion synthesis
- Contributions
- A deep learning framework that imitates animation layering to mix, blend, and edit movements from unaligned motion-capture sources in a controllable way
- A modular design of a motion generator (acting as a motion manifold) plus separate control modules that synthesizes novel locomotion, punching, kicking, avoiding, and combinations with realistic transitions
- Reconstruction of fighter-specific signature motions and close-character interactions such as clinching and carrying by learning spatial joint relationships
- Context
- Builds on Starke et al.'s Local Motion Phases for multi-contact movements and echoes the layered, scriptable-actor idea of Perlin's Improv, applying learned manifolds to martial-arts motion composition. Builds on: Local Motion Phases for Learning Multi-Contact Character Movements · Improv: A System for Scripting Interactive Actors in Virtual Worlds
- Correctness
- Assumes that a learned motion manifold can project mixed or edited trajectories back to natural full-body motion; demonstrated on martial-arts mocap, so readers should note generalization beyond the trained motion domain and interaction types is not established here.
- Clarity
- Accessible at the systems level; a first pass conveys the layering analogy and modular structure, a second pass clarifies the generator and control-module networks.
- How to read it
- Read pass one for the layering-as-neural-modules concept and the manifold idea; a second pass on the control modules and phase handling pays off if you want to build interactive real-time synthesis.
Motion Synthesis
- Neural Body: Implicit Neural Representations with Structured Latent Codes for Novel View Synthesis of Dynamic Humans CVPR Academic 859 cites
Anchors per-frame neural radiance latent codes to a deformable SMPL mesh so that sparse-view observations are integrated across time for dynamic human reconstruction.
Abstract
This paper addresses the challenge of novel view synthesis for a human performer from a very sparse set of camera views. Some recent works have shown that learning implicit neural representations of 3D scenes achieves remarkable view synthesis quality given dense input views. However, the representation learning will be ill-posed if the views are highly sparse. To solve this ill-posed problem, our key idea is to integrate observations over video frames. To this end, we propose Neural Body, a new human body representation which assumes that the learned neural representations at different frames share the same set of latent codes anchored to a deformable mesh, so that the observations across frames can be naturally integrated. The deformable mesh also provides geometric guidance for the network to learn 3D representations more efficiently. To evaluate our approach, we create a multi-view dataset named ZJU-MoCap that captures performers with complex motions. Experiments on ZJU-MoCap show that our approach outperforms prior works by a large margin in terms of novel view synthesis quality. We also demonstrate the capability of our approach to reconstruct a moving person from a monocular video on the People-Snapshot dataset.
Related Animatable Neural Radiance Fields for Modeling Dynamic Human Bodies · 3DGS-Avatar: Animatable Avatars via Deformable 3D Gaussian Splatting · S3: Neural Shape, Skeleton, and Skinning Fields for 3D Human Modeling · SMPLicit: Topology-aware Generative Model for Clothed People
How to read this
- Category
- Method: implicit neural representation for dynamic human view synthesis
- Contributions
- Neural Body, a human representation where per-frame neural fields share one set of latent codes anchored to a deformable mesh, integrating sparse-view observations across video frames
- Use of the deformable mesh as geometric guidance to learn 3D representations more efficiently from highly sparse views
- The ZJU-MoCap multi-view dataset of performers with complex motions, used to show large improvement in novel-view-synthesis quality over prior work
- Context
- Combines neural-radiance-field-style implicit scene representations with parametric body modelling via SMPL (Loper et al.), anchoring structured latent codes to the SMPL mesh to make sparse-view dynamic reconstruction well posed. Builds on: SMPL: A Skinned Multi-Person Linear Model
- Correctness
- Key assumption is that frames share latent codes anchored to an accurate deformable body mesh, so quality depends on body-fit accuracy and on motion being captured across frames; evaluated on the authors' ZJU-MoCap data, meaning readers should weigh results against that capture setting and the monocular case shown as a demonstration.
- Clarity
- Accessible idea (codes anchored to a mesh); a first pass conveys the integrate-over-frames insight, a second pass clarifies the latent-code structure and rendering.
- How to read it
- Focus pass one on the structured-latent-code-on-mesh idea and why it resolves the sparse-view ill-posedness; a second pass on the network and dataset pays off if you plan to reproduce or benchmark on ZJU-MoCap.
ML Deformation / Skinning
- PBNS: Physically Based Neural Simulation for Unsupervised Garment Pose Space Deformation SIGGRAPH Asia Academic 110 cites
Formulates physically based simulation as an implicit deep learning loss to learn garment pose space deformation bases for dressed humans without supervision, training in time comparable to a PBS of a few sequences.
Abstract
We present a methodology to automatically obtain Pose Space Deformation (PSD) basis for rigged garments through deep learning. Classical approaches rely on Physically Based Simulations (PBS) to animate clothes. These are general solutions that, given a sufficiently fine-grained discretization of space and time, can achieve highly realistic results. However, they are computationally expensive and any scene modification prompts the need of re-simulation. Linear Blend Skinning (LBS) with PSD offers a lightweight alternative to PBS, though, it needs huge volumes of data to learn proper PSD. We propose using deep learning, formulated as an implicit PBS, to unsupervisedly learn realistic cloth Pose Space Deformations in a constrained scenario: dressed humans. Furthermore, we show it is possible to train these models in an amount of time comparable to a PBS of a few sequences. To the best of our knowledge, we are the first to propose a neural simulator for cloth. While deep-based approaches in the domain are becoming a trend, these are data-hungry models. Moreover, authors often propose complex formulations to better learn wrinkles from PBS data. Supervised learning leads to physically inconsistent predictions that require collision solving to be used.
Related Neural Cloth Simulation · SNUG: Self-Supervised Neural Dynamic Garments · SwinGar: Spectrum-Inspired Neural Dynamic Deformation for Free-Swinging Garments · Motion Guided Deep Dynamic 3D Garments
How to read this
- Category
- Method: a neural cloth simulator (unsupervised garment PSD learning)
- Contributions
- Formulates physically based simulation as an implicit deep learning loss to learn garment Pose Space Deformation bases without ground-truth simulation data
- Trains a neural cloth model for dressed humans in time comparable to running PBS on a few sequences
- Presents what the authors describe as the first neural simulator for cloth
- Context
- Targets the same problem as learning-based virtual try-on (Santesteban et al.), replacing data-hungry supervised PSD learning with a self-supervised, physics-loss formulation over LBS-rigged garments. Builds on: Learning-Based Animation of Clothing for Virtual Try-On
- Correctness
- Validated in a constrained scenario (dressed humans on a parametric body), and being unsupervised it sidesteps PBS data collection, but the constrained setting and the chosen physics terms bound how general or accurate the learned deformations are.
- Clarity
- Accessible at the idea level; a first pass conveys the unsupervised-physics-loss concept, a second pass is needed for the loss formulation and PSD parameterization.
- How to read it
- First pass for the implicit-PBS-as-loss idea and why it avoids simulation data; do a second pass on the loss terms and the LBS plus PSD setup if you intend to reimplement or compare against PBS.
CFX / ML Deformation
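The physics-as-loss idea in the PBNS entry above amounts to scoring predicted garment vertices with energy terms instead of comparing them to simulated ground truth. The sketch below is an illustration only: the three terms (edge stretch, gravity potential, penetration penalty), their weights, and the sphere standing in for the body are assumptions chosen for brevity, not the paper's exact loss.

```python
import math

def physics_loss(verts, edges, rest_lengths, body_center, body_radius,
                 k_edge=1.0, k_collide=10.0, mass=0.01, g=9.81):
    """Unsupervised garment loss built from energy terms (y is up)."""
    # Stretch: squared deviation of each edge from its rest length.
    stretch = sum((math.dist(verts[i], verts[j]) - rest) ** 2
                  for (i, j), rest in zip(edges, rest_lengths))
    # Gravity: potential energy m * g * h summed over vertices.
    gravity = sum(mass * g * v[1] for v in verts)
    # Collision: squared penetration depth into a sphere (hypothetical body proxy).
    collide = sum(max(0.0, body_radius - math.dist(v, body_center)) ** 2
                  for v in verts)
    return k_edge * stretch + gravity + k_collide * collide

edges = [(0, 1)]
rest_lengths = [0.1]
body = ((0.0, 0.0, 0.0), 0.5)

resting = [(0.0, 1.0, 0.0), (0.1, 1.0, 0.0)]      # on rest length, outside the body
stretched = [(0.0, 1.0, 0.0), (0.3, 1.0, 0.0)]    # edge pulled to three times its length
penetrating = [(0.0, 0.4, 0.0), (0.1, 0.4, 0.0)]  # both vertices inside the sphere

for name, verts in [("resting", resting), ("stretched", stretched),
                    ("penetrating", penetrating)]:
    print(name, physics_loss(verts, edges, rest_lengths, *body))
```

In a training loop this scalar would be computed on the network's predicted vertices and differentiated by an autodiff framework; no simulated sequences are needed, which is the property the entry highlights.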
- PFPN: Continuous Control of Physically Simulated Characters using Particle Filtering Policy Network MIG Academic 3 cites
Particle filtering policy network enabling continuous control of physically simulated characters with improved robustness and diversity.
Abstract
Data-driven methods for physics-based character control using reinforcement learning have been successfully applied to generate high-quality motions. However, existing approaches typically rely on Gaussian distributions to represent the action policy, which can prematurely commit to suboptimal actions when solving high-dimensional continuous control problems for highly-articulated characters. In this paper, to improve the learning performance of physics-based character controllers, we propose a framework that considers a particle-based action policy as a substitute for Gaussian policies. We exploit particle filtering to dynamically explore and discretize the action space, and track the posterior policy represented as a mixture distribution. The resulting policy can replace the unimodal Gaussian policy which has been the staple for character control problems, without changing the underlying model architecture of the reinforcement learning algorithm used to perform policy optimization. We demonstrate the applicability of our approach on various motion capture imitation tasks. Baselines using our particle-based policies achieve better imitation performance and speed of convergence as compared to corresponding implementations using Gaussians, and are more robust to external perturbations during character control. Related code is available at: https://motion-lab.github.io/PFPN.
Related Composite Motion Learning with Task Control · DReCon: Data-Driven Responsive Control of Physics-Based Characters · UniCon: Universal Neural Controller for Physics-Based Character Motion · CALM: Conditional Adversarial Latent Models for Directable Virtual Characters
How to read this
- Category
- Method: a reinforcement-learning policy representation for physics-based character control
- Contributions
- Proposes a particle-based action policy (PFPN) as a drop-in substitute for the usual Gaussian policy in RL character control
- Uses particle filtering to dynamically explore and discretize the action space and track a mixture posterior policy
- Reports improved imitation performance and convergence speed on motion-capture imitation tasks without changing the underlying RL architecture
- Context
- Builds on example-guided physics-based RL control in the DeepMimic lineage (Peng et al.), addressing the limitation that unimodal Gaussian policies can commit prematurely in high-dimensional articulated control. Builds on: DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills
- Correctness
- Demonstrated on motion-capture imitation tasks as a policy substitution, so gains are shown within that setting; the particle representation adds machinery and the benefit outside imitation or at different action dimensionalities is not something to assume from the abstract.
- Clarity
- Moderately technical; a first pass conveys the Gaussian-versus-particle-policy motivation, a second pass is needed for the particle-filtering update and mixture-policy details.
- How to read it
- First pass to grasp why a multimodal particle policy helps articulated control; do a second pass on the particle-filter mechanics only if you work in RL motion imitation and want to swap the policy class.
Motion Synthesis
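To make the Gaussian-versus-particle contrast in the PFPN entry above concrete, here is a toy one-dimensional particle policy. It is a simplified sketch, not the paper's update rule: in particular `resample` copies both location and logit, which changes the mixture, and the logits here are fixed numbers where a real policy network would output them per state. All names are hypothetical.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_action(locations, logits, sigma, rng):
    """Draw from a Gaussian mixture: pick a particle by weight, then jitter around it."""
    weights = softmax(logits)
    i = rng.choices(range(len(locations)), weights=weights)[0]
    return rng.gauss(locations[i], sigma), i

def resample(locations, logits, threshold, rng):
    """Relocate particles with negligible weight onto particles drawn by weight.

    Simplification: location and logit are both copied, so the mixture changes.
    """
    weights = softmax(logits)
    new_locs, new_logits = list(locations), list(logits)
    for i, w in enumerate(weights):
        if w < threshold:
            j = rng.choices(range(len(locations)), weights=weights)[0]
            new_locs[i], new_logits[i] = locations[j], logits[j]
    return new_locs, new_logits

rng = random.Random(0)
locations = [-1.0, 0.0, 1.0]   # particle positions in a 1D action space
logits = [0.0, 0.0, -50.0]     # third particle is effectively dead

locations, logits = resample(locations, logits, 0.01, rng)
action, particle = sample_action(locations, logits, 0.1, rng)
print("particles:", locations, "sampled action:", action, "from particle", particle)
```

Unlike a single Gaussian, the mixture can keep probability mass on two separated actions at once, and resampling moves unused particles toward regions the policy currently favours.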
-
Upsampling approach that adds high-resolution wrinkle detail to low-resolution cloth simulations using physics-inspired priors for game use.
Abstract
Proposes a data-driven method for learning linear upsampling operators that enrich coarse cloth simulation meshes with mid-scale details while maintaining interactive performance. Uses harmonic regularization to fit training data without overfitting, and employs tracking constraints with harmonic test functions to align coarse and fine-scale simulations. Demonstrates generalization to unseen conditions like different wind velocities and novel character motions.
Related Continuum-based Strain Limiting · Directing Cloth Draping through Blended UVs · Strain Based Dynamics · Untangling Cloth
How to read this
- Category
- Method: data-driven cloth upsampling for interactive/game use
- Contributions
- Learns linear upsampling operators that add mid-scale detail to coarse cloth meshes at interactive performance
- Uses harmonic regularization to fit training data without overfitting and harmonic test functions as tracking constraints to align coarse and fine simulations
- Demonstrates generalization to unseen conditions such as different wind velocities and novel character motions
- Context
- Sits in the coarse-to-fine cloth detailing line and pairs with position-based / compliant-constraint simulators (XPBD, Macklin et al.) as the underlying low-resolution solver it enriches. Builds on: XPBD: Position-Based Simulation of Compliant Constrained Dynamics
- Correctness
- Linear operators keep the method fast and the harmonic constraints aim to curb overfitting, with generalization shown on wind and motion variation; being a learned upsampler it adds plausible mid-scale detail rather than physically exact fine-scale dynamics, so accuracy versus a full high-resolution sim is a caveat.
- Clarity
- Accessible framing for practitioners; a first pass conveys the upsampling idea and the games target, while a second pass is needed for the harmonic regularization and tracking-constraint math.
- How to read it
- First pass for whether learned upsampling fits your real-time pipeline; do a second pass on the harmonic test-function constraints if you need to train operators for your own cloth assets.
CFX
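The linear-operator idea in the cloth upsampling entry above can be sketched as a regularized least-squares fit. This is a minimal sketch under stated assumptions, not the paper's method: it swaps in a plain ridge penalty for the harmonic regularization, leaves out the tracking constraints, and trains on synthetic data. The names `fit_upsampler` and `upsample` and the array shapes are hypothetical.

```python
import numpy as np

def fit_upsampler(coarse, fine, lam=1e-6):
    """Fit U minimizing ||U @ coarse - fine||^2 + lam * ||U||^2.

    coarse: (n_coarse, n_samples), fine: (n_fine, n_samples).
    The ridge term is a stand-in (assumption) for harmonic regularization.
    """
    n_coarse = coarse.shape[0]
    gram = coarse @ coarse.T + lam * np.eye(n_coarse)
    # Normal equations: U @ gram = fine @ coarse.T; solve the transposed system.
    return np.linalg.solve(gram, coarse @ fine.T).T

def upsample(U, coarse_frame):
    """Apply the learned linear operator to one coarse frame."""
    return U @ coarse_frame

# Synthetic training data: a hidden linear map from 4 coarse to 12 fine values.
rng = np.random.default_rng(0)
U_true = rng.normal(size=(12, 4))
train_coarse = rng.normal(size=(4, 50))
train_fine = U_true @ train_coarse

U_fit = fit_upsampler(train_coarse, train_fine, lam=1e-9)

# Evaluate on a coarse frame that was not in the training set.
test_coarse = rng.normal(size=4)
pred = upsample(U_fit, test_coarse)
```

Because the operator is a single matrix, applying it at runtime is one matrix-vector product per frame, which is consistent with the interactive-performance goal the entry describes.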
-
Blue Sky Studios defines modular Houdini-based element blocks that allow artists to contribute to USD character and shot pipelines without deep USD knowledge.
abstract
We present a procedural block-based approach for USD pipelines that minimizes up-front USD knowledge requirements while ensuring users can still leverage the power of native USD. Building on USD and Conduit, we define fundamental workflow principles and philosophies on artist-interaction that guide our modular Houdini-based toolsets. Finally, we discuss the successes and challenges in scaling these workflows into production.
Related A Pipeline Retrospective on USD and Conduit · USD in Production · Universal Scene Description: Open Source Release · Combining the Benefits of Nodes and Layers in a USD World
how to read this
- Category
- Production talk / pipeline design (USD workflows)
- Contributions
- Demonstrates a procedural block-based approach to USD pipelines that lowers the up-front USD knowledge artists need while keeping native USD power available
- Defines workflow principles and modular Houdini-based toolsets layered on USD and Conduit
- Discusses successes and challenges in scaling these workflows into production
- Context
- Extends Blue Sky Studios' Conduit pipeline work (Staeubli et al.) and its USD-plus-Conduit retrospective (Hallac et al.), focusing on artist-facing modular blocks atop that foundation. Builds on: Conduit: A Modern Pipeline for the Open Source World · A Pipeline Retrospective on USD and Conduit
- Correctness
- Studio practice, not peer-reviewed; the block approach is production-proven at one studio with its own Conduit-based stack, so portability to other pipelines and tools is not guaranteed.
- Clarity
- Accessible and practitioner-oriented; a single read conveys the philosophy, with details living in the toolset and pipeline specifics.
- How to read it
- Read once for the modular-block philosophy and the artist-abstraction-over-USD argument; revisit the scaling successes-and-challenges section if you are designing a USD pipeline for non-USD-fluent artists.
Rigging
-
Real-time neural character deformation combining skeleton-driven skinning with learned dynamic detail including cloth and soft-tissue effects.
abstract
This paper proposes a deep videorealistic 3D human character model that displays highly realistic shape, motion, and dynamic appearance learned in a weakly supervised way from multi-view imagery. In contrast to prior work, the controllable character displays motion-dependent dynamics such as the swing of a skirt without requiring physics simulation, along with a learned dynamic texture model that captures motion-dependent appearance and view-dependent lighting effects. The method uses a parametric differentiable character representation combined with embedded deformation and per-vertex displacements regressed by a novel structure-aware graph convolutional network, and a neural generative dynamic texture model, all trained from multi-view video using differentiable rendering. Taking only a skeletal motion and camera view as input, the model creates physically plausible clothing deformations and video-realistic textures in real time.
Related NeuroSkinning: Automatic Skin Binding for Production Characters with Deep Graph Networks · Codec Avatars: Photorealistic Telepresence at Scale · Animatable Neural Radiance Fields for Modeling Dynamic Human Bodies · Learning Skeletal Articulations with Neural Blend Shapes
how to read this
- Category
- Method: real-time learned dynamic character model (deformation plus appearance)
- Contributions
- Learns a controllable, videorealistic 3D human with motion-dependent dynamics (such as a swinging skirt) without physics simulation, weakly supervised from multi-view video
- Combines a parametric differentiable character with embedded deformation and per-vertex displacements regressed by a structure-aware graph convolutional network
- Adds a neural generative dynamic texture model for motion-dependent appearance and view-dependent lighting, producing real-time output from skeletal motion and camera view
- Context
- Advances learned deformation approximation in the spirit of Fast and Deep Deformation Approximations (Bailey et al.), extending it to clothing dynamics and a jointly learned dynamic texture trained via differentiable rendering. Builds on: Fast and Deep Deformation Approximations
- Correctness
- Trained in a weakly supervised way from multi-view imagery and producing physically plausible (not simulated) deformations, so results depend on the captured motions and views. Two caveats apply: the output is plausible rather than physically accurate, and generalization beyond the training distribution is not guaranteed.
- Clarity
- Dense, multi-component system; a first pass conveys the skeleton-plus-learned-dynamics-plus-dynamic-texture pipeline, while second and third passes are needed for the graph-network and differentiable-rendering training details.
- How to read it
- First pass for the overall real-time architecture and what each module contributes; do a second pass on the structure-aware GCN and dynamic texture model if neural character rendering is your focus.
ML Deformation / Skinning
- Reinventing a Character Creation Pipeline Using Landmarking, Simulation, and Shared Character Data SIGGRAPH Industrial 0 cites
Blue Sky Studios' Universal Mesh-centered pipeline integrating automated rig landmark placement, geometry-driven rigging tools, and stylized simulation-based deformation.
abstract
Reinventing the humanoid character build pipeline at Blue Sky Studios presented several opportunities to create synergy between different processes, all centering around the concept of creating and maintaining a standardized Universal Mesh. The automation of rig argument placement, the building of rigging tools that use aspects of character geometry as inputs, the separation of character data from assets, and the creation of a stylized simulation-based approach to deform animated characters were all influenced by this base Universal Mesh. Ultimately, this approach offered new ways to get the most out of our character pipeline.
Related Delta Mush: Smoothing Deformations While Preserving Detail · Patch-based Surface Relaxation · Flesh, Flab, and Fascia Simulation on Zootopia · Mobilizing Mocap, Motion Blending, and Mayhem: Rig Interoperability for Crowd Simulation on Incredibles 2
how to read this
- Category
- Production talk / character pipeline redesign
- Contributions
- Centers the humanoid character build pipeline on a standardized Universal Mesh to create synergy across processes
- Automates rig argument (landmark) placement and builds rigging tools that take character geometry as inputs
- Separates character data from assets and introduces a stylized simulation-based approach to deform animated characters
- Context
- Continues Blue Sky Studios' rigging-pipeline work on achieving and maintaining real-time rigs (Hallac et al.), reorganizing it around a shared base mesh and geometry-driven tooling. Builds on: Achieving and Maintaining Real-Time Rigs
- Correctness
- Studio practice, not peer-reviewed; the Universal Mesh approach is production-proven at Blue Sky on stylized characters, so its fit for other studios or non-stylized, topology-varying characters should not be assumed.
- Clarity
- Accessible to technical-artist and pipeline readers; one read conveys the Universal-Mesh organizing idea and how landmarking, rigging, and simulation hang off it.
- How to read it
- Read once for the Universal-Mesh-as-shared-source concept and the data-versus-asset separation; revisit the landmarking and geometry-driven rigging sections if you are restructuring a character build pipeline.
Rigging / Skinning
-
A closer look at MetaHuman rigs in Maya and Unreal Engine, covering customization workflows for high-fidelity digital human characters.
Rigging / Facial / Skinning
-
Jointly learns implicit shape, skeleton topology, and skinning weight fields from data, enabling animation and novel view synthesis from RGB or LiDAR input.
abstract
Constructing and animating humans is an important component for building virtual worlds in a wide variety of applications such as virtual reality or robotics testing in simulation. As there are exponentially many variations of humans with different shape, pose and clothing, it is critical to develop methods that can automatically reconstruct and animate humans at scale from real world data. Towards this goal, we represent the pedestrian’s shape, pose and skinning weights as neural implicit functions that are directly learned from data. This representation enables us to handle a wide variety of different pedestrian shapes and poses without explicitly fitting a human parametric body model, allowing us to handle a wider range of human geometries and topologies. We demonstrate the effectiveness of our approach on various datasets and show that our reconstructions outperform existing state-of-the-art methods. Furthermore, our re-animation experiments show that we can generate 3D human animations at scale from a single RGB image (and/or an optional LiDAR sweep) as input.
Related One Model to Rig Them All: Diverse Skeleton Rigging with UniRig · Animatable Neural Radiance Fields for Modeling Dynamic Human Bodies · Learning Skeletal Articulations with Neural Blend Shapes · MoRig: Motion-Aware Rigging of Character Meshes from Point Clouds
how to read this
- Category
- Method: neural implicit human modeling (shape, skeleton, skinning fields)
- Contributions
- Jointly represents a person's shape, skeleton/pose, and skinning weights as neural implicit functions learned directly from data
- Avoids explicitly fitting a parametric body model, handling a wider range of human geometries and topologies
- Reconstructs and re-animates 3D humans at scale from a single RGB image with an optional LiDAR sweep
- Context
- Positions itself relative to parametric body models like SMPL (Loper et al.), replacing the fixed template with learned implicit shape, skeleton, and skinning fields. Builds on: SMPL: A Skinned Multi-Person Linear Model
- Correctness
- Demonstrated on pedestrian-style datasets with RGB and optional LiDAR input and reported to outperform prior reconstructions; as a learned implicit method, output fidelity depends on the training data distribution and on how much the optional LiDAR sweep contributes, which a reader should weigh.
- Clarity
- Technical but well-motivated; a first pass conveys the three-field decomposition and the template-free argument, while a second pass is needed for the field formulations and training.
- How to read it
- First pass for why dropping the parametric template helps with diverse geometries; do a second pass on the skeleton and skinning field definitions if you build implicit animatable humans.
Skinning / ML Deformation / Rigging
-
Converts raw 3D scans of clothed humans into animatable avatars using locally pose-aware implicit functions for pose-dependent correctives without mesh registration.
abstract
We present SCANimate, an end-to-end trainable framework that takes raw 3D scans of a clothed human and turns them into an animatable avatar. These avatars are driven by pose parameters and have realistic clothing that moves and deforms naturally. SCANimate does not rely on a customized mesh template or surface mesh registration. We observe that fitting a parametric 3D body model, like SMPL, to a clothed human scan is tractable while surface registration of the body topology to the scan is often not, because clothing can deviate significantly from the body shape. We also observe that articulated transformations are invertible, resulting in geometric cycle-consistency in the posed and unposed shapes. These observations lead us to a weakly supervised learning method that aligns scans into a canonical pose by disentangling articulated deformations without template-based surface registration. Furthermore, to complete missing regions in the aligned scans while modeling pose-dependent deformations, we introduce a locally pose-aware implicit function that learns to complete and model geometry with learned pose correctives. In contrast to commonly used global pose embeddings, our local pose conditioning significantly reduces long-range spurious correlations and improves generalization to unseen poses, especially when training data is limited.
Related SMPLicit: Topology-aware Generative Model for Clothed People · NiLBS: Neural Inverse Linear Blend Skinning · Animatable Neural Radiance Fields for Modeling Dynamic Human Bodies · Learning Skeletal Articulations with Neural Blend Shapes
how to read this
- Category
- Method: weakly supervised animatable clothed avatars from scans
- Contributions
- End-to-end framework turning raw 3D scans of clothed humans into pose-driven animatable avatars without a custom mesh template or surface registration
- Uses geometric cycle-consistency of invertible articulated transformations to canonicalize scans in a weakly supervised way
- Introduces a locally pose-aware implicit function that completes missing regions and models pose-dependent deformations via learned local pose correctives
- Context
- Builds on fitting the SMPL parametric body (Loper et al.) to scans, then departs from template surface registration by learning local pose-aware implicit correctives instead of a global pose embedding. Builds on: SMPL: A Skinned Multi-Person Linear Model
- Correctness
- Relies on two observations: that fitting SMPL to clothed scans is tractable, and that articulated transforms are invertible (cycle-consistent); training is weakly supervised. The learned correctives are most stretched by clothing that deviates far from the body and by scan completion in unseen regions, so a reader should keep generalization limits in mind.
- Clarity
- Conceptually rich; a first pass conveys the canonicalization-via-cycle-consistency and local-implicit-corrective ideas, while a second pass is needed for the disentanglement and implicit-function formulation.
- How to read it
- First pass for the registration-free canonicalization and local-pose-corrective insight; do a second pass on the cycle-consistency loss and local implicit function if you build avatars from scans.
Skinning / CFX / ML Deformation
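The cycle-consistency observation in the SCANimate entry above can be illustrated with blended rigid transforms: unposing a point and reposing it returns the original point only when the skinning weights used in both directions agree. This is a minimal sketch with hand-picked weights and two toy bones; in the paper the weights are predicted by networks, and every name here (`rigid`, `blend`, `unpose`, `repose`, `cycle_loss`) is hypothetical.

```python
import numpy as np

def rigid(angle_z, translation):
    """4x4 rigid transform: rotation about z followed by a translation."""
    c, s = np.cos(angle_z), np.sin(angle_z)
    T = np.eye(4)
    T[:3, :3] = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    T[:3, 3] = translation
    return T

def blend(weights, bones):
    """Linear blend of bone transforms; weights sum to 1."""
    return np.tensordot(weights, bones, axes=1)

def unpose(x_posed, w_posed, bones):
    """Map a posed point to canonical space by inverting the blended transform."""
    return np.linalg.solve(blend(w_posed, bones), np.append(x_posed, 1.0))[:3]

def repose(x_canon, w_canon, bones):
    """Map a canonical point back to posed space."""
    return (blend(w_canon, bones) @ np.append(x_canon, 1.0))[:3]

def cycle_loss(x_posed, w_posed, w_canon, bones):
    """Squared distance between a posed point and its unpose-repose round trip."""
    x_canon = unpose(x_posed, w_posed, bones)
    return float(np.sum((repose(x_canon, w_canon, bones) - x_posed) ** 2))

bones = np.stack([rigid(0.0, [0.0, 0.0, 0.0]), rigid(0.6, [0.1, 0.0, 0.0])])
x = np.array([0.5, 0.2, 0.1])
w = np.array([0.7, 0.3])

consistent = cycle_loss(x, w, w, bones)                       # same weights both ways
inconsistent = cycle_loss(x, w, np.array([0.2, 0.8]), bones)  # mismatched weights
```

The loss vanishes when the two weight fields agree and grows when they disagree, which is the kind of signal that can supervise canonicalization without registered templates.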
- Semi-Supervised Video-Driven Facial Animation Transfer for Production SIGGRAPH Asia Industrial 30 cites
Unsupervised image-to-image translation learns a shared latent space from video, then a supervised linear mapping drives character facial animation coefficients for production.
abstract
We propose a simple algorithm for automatic transfer of facial expressions, from videos to a 3D character, as well as between distinct 3D characters through their rendered animations. Our method begins by learning a common, semantically-consistent latent representation for the different input image domains using an unsupervised image-to-image translation model. It subsequently learns, in a supervised manner, a linear mapping from the character images' encoded representation to the animation coefficients. At inference time, given the source domain (i.e., actor footage), it regresses the corresponding animation coefficients for the target character. Expressions are automatically remapped between the source and target identities despite differences in physiognomy. We show how our technique can be used in the context of markerless motion capture with controlled lighting conditions, for one actor and for multiple actors. Additionally, we show how it can be used to automatically transfer facial animation between distinct characters without consistent mesh parameterization and without engineered geometric priors. We compare our method with standard approaches used in production and with recent state-of-the-art models on single camera face tracking.
Related A Facial Motion Retargeting Pipeline for Appearance Agnostic 3D Characters · Dog Code: Human to Quadruped Embodiment Using Shared Codebooks · Transferring Facial Expressions to Different Face Models · Facial Retargeting with Automatic Range of Motion Alignment
how to read this
- Category
- Method: production facial animation transfer (video-driven, semi-supervised)
- Contributions
- Learns a common semantically consistent latent space across image domains using an unsupervised image-to-image translation model
- Learns a supervised linear mapping from encoded character images to animation coefficients, regressing them from actor footage at inference
- Remaps expressions across differing physiognomies and transfers facial animation between characters without consistent mesh parameterization or engineered geometric priors
- Context
- Extends the authors' production facial-capture work (Masquerade, Moser et al.), shifting from fine-scale detail recovery toward cross-domain expression transfer via a shared learned latent space. Builds on: Masquerade: Fine-Scale Details for Head-Mounted Camera Motion Capture Data
- Correctness
- Demonstrated for markerless capture under controlled lighting for single and multiple actors and compared against standard production approaches; the controlled-lighting setting and the reliance on a learned shared latent space are practical assumptions a reader should note before generalizing to in-the-wild footage.
- Clarity
- Accessible given the simple two-stage (unsupervised latent plus supervised linear) recipe; a first pass conveys the pipeline, and a second pass clarifies the translation model and coefficient regression.
- How to read it
- First pass for the latent-space-then-linear-map recipe and its production framing; do a second pass on the image-to-image model and the supervised mapping if you work on retargeting capture to characters.
Facial / Retargeting
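The second, supervised stage of the facial transfer entry above (a linear map from encoded images to animation coefficients) can be sketched as an ordinary least-squares fit. This is a minimal sketch, assuming latent codes are already available from some encoder; the random latents, the bias column, and the names `fit_linear_map` and `regress_coeffs` are illustrative assumptions, and the unsupervised image-to-image translation stage is not shown.

```python
import numpy as np

def fit_linear_map(latents, coeffs):
    """Least-squares fit of coeffs ~ [latents, 1] @ W.

    latents: (n_frames, latent_dim), coeffs: (n_frames, n_controls).
    """
    X = np.hstack([latents, np.ones((latents.shape[0], 1))])  # append bias column
    W, *_ = np.linalg.lstsq(X, coeffs, rcond=None)
    return W  # (latent_dim + 1, n_controls)

def regress_coeffs(W, latent):
    """Predict animation coefficients for one latent code."""
    return np.append(latent, 1.0) @ W

# Synthetic stand-ins for rendered character frames with known coefficients.
rng = np.random.default_rng(1)
W_true = rng.normal(size=(9, 5))            # 8 latent dims + bias, 5 controls
char_latents = rng.normal(size=(200, 8))
char_coeffs = np.hstack([char_latents, np.ones((200, 1))]) @ W_true

W = fit_linear_map(char_latents, char_coeffs)

# At inference, a latent encoded from actor footage goes through the same map.
actor_latent = rng.normal(size=8)
predicted = regress_coeffs(W, actor_latent)
```

The map transfers to actor footage only to the extent that the shared latent space is semantically consistent across domains, which is the role the entry assigns to the unsupervised stage.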
-
Step-by-step guide to configuring the MotionBuilder Live Link plugin for streaming real-time mocap and character animation from MotionBuilder 2022 directly into Unreal Engine 5 for virtual production previsualization.
abstract
This tutorial walks through setting up Live Link to stream real-time animation from MotionBuilder 2022 into Unreal Engine 5. It covers enabling the Live Link 2.0 plugin in Unreal, installing the MoBu Live Link plugin by extracting the correct binaries into the Autodesk MotionBuilder 2022 plugins folder, and connecting the two via the UE Live Link device and a Message Bus Source. It then shows streaming the mannequin skeleton hierarchy using the subject selector, enabling camera sync, and driving an in-scene character by building an animation blueprint with a Live Link Pose node fed a local-space ref pose. The author also warns that cloth and dynamics can crash when an actor leaves the capture volume and recommends the more stable Chaos implementation for mocap shoots.
Related Autodesk MotionBuilder 2022 · How to Retarget Motion Capture in MotionBuilder · How to use the Motionbuilder STORY to IMPROVE your WORKFLOW · MotionBuilder: Essentials Characterization, Retargeting and Baking Animations
how to read this
- Category
- Production tutorial / tooling walkthrough (real-time mocap streaming)
- Contributions
- Walks through enabling Live Link 2.0 in Unreal Engine 5 and installing the MoBu Live Link plugin by placing the correct binaries in the MotionBuilder 2022 plugins folder
- Shows connecting via the UE Live Link device and a Message Bus Source, streaming the mannequin skeleton with the subject selector, and enabling camera sync
- Drives an in-scene character through an animation blueprint with a Live Link Pose node fed a local-space ref pose, and warns that cloth and dynamics can crash when an actor leaves the capture volume, recommending the Chaos implementation
- Context
- Practical previsualization/virtual-production workflow connecting MotionBuilder 2022 to Unreal Engine 5 via the Live Link plugin, rather than building on a research lineage.
- Correctness
- Studio/tooling practice, not peer-reviewed; steps are version-specific to MotionBuilder 2022 and UE5 and reflect hands-on experience, including the noted cloth/dynamics crash on volume exit, so exact menus and stability may differ across versions.
- Clarity
- Step-by-step and accessible; following it once end-to-end is the way to absorb it, no second analytic pass needed.
- How to read it
- Follow along in the actual software with matching versions; pay attention to the binary-placement step, the Message Bus connection, the local-space ref pose, and the Chaos-cloth stability warning.
Retargeting / Motion Synthesis
-
Production cloth simulator using bilinear quad elements, improving stability and artist controllability over traditional triangle-based approaches.
abstract
The most widely used cloth simulation algorithms within the computer graphics community are defined exclusively for triangle meshes. However, assets used in production are often made up of non-planar quadrilaterals. Dividing these elements into triangles and then mapping the displacements back to the original mesh results in faceting and tent-like artifacts when quadrilaterals are rendered as bilinear patches. We propose a method to simulate cloth dynamics on quadrilateral meshes directly, drawing on the well studied Koiter thin sheet model [Koiter 1960] to define consistent elastic energies for linear and bilinear elements. The algorithm elides the need for artifact-prone geometric mapping, and has computation times similar to its fully triangular counterpart.
Related Discrete Shells · Dynamic Deformables: Implementation and Production Practicalities · Anisotropic Elastoplasticity for Cloth, Knit and Hair Frictional Contact · Nonlinear Cloth Simulation with Isogeometric Analysis
how to read this
- Category
- Method: a cloth simulation algorithm for quad meshes
- Contributions
- Simulates cloth dynamics directly on quadrilateral meshes instead of triangulating them first
- Defines consistent elastic energies for linear and bilinear elements via the Koiter thin sheet model
- Avoids artifact-prone geometric mapping while keeping computation times similar to triangle-based methods
- Context
- Sits in the lineage of implicit cloth dynamics (Baraff and Witkin's Large Steps in Cloth Simulation) but reformulates the element energy around bilinear quad patches using the Koiter 1960 thin sheet model. Builds on: Large Steps in Cloth Simulation
- Correctness
- Aimed at production assets built from non-planar quads where triangulation causes faceting and tent-like artifacts; the claim of triangle-comparable cost and the soundness of the bilinear energy are best confirmed against the paper's own examples rather than assumed across all garment types.
- Clarity
- Motivation and the artifact it removes are accessible on a first pass; the Koiter-based energy derivation needs a second pass.
- How to read it
- First pass for why quads beat triangulate-then-map; do a second pass on the elastic energy formulation if you need to implement the bilinear element.
CFX
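The faceting problem described in the quad-cloth entry above follows from simple geometry: on a non-planar quadrilateral, the bilinear patch and a triangulation along one diagonal disagree about where the surface is. The sketch below evaluates both at the quad center; the corner coordinates are made up for illustration, and `bilinear_point` is a hypothetical helper, not code from the paper.

```python
import numpy as np

def bilinear_point(p00, p10, p11, p01, u, v):
    """Evaluate the bilinear patch spanned by four corners at (u, v) in [0, 1]^2."""
    return ((1 - u) * (1 - v) * p00 + u * (1 - v) * p10
            + u * v * p11 + (1 - u) * v * p01)

# A non-planar quad: two opposite corners are lifted in z.
p00 = np.array([0.0, 0.0, 0.0])
p10 = np.array([1.0, 0.0, 1.0])
p11 = np.array([1.0, 1.0, 0.0])
p01 = np.array([0.0, 1.0, 1.0])

# Surface point at the patch center when rendered as a bilinear patch.
center = bilinear_point(p00, p10, p11, p01, 0.5, 0.5)

# Same parametric location after splitting along the p00-p11 diagonal:
# the center lies on the shared edge of the two triangles.
diag_mid = 0.5 * (p00 + p11)

# Distance between the two surfaces at the center of the quad.
gap = float(np.linalg.norm(center - diag_mid))
```

For this quad the two surfaces are half a unit apart at the center, which is the kind of mismatch that shows up as tent-like artifacts when triangle displacements are mapped back onto quads.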
-
An implicit function conditioned on SMPL parameters and a semantically interpretable latent code generates clothing of diverse topologies including open jackets, skirts, and shoes.
abstract
In this paper we introduce SMPLicit, a novel generative model to jointly represent body pose, shape and clothing geometry. In contrast to existing learning-based approaches that require training specific models for each type of garment, SMPLicit can represent in a unified manner different garment topologies (e.g. from sleeveless tops to hoodies and to open jackets), while controlling other properties like the garment size or tightness/looseness. We show our model to be applicable to a large variety of garments including T-shirts, hoodies, jackets, shorts, pants, skirts, shoes and even hair. The representation flexibility of SMPLicit builds upon an implicit model conditioned with the SMPL human body parameters and a learnable latent space which is semantically interpretable and aligned with the clothing attributes. The proposed model is fully differentiable, allowing for its use into larger end-to-end trainable systems. In the experimental section, we demonstrate SMPLicit can be readily used for fitting 3D scans and for 3D reconstruction in images of dressed people. In both cases we are able to go beyond state of the art, by retrieving complex garment geometries, handling situations with multiple clothing layers and providing a tool for easy outfit editing.
Related SCANimate: Weakly Supervised Learning of Skinned Clothed Avatar Networks · N-Cloth: Predicting 3D Cloth Deformation with Mesh-Based Networks · SNUG: Self-Supervised Neural Dynamic Garments · Motion Guided Deep Dynamic 3D Garments
how to read this
- Category
- Method: a generative model for clothed human geometry
- Contributions
- A single generative model that represents many garment topologies (sleeveless tops, hoodies, open jackets, skirts, shoes, hair) in a unified way
- An implicit model conditioned on SMPL body parameters plus a semantically interpretable, attribute-aligned latent space controlling size and tightness
- A fully differentiable formulation usable for fitting 3D scans and reconstructing dressed people from images
- Context
- Builds on learning-based clothed-body modeling (Ma et al.'s CAPE, Learning to Dress 3D People in Generative Clothing) and extends the SMPL body model with an implicit, topology-aware clothing representation. Builds on: Learning to Dress 3D People in Generative Clothing
- Correctness
- Strength is topology flexibility from a single model and differentiability for downstream fitting; reported gains are demonstrated on 3D scan fitting and image-based reconstruction, so generalization beyond those settings and to extreme poses or unusual garments should be read cautiously.
- Clarity
- High-level idea (implicit clothing conditioned on SMPL plus interpretable latent) is accessible; the implicit-function training and conditioning details reward a second pass.
- How to read it
- First pass for the representation idea and what the latent code controls; second pass on the implicit formulation and fitting pipeline if you plan to reuse or extend it.
CFX / ML Deformation
- SNARF: Differentiable Forward Skinning for Animating Non-Rigid Neural Implicit Shapes CVPR Academic 270 cites
Proposes forward skinning for neural implicit surfaces via iterative root finding with analytical implicit-differentiation gradients, enabling end-to-end training from posed meshes.
abstract
Neural implicit surface representations have emerged as a promising paradigm to capture 3D shapes in a continuous and resolution-independent manner. However, adapting them to articulated shapes is non-trivial. Existing approaches learn a backward warp field that maps deformed to canonical points. However, this is problematic since the backward warp field is pose dependent and thus requires large amounts of data to learn. To address this, we introduce SNARF, which combines the advantages of linear blend skinning (LBS) for polygonal meshes with those of neural implicit surfaces by learning a forward deformation field without direct supervision. This deformation field is defined in canonical, pose-independent, space, enabling generalization to unseen poses. Learning the deformation field from posed meshes alone is challenging since the correspondences of deformed points are defined implicitly and may not be unique under changes of topology. We propose a forward skinning model that finds all canonical correspondences of any deformed point using iterative root finding. We derive analytical gradients via implicit differentiation, enabling end-to-end training from 3D meshes with bone transformations. Compared to state-of-the-art neural implicit representations, our approach generalizes better to unseen poses while preserving accuracy.
Related Invertible Neural Skinning · Animatable Neural Radiance Fields for Modeling Dynamic Human Bodies · Learning Skeletal Articulations with Neural Blend Shapes · 3DGS-Avatar: Animatable Avatars via Deformable 3D Gaussian Splatting
how to read this
- Category
- Method: forward skinning for neural implicit shapes
- Contributions
- Learns a forward (canonical-to-deformed) deformation field for neural implicit surfaces, avoiding pose-dependent backward warps
- Finds all canonical correspondences of a deformed point via iterative root finding, handling topology changes
- Derives analytical gradients through implicit differentiation, enabling end-to-end training from posed meshes alone
- Context
- Combines linear blend skinning for polygonal meshes (rooted in SMPL, Loper et al.) with neural implicit surface representations, addressing the data-hungry, pose-dependent nature of prior backward-warp approaches. Builds on: SMPL: A Skinned Multi-Person Linear Model
- Correctness
- The forward-skinning-plus-root-finding design targets better generalization to unseen poses than backward warps; correspondences are implicit and not always unique under topology change, and the iterative root finding adds cost, so test-time pose range and convergence are worth keeping in mind.
- Clarity
- The forward-vs-backward framing is clear; the root-finding scheme and implicit-differentiation gradients need a careful second pass.
- How to read it
- First pass for why forward skinning generalizes better; second and third passes on the root-finding and gradient derivation if you intend to reimplement or build on it.
Skinning / ML Deformation
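The forward-skinning search in the SNARF entry above can be illustrated in 2D: given a deformed point, look for the canonical point whose forward skinning lands on it, starting one search per bone. This is a minimal sketch under stated assumptions, not the paper's implementation: it uses Newton's method with a finite-difference Jacobian where the paper uses its own iterative root finder with analytical gradients, and the weight function, bones, and all function names are hypothetical.

```python
import numpy as np

def rot2d(theta, pivot):
    """3x3 homogeneous transform: rotation by theta about a pivot point."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    T = np.eye(3)
    T[:2, :2] = R
    T[:2, 2] = pivot - R @ pivot
    return T

def weights(x_c):
    """Hypothetical smooth canonical-space weights: bone 1 takes over past x = 1."""
    w1 = 1.0 / (1.0 + np.exp(-4.0 * (x_c[0] - 1.0)))
    return np.array([1.0 - w1, w1])

def forward_skin(x_c, bones):
    """Linear blend skinning with weights evaluated at the canonical point."""
    blended = sum(w * B for w, B in zip(weights(x_c), bones))
    return (blended @ np.append(x_c, 1.0))[:2]

def find_canonical(x_d, bones, x_init, iters=50, tol=1e-10, eps=1e-6):
    """Newton iteration on forward_skin(x) - x_d = 0 from one initial guess."""
    x = np.array(x_init, dtype=float)
    for _ in range(iters):
        f0 = forward_skin(x, bones)
        r = f0 - x_d
        if np.linalg.norm(r) < tol:
            break
        # Finite-difference Jacobian, one column per coordinate direction.
        J = np.column_stack([(forward_skin(x + eps * e, bones) - f0) / eps
                             for e in np.eye(2)])
        x = x - np.linalg.solve(J, r)
    return x

bones = [np.eye(3), rot2d(0.5, np.array([1.0, 0.0]))]
x_c_true = np.array([1.5, 0.2])
x_d = forward_skin(x_c_true, bones)

# One initialization per bone: undo that bone's rigid transform.
candidates = [find_canonical(x_d, bones, (np.linalg.inv(B) @ np.append(x_d, 1.0))[:2])
              for B in bones]
residuals = [float(np.linalg.norm(forward_skin(c, bones) - x_d)) for c in candidates]
```

Running several initializations matters because the correspondence need not be unique; each converged candidate is a canonical point that maps to the same deformed location.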
- SuperTrack: Motion Tracking for Physically Simulated Characters Using Supervisory Signals SIGGRAPH Asia Ubisoft 45 cites
Supervised-learning approach to physics-based motion tracking: the policy is optimized by back-propagation through a learned world model so that simulated characters closely follow diverse reference motion clips.
abstract
In this paper we show how the task of motion tracking for physically simulated characters can be solved using supervised learning and optimizing a policy directly via back-propagation. To achieve this we make use of a world model trained to approximate a specific subset of the environment's transition function, effectively acting as a differentiable physics simulator through which the policy can be optimized to minimize the tracking error. Compared to popular model-free methods of physically simulated character control which primarily make use of Proximal Policy Optimization (PPO) we find direct optimization of the policy via our approach consistently achieves a higher quality of control in a shorter training time, with a reduced sensitivity to the rate of experience gathering, dataset size, and distribution.
Related DReCon: Data-Driven Responsive Control of Physics-Based Characters · DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills · SuperPADL: Scaling Language-Directed Physics-Based Control with Progressive Supervised Distillation · QuestSim: Human Motion Tracking from Sparse Sensors with Simulated Avatars
how to read this
- Category
- Method: learning-based motion tracking for physics-based characters
- Contributions
- Frames physics-based motion tracking as supervised learning, optimizing the control policy directly via back-propagation
- Trains a world model that approximates the environment's transition function, acting as a differentiable physics simulator
- Reports higher control quality in shorter training time with reduced sensitivity to dataset size and experience-gathering rate versus PPO-based methods
- Context
- Positioned against model-free PPO controllers for simulated characters and builds on data-driven responsive control (Bergamin et al.'s DReCon). Builds on: DReCon: Data-Driven Responsive Control of Physics-Based Characters
- Correctness
- Key assumption is that a learned world model is an accurate-enough differentiable proxy of physics for policy gradients; comparisons are made primarily to PPO on motion tracking, so the world-model approximation error and behavior outside the trained motion distribution are the limitations to watch.
- Clarity
- The supervised-versus-model-free framing is clear; the world-model training and differentiable-optimization mechanics need a second pass.
- How to read it
- First pass for the world-model-as-differentiable-simulator idea and why it beats PPO here; second pass on the training loop and loss if you want to reproduce the control quality.
Motion Synthesis
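For the SuperTrack entry above, the core idea (fit a world model from transitions, then optimize the policy by differentiating the tracking error through that model) can be sketched on a toy 1D system. Everything here is an assumption made for illustration: the linear "physics" in `true_step`, the linear world model, the one-parameter proportional policy, and the hand-derived gradient that stands in for back-propagation. It is not the paper's network or training loop.

```python
import random


def true_step(x, u):
    # Stand-in "physics" for this toy; the optimizer never differentiates it.
    return 0.9 * x + 0.5 * u


def fit_world_model(samples):
    """Least-squares fit of x_next ~ a*x + b*u from (x, u, x_next) samples."""
    sxx = sum(x * x for x, u, y in samples)
    sxu = sum(x * u for x, u, y in samples)
    suu = sum(u * u for x, u, y in samples)
    sxy = sum(x * y for x, u, y in samples)
    suy = sum(u * y for x, u, y in samples)
    det = sxx * suu - sxu * sxu
    a = (sxy * suu - suy * sxu) / det
    b = (suy * sxx - sxy * sxu) / det
    return a, b


def loss_and_grad(k, a, b, pairs):
    """Mean squared tracking error through the learned model, and d(loss)/dk."""
    loss = grad = 0.0
    for x, ref in pairs:
        u = k * (ref - x)                  # proportional policy with gain k
        err = a * x + b * u - ref          # predicted next state minus reference
        loss += err * err
        grad += 2.0 * err * b * (ref - x)  # chain rule through the world model
    return loss / len(pairs), grad / len(pairs)


def train_gain(a, b, pairs, steps=200, lr=0.5):
    """Plain gradient descent on the policy gain, using only the world model."""
    k = 0.0
    for _ in range(steps):
        _, g = loss_and_grad(k, a, b, pairs)
        k -= lr * g
    return k


def make_data(seed=0, n=200):
    """Random transitions for the model fit and (state, reference) pairs to track."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        x, u = rng.uniform(-1, 1), rng.uniform(-1, 1)
        samples.append((x, u, true_step(x, u)))
    pairs = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
    return samples, pairs


if __name__ == "__main__":
    samples, pairs = make_data()
    a, b = fit_world_model(samples)
    k = train_gain(a, b, pairs)
    print("model:", a, b, "gain:", k)
```

The point of the sketch is only the data flow: the simulator produces samples, the model is fit by supervised regression, and the policy gradient never touches the simulator itself.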
-
First shipped game neural cloth system predicting jersey deformations and normal-map details from skeleton pose at real-time rates in Madden NFL 21.
abstract
This work presents Swish, a real-time machine-learning based cloth simulation technique for games. Swish was used to generate realistic cloth deformation and wrinkles for NFL player jerseys in Madden NFL 21. To our knowledge, this is the first neural cloth simulation featured in a shipped game. This technique allows accurate high-resolution simulation for tight clothing, which is a case where traditional real-time cloth simulations often achieve poor results. We represent cloth detail using both mesh deformations and a database of normal maps, and train a simple neural network to predict cloth shape from the pose of a character’s skeleton. We share implementation and performance details that will be useful to other practitioners seeking to introduce machine learning into their real-time character pipelines.
Related Stable Spaces for Real-time Clothing · GarMatNet: A Learning-Based Method for Predicting 3D Garment Mesh with Parameterized Materials · PBNS: Physically Based Neural Simulation for Unsupervised Garment Pose Space Deformation · Implementing a Machine Learning Deformer for CG Crowds: Our Journey
how to read this
- Category
- Method / production: real-time neural cloth for games
- Contributions
- A neural network predicting cloth shape from a character's skeleton pose at real-time rates, shipped in Madden NFL 21
- Represents detail with both mesh deformations and a database of normal maps to capture high-resolution wrinkles
- Shares implementation and performance details for introducing ML into real-time character pipelines, targeting tight clothing where traditional real-time cloth struggles
- Context
- Builds on learning-based clothing animation for virtual try-on (Santesteban et al.) and adapts it to a shipped real-time game setting, billed as the first neural cloth simulation in a shipped game. Builds on: Learning-Based Animation of Clothing for Virtual Try-On
- Correctness
- Demonstrated on NFL player jerseys (tight clothing driven by skeleton pose), so results are validated in that domain; the pose-to-shape mapping and normal-map database are tailored to that case, and generalization to loose or free-flowing garments is not the target.
- Clarity
- Practitioner-oriented and accessible; a first pass conveys the architecture and the mesh-plus-normal-map representation.
- How to read it
- First pass for the representation and shipping constraints; revisit the implementation and performance section if you are building a real-time character ML pipeline of your own.
CFX / ML Deformation
-
Pixar resurrects 33,000 legacy set and prop models using USD for reuse across four feature films, three in-production films, and six short-form projects.
abstract
At Pixar we have developed a set of tools to resurrect more than 33,000 previously-unusable set & prop models from our old films to be used as a studio-wide resource for previs, set dressing, cameos, automated testing, shader library development test subjects, short film and streaming projects, VR projects, and research. Based on the extensive use of USD [Disney/Pixar 2016], the Digital Backlot has now been used for four released feature films, three feature films still in production, and six completed short-form projects (with several more under development). Our pipeline also ensures that future asset development will continue to build up this library.
Related Forging a New Animation Pipeline with USD · Universal Scene Description: Open Source Release · Combining the Benefits of Nodes and Layers in a USD World · FIRA: Portable Realtime Rig Deformation
how to read this
- Category
- Production talk / pipeline and asset-reuse system
- Contributions
- A toolset that resurrects more than 33,000 previously-unusable legacy set and prop models into a studio-wide reusable resource
- Uses USD to serve previs, set dressing, cameos, automated testing, shader-library test subjects, and short-form and VR projects
- A pipeline ensuring future asset development keeps building up the shared library, already used across released and in-production features and short-form work
- Context
- Built on Universal Scene Description (Pixar/Disney USD open-source release, 2016) as the interchange and composition foundation for cross-show asset reuse. Builds on: Universal Scene Description: Open Source Release
- Correctness
- Studio practice, not peer-reviewed; the value is production-proven across multiple Pixar films and shorts, but the approach reflects Pixar's specific USD-centric pipeline and legacy-asset constraints rather than a portable, generalized method.
- Clarity
- Accessible and narrative; a single read conveys the motivation and scope without heavy technical depth.
- How to read it
- Read once for how USD enables large-scale legacy-asset revival and the breadth of downstream uses; useful as a reference if you run a USD-based studio pipeline.
Rigging
-
Per-vertex deformations derived from skeletal linear and angular velocities add cartoon squash-and-stretch effects on top of standard skinning in real time.
abstract
Secondary animation effects are essential for liveliness. We propose a simple, real‐time solution for adding them on top of standard skinning, enabling artist‐driven stylization of skeletal motion. Our method takes a standard skeleton animation as input, along with a skin mesh and rig weights. It then derives per‐vertex deformations from the different linear and angular velocities along the skeletal hierarchy. We highlight two specific applications of this general framework, namely the cartoon‐like “squashy” and “floppy” effects, achieved from specific combinations of velocity terms. As our results show, combining these effects enables to mimic, enhance and stylize physical‐looking behaviours within a standard animation pipeline, for arbitrary skinned characters. Interactive on CPU, our method allows for GPU implementation, yielding real‐time performances even on large meshes. Animator control is supported through a simple interface toolkit, enabling to refine the desired type and magnitude of deformation at relevant vertices by simply painting weights. The resulting rigged character automatically responds to new skeletal animation, without further input.
Related Real-Time Skeletal Skinning with Optimized Centers of Rotation · NeuroSkinning: Automatic Skin Binding for Production Characters with Deep Graph Networks · Skinning with Dual Quaternions · Stretchable and Twistable Bones for Skeletal Shape Deformation
how to read this
- Category
- Method: a real-time stylized skinning technique
- Contributions
- Derives per-vertex deformations from linear and angular velocities along the skeletal hierarchy, layered on top of standard skinning
- Realizes cartoon-like squashy and floppy effects from specific combinations of velocity terms
- Supports artist control via weight painting and runs in real time on CPU with a GPU implementation for large meshes
- Context
- Extends standard skeletal skinning pipelines (in the tradition of dual-quaternion skinning, Kavan et al.) to add velocity-driven secondary motion without a physics solver. Builds on: Skinning with Dual Quaternions
- Correctness
- Assumes secondary, stylized motion can be approximated from skeletal velocities rather than physical simulation; demonstrated on arbitrary skinned characters and aimed at liveliness and stylization, so it targets plausible cartoon effects rather than physically accurate dynamics.
- Clarity
- Accessible; a first pass conveys the velocity-to-deformation idea, with a second pass for the exact velocity-term combinations.
- How to read it
- First pass for the squash-and-stretch concept and where it fits in the rig; second pass on the velocity formulation and weight toolkit if implementing it.
Skinning
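For the velocity-driven skinning entry above, a minimal sketch of the ingredient the abstract describes: a per-vertex velocity assembled from a joint's linear and angular velocity, then turned into a painted-weight-scaled offset. The "lag behind the motion" offset and the names `floppy_offset`, `weight`, and `k` are assumptions for illustration; the paper's actual squashy and floppy formulations combine velocity terms in ways not reproduced here.

```python
def cross(a, b):
    """Cross product of two 3D vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def vertex_velocity(p, joint_pos, lin_vel, ang_vel):
    """Rigid-body velocity of point p carried by a joint: v + omega x (p - c)."""
    r = tuple(pi - ci for pi, ci in zip(p, joint_pos))
    spin = cross(ang_vel, r)
    return tuple(v + s for v, s in zip(lin_vel, spin))


def floppy_offset(p, joint_pos, lin_vel, ang_vel, weight, k=0.05):
    """Displace the skinned vertex against its velocity (assumed simplification).

    weight is the artist-painted per-vertex amount in [0, 1];
    k is a hypothetical global strength in seconds.
    """
    v = vertex_velocity(p, joint_pos, lin_vel, ang_vel)
    return tuple(pi - k * weight * vi for pi, vi in zip(p, v))


if __name__ == "__main__":
    # A vertex one unit from a joint that spins about z and translates along x.
    print(floppy_offset((1.0, 0.0, 0.0), (0.0, 0.0, 0.0),
                        (2.0, 0.0, 0.0), (0.0, 0.0, 3.0), weight=1.0))
```

Because the offset depends only on the skeleton's current velocities and the painted weights, it responds to new animation without re-authoring, which matches the behavior the abstract claims.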
-
This paper presents a soft tissue simulation method that replaces indirect volume preservation via Poisson's ratio with direct enforcement of zonal volume constraints, while controlling fine-scale volumetric deformation through a cell-wise compression penalty.
abstract
This paper presents a soft tissue simulation method that replaces indirect volume preservation via Poisson's ratio with direct enforcement of zonal volume constraints, while controlling fine-scale volumetric deformation through a cell-wise compression penalty. To improve realism, it adds an epidermis model that mimics the much higher surface stiffness of real skinned bodies. The approach produces plausible flesh and skin deformation with reliable volume conservation for character animation.
Related Active Volumetric Musculoskeletal Systems · Art-Directed Muscle Simulation for High-End Facial Animation · Building Accurate Physics-based Face Models from Data · Hand Modeling and Simulation Using Stabilized Magnetic Resonance Imaging
how to read this
- Category
- Method: a volume-preserving soft tissue simulation
- Contributions
- Replaces indirect volume preservation via Poisson's ratio with direct enforcement of zonal volume constraints
- Controls fine-scale volumetric deformation with a cell-wise compression penalty
- Adds an epidermis model mimicking the higher surface stiffness of real skinned bodies for more plausible flesh and skin deformation
- Context
- Builds on finite-element flesh simulation for characters (Teran et al.'s robust quasistatic finite elements and flesh simulation), changing how volume conservation is enforced. Builds on: Robust Quasistatic Finite Elements and Flesh Simulation
- Correctness
- Claims reliable volume conservation and plausible flesh-and-skin behavior for character animation; results are demonstrated on character soft tissue, so the constraint and skin-stiffness model should be judged on those examples rather than assumed to transfer to all materials or large deformations.
- Clarity
- The motivation (direct zonal constraints over Poisson's-ratio coupling, plus a skin layer) is clear; the constraint formulation and penalty need a second pass.
- How to read it
- First pass for the volume-preservation and epidermis idea and why it improves realism; second pass on the constraint and compression-penalty math if you implement the solver.
Muscles
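For the zonal volume constraint entry above, a small sketch of what a zone-level constraint measures: the summed volume of a zone's tetrahedra minus its rest volume. The uniform rescale in `rescale_zone` is an assumed, deliberately crude way to drive that constraint to zero; the paper enforces the constraint inside its solver, which is not reproduced here.

```python
def tet_volume(a, b, c, d):
    """Signed volume of a tetrahedron: det[b-a, c-a, d-a] / 6."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    w = [d[i] - a[i] for i in range(3)]
    det = (u[0] * (v[1] * w[2] - v[2] * w[1])
           - u[1] * (v[0] * w[2] - v[2] * w[0])
           + u[2] * (v[0] * w[1] - v[1] * w[0]))
    return det / 6.0


def zone_volume(points, tets):
    """Total volume of one zone, given as a list of 4-index tetrahedra."""
    return sum(tet_volume(*(points[i] for i in tet)) for tet in tets)


def volume_constraint(points, tets, rest_volume):
    """C = V - V0; zero when the zone holds its rest volume."""
    return zone_volume(points, tets) - rest_volume


def rescale_zone(points, tets, rest_volume):
    """Assumed stand-in for a solve: scale the zone about its centroid.

    Requires positively oriented tetrahedra so that V and V0 share a sign.
    """
    ids = sorted({i for tet in tets for i in tet})
    centroid = [sum(points[i][k] for i in ids) / len(ids) for k in range(3)]
    scale = (rest_volume / zone_volume(points, tets)) ** (1.0 / 3.0)
    out = list(points)
    for i in ids:
        out[i] = tuple(centroid[k] + scale * (points[i][k] - centroid[k])
                       for k in range(3))
    return out


if __name__ == "__main__":
    pts = [(0, 0, 0), (2, 0, 0), (0, 2, 0), (0, 0, 2)]
    tets = [(0, 1, 2, 3)]
    print(volume_constraint(pts, tets, 1.0 / 6.0))
    print(volume_constraint(rescale_zone(pts, tets, 1.0 / 6.0), tets, 1.0 / 6.0))
```

Measuring volume per zone rather than per element is what lets individual cells deform while the region as a whole conserves volume, which is why the abstract pairs it with a separate cell-wise compression penalty.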
-
Novel techniques for simulating Southeast Asian wrapped garments (sampot, dhoti, bust-wraps) outside the standard seam-based pattern pipeline in Raya.
abstract
This talk outlines novel techniques used to create the complex wrapped clothing on Walt Disney Animation Studios’ “Raya and the Last Dragon”. Inspired by traditional Southeast Asian designs, these wrapped garments are formed by deftly folding long panels of cloth, with little to no reliance on seams to hold the structure. This departure from a standard pattern-based pipeline made the construction and performance of these specialized garments in CG a very challenging task. Using the sampot, dhoti, and bust-wrap garments as production examples, we describe their real-world counterpart designs and construction, discuss what makes them challenging to create in CG, and then outline how we extrapolated their designs and realized them for the stylistic needs and performances of the characters on the film.
Related Continuum-based Strain Limiting · GPU-Based Simulation of Cloth Wrinkles at Submillimeter Levels · Efficient Simulation of Inextensible Cloth · Adaptive Anisotropic Remeshing for Cloth Simulation
how to read this
- Category
- Production talk / cloth pipeline techniques
- Contributions
- Techniques for simulating Southeast Asian wrapped garments (sampot, dhoti, bust-wrap) that rely on folding rather than seams
- Adapts construction and performance of these garments outside the standard seam-based, pattern pipeline
- Documents how real-world wrap designs were extrapolated and realized for the stylistic and performance needs of Raya's characters
- Context
- Addresses a gap in standard pattern-and-seam cloth workflows (which sit on solvers such as Tamstorf et al.'s smoothed-aggregation multigrid for cloth) for garments held together by folding rather than stitching. Builds on: Smoothed Aggregation Multigrid for Cloth Simulation
- Correctness
- Studio practice, not peer-reviewed; techniques are production-proven on Raya and the Last Dragon and shaped by the film's specific garments and stylistic needs, so they are case-driven solutions rather than a generalized wrapped-cloth method.
- Clarity
- Accessible and example-led around the sampot, dhoti, and bust-wrap; a single read conveys the challenges and the approach.
- How to read it
- Read once for why seam-free wrapped garments break the standard pipeline and the workarounds used; revisit when facing similarly non-pattern-based clothing.
CFX
2020
59-
Represents garment deformations as UV-space images enabling convolutional neural networks to predict pose-dependent clothing deformation efficiently.
abstract
We propose a novel approach to learning cloth deformation as a function of body pose, recasting the graph‐like triangle mesh data structure into image‐based data in order to leverage popular and well‐developed convolutional neural networks (CNNs) in a two‐dimensional Euclidean domain. Then, a three‐dimensional animation of clothing is equivalent to a sequence of two‐dimensional RGB images driven/choreographed by time dependent joint angles. In order to reduce nonlinearity demands on the neural network, we utilize procedural skinning of the body surface to capture much of the rotation/deformation so that the RGB images only contain textures of displacement offsets from skin to clothing. Notably, we illustrate that our approach does not require accurate unclothed body shapes or robust skinning techniques. Additionally, we discuss how standard image based techniques such as image partitioning for higher resolution can readily be incorporated into our framework.
Related Stable Spaces for Real-time Clothing · PBNS: Physically Based Neural Simulation for Unsupervised Garment Pose Space Deformation · Real-Time Hair Simulation with Neural Interpolation · SNUG: Self-Supervised Neural Dynamic Garments
how to read this
- Category
- Method: data-driven, pose-dependent clothing deformation
- Contributions
- Recasting triangle-mesh garment deformation as UV-space RGB images so 2D CNNs can be applied directly
- Procedural body skinning that absorbs most rotation so the images only encode skin-to-clothing displacement offsets, reducing network nonlinearity
- A framework tolerant of inaccurate unclothed body shapes and skinning, with standard image techniques (e.g. partitioning) reused for higher resolution
- Context
- Relates to learning-based clothing animation (referenced Learning-Based Animation of Clothing for Virtual Try-On, Santesteban 2019), reframing the problem in an image-based Euclidean domain. Builds on: Learning-Based Animation of Clothing for Virtual Try-On
- Correctness
- Demonstrated as a clothing animation driven by joint angles; the pixel/UV reformulation assumes deformation maps well to a 2D image domain, so seams, UV distortion, and topology effects are inherent considerations the abstract addresses only partially via partitioning.
- Clarity
- Clearly motivated and intuitive (mesh-to-image analogy); a first pass conveys the representation, and a second pass is needed for the network and displacement-encoding details.
- How to read it
- Focus first on why the UV-image plus skinned-offset representation tames nonlinearity; second pass on the image partitioning and resolution handling if you plan to extend it to new garments or higher detail.
CFX / ML Deformation
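For the UV-image clothing entry above, a sketch of the representation change the abstract describes: per-vertex skin-to-cloth displacement offsets written into an RGB image at each vertex's UV coordinate, so that an image network can consume them. The nearest-texel splat, the 8-bit quantization, and the `lo`/`hi` offset range are assumptions for illustration; a real pipeline would rasterize whole triangles and choose its own encoding.

```python
def _texel(uv, size):
    """Map a UV coordinate in [0, 1] to a clamped (row, col) texel index."""
    col = min(size - 1, max(0, int(uv[0] * size)))
    row = min(size - 1, max(0, int(uv[1] * size)))
    return row, col


def bake_offsets(uvs, offsets, size, lo=-1.0, hi=1.0):
    """Write each vertex's 3D offset into an RGB texel (nearest-texel splat).

    Offsets are clamped to [lo, hi] and quantized to 0..255 per channel.
    Untouched texels stay black.
    """
    image = [[(0, 0, 0) for _ in range(size)] for _ in range(size)]
    for uv, offset in zip(uvs, offsets):
        row, col = _texel(uv, size)
        image[row][col] = tuple(
            round(255 * (min(hi, max(lo, c)) - lo) / (hi - lo)) for c in offset
        )
    return image


def read_offset(image, uv, lo=-1.0, hi=1.0):
    """Decode the offset stored at a UV coordinate back to floats."""
    row, col = _texel(uv, len(image))
    return tuple(lo + (hi - lo) * c / 255 for c in image[row][col])


if __name__ == "__main__":
    uvs = [(0.1, 0.1), (0.8, 0.4)]
    offsets = [(0.02, -0.01, 0.3), (0.25, -0.5, 0.0)]
    img = bake_offsets(uvs, offsets, size=8)
    print(read_offset(img, uvs[1]))
```

Storing only the offset from the skinned body surface, rather than absolute positions, is what keeps the value range small enough for this kind of fixed-range encoding.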
-
While cloth dynamics solvers have advanced substantially, progress in fast and reliable collision handling has lagged behind.
abstract
While cloth dynamics solvers have advanced substantially, progress in fast and reliable collision handling has lagged behind. This work studies the safety, efficiency, and realism of repulsion-based self-collision handling for GPU cloth simulators. The authors identify the necessary vertex-distance conditions for cloth to enter a self-intersection and negate them into vertex-distance constraints that prevent self-collisions while running efficiently on the GPU.
Related Robust Treatment of Collisions, Contact and Friction for a Skinned Cloth Simulation · Robust Treatment of Collisions, Contact and Friction for Cloth Animation · Better Collisions and Faster Cloth for Pixar's Coco · CAMA: Contact-Aware Matrix Assembly with Unified Collision Handling for GPU-based Cloth Simulation
how to read this
- Category
- Method: GPU cloth self-collision handling
- Contributions
- A study of safety, efficiency, and realism for repulsion-based self-collision handling on GPU cloth simulators
- Identification of the necessary vertex-distance conditions for cloth to enter a self-intersection
- Negation of those conditions into vertex-distance constraints that prevent self-collisions while running efficiently on the GPU
- Context
- Builds on robust cloth collision and friction treatment (referenced Bridson et al. 2002), targeting the gap where fast, reliable collision handling has lagged behind advances in cloth dynamics solvers. Builds on: Robust Treatment of Collisions, Contact and Friction for Cloth Animation
- Correctness
- The approach is framed as a safe repulsion method derived from explicit vertex-distance intersection conditions; as a repulsion (not continuous-collision) scheme it prevents entering self-intersection under those conditions, so behavior for already-tangled states or extreme time steps is the natural thing to scrutinize.
- Clarity
- Focused and problem-driven; a first pass conveys the safety argument, and a second pass is needed for the precise vertex-distance conditions and GPU constraint formulation.
- How to read it
- Read the intersection-condition derivation carefully, since the contribution is the conditions-to-constraints negation; a second pass on the GPU implementation pays off if you maintain a cloth solver.
CFX
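For the repulsion-based self-collision entry above, a generic sketch of what a vertex-distance constraint looks like when enforced by projection: any pair of vertices closer than a minimum distance is pushed apart symmetrically. This is a textbook-style minimum-distance projection chosen for illustration. The paper's specific vertex-distance conditions, its safety argument, and its GPU formulation are not reproduced, and `d_min` and the pair list are hypothetical inputs.

```python
import math


def enforce_min_distance(p, q, d_min):
    """Move two points apart symmetrically until they are d_min apart."""
    delta = [q[i] - p[i] for i in range(3)]
    dist = math.sqrt(sum(c * c for c in delta))
    if dist >= d_min:
        return tuple(p), tuple(q)  # constraint already satisfied
    if dist == 0.0:
        normal = (1.0, 0.0, 0.0)   # arbitrary direction for coincident points
    else:
        normal = tuple(c / dist for c in delta)
    push = 0.5 * (d_min - dist)
    new_p = tuple(p[i] - push * normal[i] for i in range(3))
    new_q = tuple(q[i] + push * normal[i] for i in range(3))
    return new_p, new_q


def repel(points, pairs, d_min, iterations=10):
    """Gauss-Seidel style sweeps over candidate vertex pairs."""
    pts = [tuple(p) for p in points]
    for _ in range(iterations):
        for i, j in pairs:
            pts[i], pts[j] = enforce_min_distance(pts[i], pts[j], d_min)
    return pts


if __name__ == "__main__":
    print(repel([(0, 0, 0), (0.1, 0, 0), (0.2, 0, 0)],
                [(0, 1), (1, 2), (0, 2)], d_min=0.5))
```

With several overlapping pairs, one sweep can undo part of another, so repeated sweeps are used; a production solver would fold these constraints into its main iteration instead.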
- Accurate Face Rig Approximation with Deep Differential Subspace Reconstruction SIGGRAPH Industrial 27 cites
Deep network for approximating complex face rig computations using differential subspace reconstruction for real-time face deformation.
abstract
To be suitable for film-quality animation, rigs for character deformation must fulfill a broad set of requirements. They must be able to create highly stylized deformation, allow a wide variety of controls to permit artistic freedom, and accurately reflect the design intent. Facial deformation is especially challenging due to its nonlinearity with respect to the animation controls and its additional precision requirements, which often leads to highly complex face rigs that are not generalizable to other characters. This lack of generality creates a need for approximation methods that encode the deformation in simpler structures. We propose a rig approximation method that addresses these issues by learning localized shape information in differential coordinates and, separately, a subspace for mesh reconstruction. The use of differential coordinates produces a smooth distribution of errors in the resulting deformed surface, while the learned subspace provides constraints that reduce the low frequency error in the reconstruction. Our method can reconstruct both face and body deformations with high fidelity and does not require a set of well-posed animation examples, as we demonstrate with a variety of production characters.
Related FaceBaker: Baking Character Facial Rigs with Machine Learning · Fast and Deep Facial Deformations · Implementing a Machine Learning Deformer for CG Crowds: Our Journey · A Facial Composite Editor for Blendshape Characters
how to read this
- Category
- Method: neural approximation of a film-quality face rig
- Contributions
- A rig approximation that learns localized shape information in differential coordinates
- A separately learned subspace for mesh reconstruction that constrains and reduces low-frequency error
- High-fidelity reconstruction of both face and body deformation without requiring a set of well-posed animation examples
- Context
- Relates to learned rig/deformation approximation (referenced Fast and Deep Deformation Approximations, Bailey 2018), targeting the nonlinearity and precision demands specific to facial rigs. Builds on: Fast and Deep Deformation Approximations
- Correctness
- Differential coordinates are used to smooth the error distribution and the learned subspace to curb low-frequency error; it is an approximation of a complex rig, so output fidelity is bounded by training coverage and the chosen subspace, and per-character generality of the rig itself remains a stated motivation rather than a solved problem.
- Clarity
- Moderately technical; a first pass conveys the differential-coordinates-plus-subspace split, and a second pass is needed for the learning and reconstruction formulation.
- How to read it
- On the first pass focus on why differential coordinates plus a reconstruction subspace address facial nonlinearity and error distribution; second pass on the training setup and error analysis if approximating your own rigs for real-time use.
Facial / ML Deformation
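For the face rig approximation entry above, a sketch of what "differential coordinates" means on a mesh: each vertex is described by its offset from the average of its one-ring neighbors rather than by its absolute position. Uniform neighbor weights are an assumption here; the paper may use a different Laplacian weighting, and its learned reconstruction subspace is not shown.

```python
def differential_coords(positions, neighbors):
    """delta_i = v_i - mean of the one-ring neighbors of vertex i.

    positions: list of 3D tuples.
    neighbors: dict mapping a vertex index to a list of neighbor indices.
    """
    deltas = []
    for i, p in enumerate(positions):
        ring = neighbors[i]
        mean = [sum(positions[j][k] for j in ring) / len(ring) for k in range(3)]
        deltas.append(tuple(p[k] - mean[k] for k in range(3)))
    return deltas


if __name__ == "__main__":
    positions = [(0, 0, 0), (1, 0.5, 0), (2, 0, 0)]
    neighbors = {0: [1], 1: [0, 2], 2: [1]}
    # The middle vertex bulges upward by 0.5 relative to its neighbors.
    print(differential_coords(positions, neighbors)[1])
```

These coordinates encode local shape and ignore global translation, so absolute positions must be recovered by a separate reconstruction step; that is the role the abstract assigns to the learned subspace.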
-
CycleGAN on motion words transfers adult mocap into child motion style without temporal alignment of training sequences.
abstract
Child characters are commonly seen in leading roles in top-selling video games. Previous studies have shown that child motions are perceptually and stylistically different from those of adults. Creating motion for these characters by motion capturing children is uniquely challenging because of confusion, lack of patience and regulations. Retargeting adult motion, which is much easier to record, onto child skeletons, does not capture the stylistic differences. In this paper, we propose that style translation is an effective way to transform adult motion capture data to the style of child motion. Our method is based on CycleGAN, which allows training on a relatively small number of sequences of child and adult motions that do not even need to be temporally aligned. Our adult2child network converts short sequences of motions called motion words from one domain to the other. The network was trained using a motion capture database collected by our team containing 23 locomotion and exercise motions. We conducted a perception study to evaluate the success of style translation algorithms, including our algorithm and recently presented style translation neural networks. Results show that the translated adult motions are recognized as child motions significantly more often than adult motions.
Related Automated Extraction and Parameterization of Motions in Large Data Sets · Autodesk MotionBuilder 2022 · Motion Warping · How to Retarget Motion Capture in MotionBuilder
how to read this
- Category
- Method: a motion style transfer technique (CycleGAN)
- Contributions
- Frames adult-to-child motion conversion as unpaired style translation using a CycleGAN trained on short 'motion words'
- Trains on a relatively small, temporally unaligned set of child and adult mocap (a 23-motion locomotion/exercise database collected by the authors)
- Runs a perception study comparing the method against recent style-translation networks
- Context
- Builds on unpaired neural motion style transfer, notably Aberman et al.'s 'Unpaired Motion Style Transfer from Video to Animation', applying the CycleGAN idea to the adult-versus-child stylistic gap. Builds on: Unpaired Motion Style Transfer from Video to Animation
- Correctness
- Rests on the assumption that adult-child differences are a transferable 'style' and that motion-word translation preserves content; validated mainly through a perceptual study on their own modest database, so generalization beyond locomotion/exercise and to other skeletons is unproven.
- Clarity
- Accessible; a first pass conveys the framing and the perception-study outcome, and a second pass covers the motion-word representation and CycleGAN losses.
- How to read it
- Focus on the motion-word definition and the perception-study design; a second pass is worth it if you care about how unpaired GAN training avoids temporal alignment.
Motion Synthesis / Retargeting
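For the adult-to-child style translation entry above, a sketch of two pieces the abstract names: cutting a motion sequence into short "motion words", and the cycle-consistency idea that lets a CycleGAN train on unaligned data (translate to the other domain and back, then compare with the input). The window length, stride, and the toy translators in the usage section are assumptions; the real translators are trained networks and the full objective also includes adversarial terms.

```python
def motion_words(frames, word_len, stride):
    """Slice a frame sequence into fixed-length, possibly overlapping windows."""
    last_start = len(frames) - word_len
    return [frames[i:i + word_len] for i in range(0, last_start + 1, stride)]


def cycle_loss(word, forward, backward):
    """Mean absolute error between a word and its round-trip reconstruction.

    word: list of frames, each a list of floats (e.g. joint angles).
    forward / backward: callables mapping a word to a word of the same shape.
    """
    recon = backward(forward(word))
    total = 0.0
    count = 0
    for frame, frame_recon in zip(word, recon):
        for value, value_recon in zip(frame, frame_recon):
            total += abs(value - value_recon)
            count += 1
    return total / count


if __name__ == "__main__":
    clip = [[0.1 * t, 0.2 * t] for t in range(12)]  # 12 frames, 2 channels
    words = motion_words(clip, word_len=4, stride=2)

    # Toy stand-ins for the two learned translators (hypothetical).
    def adult_to_child(w):
        return [[1.2 * c for c in f] for f in w]

    def child_to_adult(w):
        return [[c / 1.2 for c in f] for f in w]

    print(len(words), cycle_loss(words[0], adult_to_child, child_to_adult))
```

Because the loss compares each word only with its own round trip, no adult clip ever has to be paired or time-aligned with a child clip.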
-
Feathers are sophisticated skin appendages where many barb curves branch out from a central shaft and interlock via barbules to form a vane.
abstract
Feathers are sophisticated skin appendages where many barb curves branch out from a central shaft and interlock via barbules to form a vane. This work generates pathlines of particles in a velocity field to emulate the helical growth of barbs inside a cylindrical follicle, then applies forward kinematics to mimic the unfurling of the feather after the follicle sheath breaks off. An optional barb-snapping algorithm reproduces the geometric restriction imposed by barbules, enabling feather growth simulation directly in 3D.
Related A Biologically-Parameterized Feather Model · Animating Puss in Boots' Feather in Shrek 2 · Procedurally Generating Biologically Driven Feathers · Microstructure-based Appearance Rendering for Feathers
how to read this
- Category
- Method: a procedural / biological growth model for geometry
- Contributions
- Models feather barbs as particle pathlines in a velocity field to emulate helical growth inside a cylindrical follicle
- Uses forward kinematics to mimic the unfurling of the feather once the follicle sheath breaks off
- Adds an optional barb-snapping algorithm to reproduce the geometric restriction imposed by barbules, enabling growth directly in 3D
- Context
- Extends biologically parameterized feather modeling, in the lineage of Streit and Heidrich's 'A Biologically-Parameterized Feather Model', by simulating morphogenesis rather than fitting a static parametric shape. Builds on: A Biologically-Parameterized Feather Model
- Correctness
- Assumes a morphogenesis-inspired velocity-field plus forward-kinematics process is a faithful proxy for real barb growth; it is a plausibility-driven model demonstrated on generated feathers, so visual biological fidelity rather than measured accuracy is the bar, and integration into a full plumage/grooming pipeline is out of scope.
- Clarity
- Moderately technical; a first pass conveys the growth metaphor, and a second pass is needed for the velocity-field and barb-snapping formulation.
- How to read it
- Focus on the follicle growth analogy and the barb-snapping step; a second pass pays off only if you intend to implement procedural feather geometry.
CFX
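For the feather growth entry above, a sketch of the first step the abstract describes: tracing a particle pathline through a velocity field so that the traced curve winds helically inside a cylinder. The specific field (rigid rotation about the z axis plus a constant rise), the parameter names `omega` and `rise`, and the midpoint integrator are assumptions for illustration; the paper's velocity field, unfurling kinematics, and barb snapping are not reproduced.

```python
import math


def helical_velocity(p, omega, rise):
    """Assumed field: rotation about the z axis plus constant upward speed."""
    x, y, _ = p
    return (-omega * y, omega * x, rise)


def trace_pathline(start, omega, rise, dt, steps):
    """Integrate a particle through the field with the midpoint (RK2) rule."""
    path = [tuple(start)]
    p = tuple(start)
    for _ in range(steps):
        v1 = helical_velocity(p, omega, rise)
        mid = tuple(p[i] + 0.5 * dt * v1[i] for i in range(3))
        v2 = helical_velocity(mid, omega, rise)
        p = tuple(p[i] + dt * v2[i] for i in range(3))
        path.append(p)
    return path


if __name__ == "__main__":
    # One candidate barb curve starting on the follicle wall at radius 1.
    barb = trace_pathline((1.0, 0.0, 0.0), omega=2.0 * math.pi, rise=0.5,
                          dt=0.005, steps=200)
    print(barb[-1])
```

Each barb would start from a different point and time, so a family of such pathlines gives the bundle of helical curves that the later unfurling step operates on.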
-
Hybrid simulation and animation rig for Onward's Dad character using tetrahedral volumes with rest-state deformation and animator-controllable targeting.
abstract
In Pixar’s Onward, the character Dad had an upper half which consisted of a stuffed hoodie, puffy vest, and garden gloves. The arms were floppy, stuffed sleeves able to swing freely while the head was a cinched, stuffed hood topped with a cap and wearing sunglasses. His lower body was rigged and simulated like a typical character. Knowing it was unrealistic to hand-animate the loose swinging arms and squishy upper body for a feature-length project, we developed a hybrid simulation/animation rig using tetrahedral volumes with complex rest state deformation and animatable targeting. This resulted in a robust, iterative workflow where simulation and animation were used together.
Related Scriptable Character FX Solution · Simulating Rapunzel's Hair in Disney's Tangled · Stable Spaces for Real-time Clothing · Directing Cloth Draping through Blended UVs
how to read this
- Category
- Production talk / character breakdown
- Contributions
- Demonstrates a hybrid simulation/animation rig for Pixar Onward's Dad, whose stuffed upper body and floppy arms could not be hand-animated
- Uses tetrahedral volumes with complex rest-state deformation plus animatable targeting to keep simulation under artist control
- Shows an iterative workflow where simulation and animation are used together rather than in sequence
- Context
- Relates to production deformable-body simulation practice, closely tied to Kim and Eberle's 'Dynamic Deformables: Implementation and Production Practicalities' and Pixar's Fizt solver. Builds on: Dynamic Deformables: Implementation and Production Practicalities
- Correctness
- Studio practice, not peer-reviewed; results are production-proven on a single shipped film, so the rest-state and targeting choices are tuned to this specific character rather than evaluated for generality.
- Clarity
- Accessible and example-driven; a single read conveys the workflow, with figures/video carrying most of the insight.
- How to read it
- Read once for the hybrid sim/animation philosophy and the rest-state deformation trick; pair with the Dynamic Deformables course for the underlying math.
CFX / Rigging
-
Learns autoregressive conditional VAEs of human motion whose latent space serves as action space for deep reinforcement learning controllers achieving goal-directed locomotion.
abstract
A fundamental problem in computer animation is that of realizing purposeful and realistic human movement given a sufficiently-rich set of motion capture clips. We learn data-driven generative models of human movement using autoregressive conditional variational autoencoders, or Motion VAEs. The latent variables of the learned autoencoder define the action space for the movement and thereby govern its evolution over time. Planning or control algorithms can then use this action space to generate desired motions. In particular, we use deep reinforcement learning to learn controllers that achieve goal-directed movements. We demonstrate the effectiveness of the approach on multiple tasks. We further evaluate system-design choices and describe the current limitations of Motion VAEs.
Related Interactive Character Control with Auto-Regressive Motion Diffusion Models · DeepPhase: Periodic Autoencoders for Learning Motion Phase Manifolds · C·ASE: Learning Conditional Adversarial Skill Embeddings for Physics-based Characters · A Deep Learning Framework for Character Motion Synthesis and Editing
How to read this
- Category
- Method: a learned generative motion model + RL controller
- Contributions
-
- Learns autoregressive conditional variational autoencoders (Motion VAEs) of human movement from mocap clips
- Uses the VAE latent variables as the action space governing motion evolution over time
- Trains deep reinforcement learning controllers in that action space for goal-directed locomotion, with ablations of system-design choices
- Context
- Builds on data-driven character control, succeeding phase-based approaches such as Holden et al.'s 'Phase-Functioned Neural Networks for Character Control' by replacing handcrafted structure with a learned latent action space. Builds on: Phase-Functioned Neural Networks for Character Control
- Correctness
- Assumes a learned autoregressive latent space is a well-behaved action space for RL and that the source clips are 'sufficiently rich'; demonstrated on multiple tasks with stated limitations, so quality depends on data coverage and the paper is candid about current limits.
- Clarity
- Fairly accessible for a learning paper; a first pass conveys the VAE-as-action-space idea, a second pass is needed for the autoregressive conditioning and RL setup.
- How to read it
- Focus on how the latent action space is defined and fed to RL, plus the limitations section; a second pass is worthwhile for the autoregressive conditioning details.
Motion Synthesis
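The latent-as-action-space idea in the Motion VAE entry above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the pose and latent sizes, the random linear `decode` stand-in for a trained conditional decoder, and the hand-written `policy` are all assumptions made so the loop runs.

```python
import numpy as np

rng = np.random.default_rng(0)
POSE_DIM, LATENT_DIM = 6, 3  # hypothetical sizes, chosen only for the sketch

# Stand-in for a trained conditional decoder: next_pose = f(prev_pose, z).
# Random linear weights are used purely so the example executes.
W_pose = rng.normal(scale=0.1, size=(POSE_DIM, POSE_DIM))
W_latent = rng.normal(scale=0.1, size=(POSE_DIM, LATENT_DIM))

def decode(prev_pose, z):
    """Predict the next pose from the previous pose and a latent action."""
    return prev_pose + W_pose @ prev_pose + W_latent @ z

def policy(pose, goal):
    """Hypothetical controller mapping state and goal to a latent action.
    In the paper this role is played by a controller trained with deep RL."""
    return np.tanh((goal - pose)[:LATENT_DIM])

def rollout(start_pose, goal, steps):
    """Autoregressive rollout: each step feeds the last pose back in."""
    poses = [start_pose]
    for _ in range(steps):
        z = policy(poses[-1], goal)         # the action is a latent variable
        poses.append(decode(poses[-1], z))  # the decoder advances the motion
    return np.stack(poses)

trajectory = rollout(np.zeros(POSE_DIM), np.ones(POSE_DIM), steps=10)
```

The point of the structure is that the controller never outputs joint values directly; it only chooses `z`, and the decoder is responsible for producing the next pose.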
- Cloth and Skin Deformation with a Triangle Mesh Based Convolutional Neural Network CGF Industrial 33 cites
Triangle mesh CNN regresses clothing deformation from character poses and hand skin deformation from joint angles on manifold meshes.
Abstract
We introduce a triangle mesh based convolutional neural network. The proposed network structure can be used for problems where input and/or output are defined on a manifold triangle mesh with or without boundary. We demonstrate its applications in cloth upsampling, adding back details to Principal Component Analysis (PCA) compressed cloth, regressing clothing deformation from character poses, and regressing hand skin deformation from bones' joint angles. The data used for training in this work are generated from high resolution extended position based dynamics (XPBD) physics simulations with small time steps and high iteration counts and from an offline FEM simulator, but it can come from other sources. The inference time of our prototype implementation, depending on the mesh resolution and the network size, can be between 4 and 134 times faster than a GPU based simulator. The inference also only needs to be done for meshes currently visible by the camera.
Related Subspace Neural Physics: Fast Data-Driven Interactive Simulation · XPBD: Position-Based Simulation of Compliant Constrained Dynamics · Wrinkle Meshes · A Pixel-Based Framework for Data-Driven Clothing
How to read this
- Category
- Method: a neural network architecture for mesh-defined data
- Contributions
-
- Introduces a triangle-mesh-based CNN for problems where input and/or output live on a manifold triangle mesh, with or without boundary
- Applies it to cloth upsampling, restoring detail to PCA-compressed cloth, regressing clothing deformation from poses, and hand skin deformation from joint angles
- Trains on XPBD and offline FEM simulation data and reports inference 4 to 134 times faster than a GPU-based simulator, depending on mesh resolution and network size; inference is needed only for camera-visible meshes
- Context
- Sits in the learning-based deformation lineage, related to Santesteban et al.'s 'Learning-Based Animation of Clothing for Virtual Try-On', generalizing such regressors via a mesh-native convolution operator. Builds on: Learning-Based Animation of Clothing for Virtual Try-On
- Correctness
- Assumes the mesh connectivity is fixed/manifold and that simulation-generated training data is representative; demonstrated across several deformation tasks with reported speedups, but accuracy is bounded by the training simulator, and restricting inference to visible meshes is a cost saving rather than part of the accuracy claim.
- Clarity
- Moderately technical; a first pass conveys the applications and speed claims, a second pass is needed for the mesh-convolution operator definition.
- How to read it
- Focus on how convolution is defined on the triangle mesh and which tasks it handles; second pass for the operator and data-generation pipeline if you plan to reuse the architecture.
CFX / ML Deformation / Skinning
-
Neural avatars for telepresence using variational autoencoders to encode facial expressions into compact latent codes for real-time rendering.
Facial / ML Deformation
-
Enriches rig animations with elastodynamic secondary motion in the subspace orthogonal to rig displacements.
Abstract
We present a novel approach to enrich arbitrary rig animations with elastodynamic secondary effects. Unlike previous methods which pit rig displacements and physical forces as adversaries against each other, we advocate that physics should complement artists' intentions. We propose optimizing for elastodynamic displacements in the subspace orthogonal to displacements that can be created by the rig. This ensures that the additional dynamic motions do not undo the rig animation. The complementary space is high-dimensional, algebraically constructed without manual oversight, and capable of rich high-frequency dynamics. Unlike prior tracking methods, we do not require extra painted weights, segmentation into fixed and free regions or tracking clusters. Our method is agnostic to the physical model and plugs into non-linear FEM simulations, geometric as-rigid-as-possible energies, or mass-spring models. Our method does not require a particular type of rig and adds secondary effects to skeletal animations, cage-based deformations, wire deformers, motion capture data, and rigid-body simulations.
Related Rig-Space Physics · Data-Driven Physics for Human Soft Tissue Animation · Pose-Space Subspace Dynamics · SkinMixer: Blending 3D Animated Models
How to read this
- Category
- Method: a secondary-dynamics simulation technique
- Contributions
-
- Adds elastodynamic secondary motion by optimizing displacements in the subspace orthogonal to those the rig can produce, so dynamics never undo the rig
- Constructs this high-dimensional complementary space algebraically without manual weights, segmentation, or tracking clusters
- Works model-agnostically with nonlinear FEM, as-rigid-as-possible energies, or mass-spring models, and across skeletal, cage, wire, mocap, and rigid-body rigs
- Context
- Reframes rig-aware physics, succeeding tracking-style approaches like Hahn et al.'s 'Rig-Space Physics', by making physics complement rather than compete with the artist's rig. Builds on: Rig-Space Physics
- Correctness
- Key assumption is that artist intent lives entirely in the rig subspace so dynamics should be confined to its orthogonal complement; broadly demonstrated across rig and physics-model types, though the result is only as expressive as the chosen rig parameterization and physical energy.
- Clarity
- Conceptually clean but mathematically dense; a first pass conveys the orthogonal-subspace idea, a second pass is needed for the constraint formulation.
- How to read it
- Focus on the orthogonal-complement construction and the 'complement, not compete' framing; a second pass into the optimization is worthwhile if you implement secondary dynamics.
ML Deformation / Muscles
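The orthogonality condition in the secondary-dynamics entry above can be illustrated with a small projection. This is a sketch under stated assumptions: `J` is taken to be a matrix whose columns span the displacements the rig can produce, `M` is a mass matrix, and orthogonality is measured in the mass-weighted inner product. The paper optimizes an elastodynamic energy subject to the constraint; the code only shows what satisfying the constraint means, by projecting an arbitrary displacement after the fact.

```python
import numpy as np

def project_to_complement(u, J, M):
    """Remove from u the part reproducible by the rig (the columns of J),
    using the mass-weighted inner product <a, b> = a^T M b."""
    JtM = J.T @ M
    coeffs = np.linalg.solve(JtM @ J, JtM @ u)  # best rig-space fit to u
    return u - J @ coeffs                       # what the rig cannot express

rng = np.random.default_rng(1)
n, m = 12, 3                                # hypothetical DOF and rig-parameter counts
J = rng.normal(size=(n, m))                 # stand-in rig Jacobian
M = np.diag(rng.uniform(0.5, 2.0, size=n))  # stand-in lumped mass matrix
u = rng.normal(size=n)                      # arbitrary displacement

u_c = project_to_complement(u, J, M)
residual = J.T @ M @ u_c                    # zero up to round-off
```

Because `residual` vanishes, adding `u_c` on top of the rig's output cannot cancel any motion the rig itself produced, which is the "complement, not compete" property the entry describes.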
-
CUDA implementation of vertex-position computation for model-reduced deformable simulation, with a GPU-friendly memory layout for real-time use.
Abstract
Real-time deformable object simulation is important in interactive applications such as games and virtual reality. One common approach to achieve speed is to employ model reduction, a technique whereby the equations of motion of a deformable object are projected to a suitable low-dimensional space. Improving the real-time performance of model-reduced systems has been the subject of much research. While modern GPUs play an important role in real-time simulation and parallel computing, existing model reduction systems typically utilize CPUs and seldom employ GPUs. We give a method to efficiently employ GPUs for vertex position computation in model-reduced simulations. Our CUDA-based algorithm gives a substantial speedup compared to a CPU implementation, thanks to our system architecture that employs a memory layout friendly to GPU memory, reduces the communication between the CPU and GPU, and enables the CPU and GPU to work in parallel.
Related Pose-Space Subspace Dynamics · Compression and Direct Manipulation of Complex Blendshape Models · FEM Simulation of 3D Deformable Solids: A Practitioner's Guide to Theory, Discretization and Model Reduction · Stable Spaces for Real-time Clothing
How to read this
- Category
- Method / systems: a GPU implementation for reduced deformable simulation
- Contributions
-
- Gives a CUDA method to efficiently compute vertex positions in model-reduced deformable simulations on the GPU
- Designs a system architecture with a GPU-friendly memory layout that reduces CPU-GPU communication and lets CPU and GPU run in parallel
- Reports a substantial speedup over a CPU implementation for real-time deformable bodies
- Context
- Builds on model-reduction for deformable simulation, in the lineage of Fulton et al.'s 'Latent-space Dynamics for Reduced Deformable Simulation', addressing the underused GPU side of such systems. Builds on: Latent-space Dynamics for Reduced Deformable Simulation
- Correctness
- Assumes a model-reduced (low-dimensional subspace) formulation is acceptable for the target objects; the contribution is an engineering speedup measured against a CPU baseline, so gains depend on mesh resolution, basis size, and the chosen reduction quality rather than improved physical accuracy.
- Clarity
- Implementation-focused; a first pass conveys the architecture and where the speedup comes from, with details in the memory-layout discussion.
- How to read it
- Focus on the memory layout and CPU-GPU work split; read closely only if you are porting reduced simulation to CUDA, otherwise a first pass suffices.
ML Deformation
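The vertex position computation that the GPU entry above accelerates is, at its core, a dense matrix-vector product. The sketch below shows that step on the CPU with NumPy, only to make the data flow concrete; the names `rest`, `U`, and `q` and the array shapes are assumptions, and nothing here reproduces the paper's CUDA kernels or memory layout.

```python
import numpy as np

def reconstruct_positions(rest, U, q):
    """Full-space positions from reduced coordinates.

    rest: (n, 3) rest-pose vertex positions
    U:    (3n, r) reduction basis, one column per reduced degree of freedom
    q:    (r,) reduced coordinates produced by the simulator each frame
    """
    displacement = (U @ q).reshape(-1, 3)  # the step worth moving to the GPU
    return rest + displacement

rng = np.random.default_rng(2)
rest = np.zeros((4, 3))          # hypothetical 4-vertex mesh
U = rng.normal(size=(12, 2))     # hypothetical 2-mode basis
q = np.array([0.5, -1.0])
positions = reconstruct_positions(rest, U, q)
```

Only `q` changes per frame while `U` and `rest` stay fixed, which is why keeping the basis resident in GPU memory and sending just the small reduced vector can cut CPU-GPU traffic.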
- Data-driven Extraction and Composition of Secondary Dynamics in Facial Performance Capture SIGGRAPH Disney Research 3 cites
Data-driven method to extract and separately compose secondary dynamic effects from facial performance capture for enhanced realism.
Abstract
Performance capture of expressive subjects, particularly facial performances acquired with high spatial resolution, will inevitably incorporate some fraction of motion that is due to inertial effects and dynamic overshoot due to ballistic motion. This is true in most natural capture environments where the actor is able to move freely during their performance, rather than being tethered to a fixed position. Normally these secondary dynamic effects are unwanted, as the captured facial performance is often retargeted to different head motion, and sometimes to completely different characters, and in both cases the captured dynamic effects should be removed and new secondary effects should be added. This paper advances the hypothesis that for a highly constrained elastic medium such as the human face, these secondary inertial effects are predominantly due to the motion of the underlying bony structures (cranium and mandible). Our work aims to compute and characterize the difference between the captured dynamic facial performance, and a speculative quasistatic variant of the same motion should the inertial effects have been absent.
Related Semi-Supervised Video-Driven Facial Animation Transfer for Production · BlendForces: A Dynamic Framework for Facial Animation · FaceLab: Scalable Facial Performance Capture for Visual Effects · FaceBaker: Baking Character Facial Rigs with Machine Learning
How to read this
- Category
- Method: data-driven analysis of facial performance capture
- Contributions
-
- Extracts unwanted secondary inertial/overshoot dynamics from captured facial performances so they can be removed before retargeting
- Advances the hypothesis that facial secondary inertia is predominantly driven by motion of the cranium and mandible
- Computes the difference between the captured dynamic performance and a speculative quasistatic variant, enabling separate composition of new secondary effects
- Context
- Builds on anatomically grounded facial rigging, related to Zoss et al.'s 'An Empirical Rig for Jaw Animation', extending it from static jaw structure to dynamic inertial effects. Builds on: An Empirical Rig for Jaw Animation
- Correctness
- Central assumption is that the face is a constrained elastic medium whose secondary motion is dominated by bony-structure (cranium/mandible) motion; validated on captured performances, but the bone-driven hypothesis may under-model soft-tissue-driven dynamics and depends on accurate skull/jaw tracking.
- Clarity
- Specialized; a first pass conveys the extract-and-recompose hypothesis, a second pass is needed for the quasistatic-difference formulation.
- How to read it
- Focus on the cranium/mandible hypothesis and the quasistatic-versus-captured decomposition; second pass if you work on retargeting or dynamic facial realism.
Facial
-
Course covering Pixar's Fizt simulator improvements for speed, robustness, and generality deployed across Coco, Cars 3, and Onward.
Abstract
Simulating dynamic deformation has been an integral component of Pixar's storytelling since Boo's shirt in Monsters, Inc. (2001). Recently, several key transformations have been applied to Pixar's core simulator Fizt that improve its speed, robustness, and generality. Starting with Coco (2017), improved collision detection and response were incorporated into the cloth solver, then with Cars 3 (2017) 3D solids were introduced, and in Onward (2020) clothing is allowed to interact with a character's body with two-way coupling. The 3D solids are based on a fast, compact, and powerful new formulation that we have published over the last few years at SIGGRAPH. Under this formulation, the construction and eigendecomposition of the force gradient, long considered the most onerous part of the implementation, becomes fast and simple. We provide a detailed, self-contained, and unified treatment here that is not available in the technical papers. This new formulation is only a starting point for creating a simulator that is up to the challenges of a production environment. One challenge is performance: we discuss our current best practices for accelerating system assembly and solver performance. Another challenge that requires considerable attention is robust collision detection and response.
Related Nonlinear Cloth Simulation with Isogeometric Analysis · Clean Cloth Inputs: Removing Character Self-Intersections with Volume Simulation · Finding Hank · Robust Treatment of Collisions, Contact and Friction for Cloth Animation
How to read this
- Category
- Course / production practicalities (SIGGRAPH Courses)
- Contributions
-
- Documents improvements to Pixar's Fizt simulator for speed, robustness, and generality across Coco, Cars 3, and Onward
- Gives a detailed, self-contained, unified treatment of a new 3D-solids formulation that makes force-gradient construction and eigendecomposition fast and simple
- Shares best practices for accelerating system assembly and solver performance in a production environment
- Context
- Consolidates a line of SIGGRAPH elasticity work for character simulation, related to McAdams et al.'s 'Efficient Elasticity for Character Skinning with Contact and Collisions', into production-oriented guidance. Builds on: Efficient Elasticity for Character Skinning with Contact and Collisions
- Correctness
- Studio-grounded and pedagogical rather than a new evaluated result; the formulation is mathematically derived and the practices are production-proven across named films, so claims are about implementability and shipped robustness rather than benchmarked superiority.
- Clarity
- Highly accessible as a course; the self-contained derivation is its main value and rewards a careful read more than a paper would.
- How to read it
- Treat as a reference: read the eigendecomposition/force-gradient treatment closely, then dip into the performance best-practices sections as needed.
CFX / ML Deformation
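One reason the eigendecomposition of the force gradient matters in the course entry above is that an implicit solver generally wants a positive semi-definite system matrix, and clamping negative eigenvalues is a common way to get one. The sketch below shows that clamping with a generic numerical eigensolver. It is an assumption-laden stand-in: the course derives closed-form eigensystems for specific energies, whereas this code calls `numpy.linalg.eigh` on a made-up 3x3 matrix.

```python
import numpy as np

def project_psd(H):
    """Clamp negative eigenvalues of a symmetric matrix to zero."""
    H_sym = 0.5 * (H + H.T)           # guard against round-off asymmetry
    w, V = np.linalg.eigh(H_sym)      # eigenvalues w, eigenvectors in columns of V
    return (V * np.maximum(w, 0.0)) @ V.T

# Hypothetical per-element energy Hessian with one negative direction.
H = np.array([[2.0,  0.0, 0.0],
              [0.0, -1.0, 0.0],
              [0.0,  0.0, 0.5]])
H_psd = project_psd(H)
```

With analytic eigenpairs available, the same clamp can be applied per element without a numerical eigensolve, which is where the speed and simplicity claimed in the abstract would come from.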
-
Generates personalized blendshapes, physically-based textures, and full facial rig from a single face scan automatically.
Abstract
The creation of high-fidelity computer-generated (CG) characters for films and games is tied with intensive manual labor, which involves the creation of comprehensive facial assets that are often captured using complex hardware. To simplify and accelerate this digitization process, we propose a framework for the automatic generation of high-quality dynamic facial models, including rigs which can be readily deployed for artists to polish. Our framework takes a single scan as input to generate a set of personalized blendshapes, dynamic textures, as well as secondary facial components (e.g., teeth and eyeballs). Based on a facial database with over 4,000 scans with pore-level details, varying expressions and identities, we adopt a self-supervised neural network to learn personalized blendshapes from a set of template expressions. We also model the joint distribution between identities and expressions, enabling the inference of a full set of personalized blendshapes with dynamic appearances from a single neutral input scan. Our generated personalized face rig assets are seamlessly compatible with professional production pipelines for facial animation and rendering. We demonstrate a highly robust and effective framework on a wide range of subjects, and showcase high-fidelity facial animations with automatically generated personalized dynamic textures.
Related Creating an Actor-Specific Facial Rig from Performance Capture · Neural Face Rigging for Animating and Retargeting Facial Meshes in the Wild · RigAnyFace: Scaling Neural Facial Mesh Auto-Rigging with Unlabeled Data · Building and Animating User-Specific Volumetric Face Rigs
How to read this
- Category
- Method: automatic facial asset and rig generation
- Contributions
-
- Generates a full facial rig (personalized blendshapes, dynamic physically-based textures, and secondary components like teeth and eyeballs) from a single neutral scan
- Uses a self-supervised network trained on a 4,000+ scan pore-level database to learn personalized blendshapes from template expressions
- Models the joint distribution of identities and expressions, and outputs assets compatible with professional production pipelines
- Context
- Builds on automatic anatomical face modeling, related to Cong et al.'s 'Fully Automatic Generation of Anatomical Face Simulation Models', shifting from simulation-model fitting to learned single-scan rig generation. Builds on: Fully Automatic Generation of Anatomical Face Simulation Models
- Correctness
- Assumes a single neutral scan plus a large curated scan database suffice to infer plausible personalized expression rigs; demonstrated as a high-fidelity pipeline, but output identity/expression range is bounded by the training database's demographics and the assets are meant as an artist-polishable starting point rather than final.
- Clarity
- Readable systems-style write-up; a first pass conveys the single-scan-to-rig pipeline, a second pass is needed for the self-supervised blendshape learning and joint identity-expression model.
- How to read it
- Focus on the input/output of each stage and where the 4,000-scan database constrains results; second pass for the network if you build face-rig automation.
Facial / Rigging
- Expression Packing: As-Few-As-Possible Training Expressions for Blendshape Transfer CGF Academic 12 cites
Integer optimization selects as few scanning poses as possible, with minimal overlap between the blendshapes each pose activates, for efficient blendshape transfer.
Abstract
To simplify and accelerate the creation of blendshape rigs, using a template rig is a common procedure, especially during the creation of digital doubles. Blendshape transfer methods facilitate copy and paste functionality of the blendshapes from the template model to the digital double. However, for adequate personalization, such methods require a set of scanned training expressions of the original actor. So far, the semantics of the facial expressions to scan have been defined manually. In contrast, we formulate the semantics of the facial expressions as an integer optimization of the blendshape weights. By combining different blendshapes of the template model, our method creates facial expressions that serve as semantic references during scanning. Our method guarantees to compute as‐few‐as‐possible training expressions with minimal overlap of activated blendshapes. If the number of training expressions is limited, blendshapes are selected based on their power to personalize the resulting blendshapes compared to generic blendshape transfer methods.
Related Smooth Contact-Aware Facial Blendshapes Transfer · Creating an Actor-Specific Facial Rig from Performance Capture · Transferring the Rig and Animations from a Character to Different Face Models · Reusable Facial Rigging and Animation: Create Once, Use Many
How to read this
- Category
- Method: optimization for blendshape-rig authoring
- Contributions
-
- Formulates the choice of training expressions to scan as an integer optimization over blendshape weights
- Combines template blendshapes into semantic reference expressions with minimal overlap of activated shapes
- Computes as-few-as-possible scanning poses, ranking shapes by personalization power when the budget is limited
- Context
- Sits in the blendshape-transfer and digital-double literature, replacing the manual definition of scan poses, and relates to direct blendshape control as in Lewis and Anjyo's Direct Manipulation Blendshapes. Builds on: Direct Manipulation Blendshapes
- Correctness
- The method targets efficient blendshape transfer from a template rig and assumes the template's shapes meaningfully span the actor's expressions; readers should note that minimizing scan count trades coverage for capture cost, so personalization quality depends on the chosen budget.
- Clarity
- Accessible at a first pass for the goal; the integer-programming formulation rewards a second pass.
- How to read it
- First pass for the problem framing (why fewer scans matters); do a second pass on the integer-optimization setup and the overlap/personalization objective if you plan to implement or compare selection strategies.
Facial
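The packing problem in the Expression Packing entry above can be pictured with a toy version: group blendshapes into as few scan expressions as possible so that no two overlapping shapes share an expression. The brute-force search below is an illustrative stand-in, not the paper's integer program. It treats overlap as a hard constraint, takes a hand-written `overlaps` list as input, ignores the limited-budget case, and only scales to a handful of shapes.

```python
from itertools import product

def pack_expressions(num_shapes, overlaps):
    """Return a smallest grouping of shapes such that no overlapping
    pair lands in the same group (exhaustive search, toy sizes only)."""
    for k in range(1, num_shapes + 1):           # try 1 group, then 2, ...
        for assignment in product(range(k), repeat=num_shapes):
            if all(assignment[a] != assignment[b] for a, b in overlaps):
                groups = [[] for _ in range(k)]
                for shape, g in enumerate(assignment):
                    groups[g].append(shape)
                return [g for g in groups if g]
    return [[s] for s in range(num_shapes)]      # unreachable fallback

# Hypothetical template: shapes 0, 1, 2 all touch the mouth region,
# shapes 3 and 4 both move the same eyebrow.
overlaps = [(0, 1), (1, 2), (0, 2), (3, 4)]
expressions = pack_expressions(5, overlaps)
```

Here three mutually overlapping shapes force at least three expressions, and the two brow shapes are slotted into groups that already exist, so five shapes are covered by three scans.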
-
NetEase presents a differentiable neural renderer that translates a single face photo into RPG character rig parameters, enabling automatic character creation from real faces.
Facial / ML Deformation
-
Machine learning method that approximates complex facial rig deformations, reducing evaluation cost and enabling portability of proprietary Pixar rigs.
Abstract
Character rigs are procedural systems that deform a character’s shape driven by a set of rig-control variables. Film quality character rigs are highly complex and therefore computationally expensive and slow to evaluate. We present a machine learning method for approximating facial mesh deformations which reduces rig computations, increases longevity of characters without rig upkeep, and enables portability of proprietary rigs into a variety of external platforms. We perform qualitative and quantitative evaluations on hero characters across several feature films, exhibiting the speed and generality of our approach and demonstrating that our method outperforms existing state-of-the-art work on deformation approximations for character faces.
Related Accurate Face Rig Approximation with Deep Differential Subspace Reconstruction · FaceLab: Scalable Facial Performance Capture for Visual Effects · Fast and Deep Facial Deformations · Semi-Supervised Video-Driven Facial Animation Transfer for Production
How to read this
- Category
- Method: machine-learning rig approximation
- Contributions
-
- A learning method that approximates film-quality facial mesh deformations to cut rig evaluation cost
- Increases character longevity without rig upkeep and enables porting proprietary rigs to external platforms
- Qualitative and quantitative evaluation on hero characters across several feature films
- Context
- Continues the line of neural deformation approximation for production faces, explicitly building on Bailey et al.'s Fast and Deep Deformation Approximations (FDDA). Builds on: Fast and Deep Deformation Approximations
- Correctness
- Validated on Pixar hero characters and reported to outperform prior deformation-approximation work, but it is an approximation of a specific class of proprietary rigs, so fidelity and generality outside the tested characters should not be assumed.
- Clarity
- Accessible; a first pass conveys the motivation and pipeline, with a second pass for the network and training details.
- How to read it
- First pass for why baking a rig helps (speed, portability, longevity); second pass on the network design and the qualitative/quantitative comparisons versus FDDA if you need to judge accuracy.
Facial / ML Deformation
-
Scalable facial performance capture system for VFX using deep learning to track and solve detailed facial motion at production scale.
Facial
- FaceScape: A Large-Scale High Quality 3D Face Dataset and Detailed Riggable 3D Face Prediction CVPR Academic 366 cites
18,760 pore-level textured 3D faces from 938 subjects with 20 expressions, with a riggable 3D face prediction method from single images.
Abstract
In this paper, we present a large-scale detailed 3D face dataset, FaceScape, and propose a novel algorithm that is able to predict elaborate riggable 3D face models from a single image input. FaceScape dataset provides 18,760 textured 3D faces, captured from 938 subjects and each with 20 specific expressions. The 3D models contain the pore-level facial geometry that is also processed to be topologically uniformed. These fine 3D facial models can be represented as a 3D morphable model for rough shapes and displacement maps for detailed geometry. Taking advantage of the large-scale and high-accuracy dataset, a novel algorithm is further proposed to learn the expression-specific dynamic details using a deep neural network. The learned relationship serves as the foundation of our 3D face prediction system from a single image input. Different than the previous methods, our predicted 3D models are riggable with highly detailed geometry under different expressions. The unprecedented dataset and code will be released to public for research purpose.
Related Real-Time High-Fidelity Facial Performance Capture · Single-Shot High-Quality Facial Geometry and Skin Appearance Capture · Advances for Digital Humans in VFX Production at Goodbye Kansas Studios · Driving High-Resolution Facial Scans with Video Performance Capture
How to read this
- Category
- Dataset plus method: 3D face dataset and riggable prediction
- Contributions
-
- FaceScape, a large dataset of pore-level textured, topologically uniform 3D faces (18,760 models, 938 subjects, 20 expressions)
- A representation as a 3D morphable model for coarse shape plus displacement maps for fine detail
- A network that learns expression-specific dynamic details to predict a riggable, detailed 3D face from a single image
- Context
- Extends the tradition of 3D facial expression databases and morphable-model fitting, positioned relative to Cao et al.'s FaceWarehouse. Builds on: FaceWarehouse: A 3D Facial Expression Database for Visual Computing
- Correctness
- The riggable single-image prediction rests on the dataset's scale and accuracy and on the coarse-shape-plus-displacement split; readers should remember that subject diversity is bounded by the 938 captured individuals and that single-image inference of fine detail is an inferred reconstruction, not a measurement.
- Clarity
- Accessible; the dataset and goals read clearly on a first pass, the detail-learning network needs a second pass.
- How to read it
- First pass to assess dataset scope and licensing for your use; second pass on the displacement-map detail learning and the single-image prediction pipeline if you intend to fit or train on it.
Facial
-
Global-local multilinear model synthesizes identity-preserving facial expressions that extrapolate well beyond the training data pool.
Abstract
We present a practical method to synthesize plausible 3D facial expressions for a particular target subject. The ability to synthesize an entire facial rig from a single neutral expression has a large range of applications both in computer graphics and computer vision, ranging from the efficient and cost‐effective creation of CG characters to scalable data generation for machine learning purposes. Unlike previous methods based on multilinear models, the proposed approach is capable to extrapolate well outside the sample pool, which allows it to plausibly predict the identity of the target subject and create artifact free expression shapes while requiring only a small input dataset. We introduce global‐local multilinear models that leverage the strengths of expression‐specific and identity‐specific local models combined with coarse motion estimations from a global model. Experimental results show that we achieve high‐quality, plausible facial expression synthesis results for an individual that outperform existing methods both quantitatively and qualitatively.
Related FaceWarehouse: A 3D Facial Expression Database for Visual Computing · FaceLab: Scalable Facial Performance Capture for Visual Effects · Transferring Facial Expressions to Different Face Models · Smooth Contact-Aware Facial Blendshapes Transfer
How to read this
- Category
- Method: facial expression synthesis from a multilinear model
- Contributions
-
- Synthesizes a plausible 3D facial rig for a target subject from a single neutral expression
- Introduces global-local multilinear models combining expression- and identity-specific local models with a coarse global motion estimate
- Extrapolates beyond the training sample pool while requiring only a small input dataset
- Context
- Builds directly on the multilinear face-modeling tradition, notably Vlasic et al.'s Face Transfer with Multilinear Models, while addressing that family's poor extrapolation. Builds on: Face Transfer with Multilinear Models
- Correctness
- Reported to outperform prior multilinear methods quantitatively and qualitatively, but claims rest on producing plausible, identity-preserving shapes from limited input; readers should treat the synthesized rig as a plausible prediction rather than a captured ground truth and watch how far extrapolation holds.
- Clarity
- Accessible; a first pass conveys the global-local idea, a second pass clarifies the tensor formulation.
- How to read it
- First pass for the global-local decomposition intuition; second pass on the multilinear math and the extrapolation experiments if comparing against standard multilinear baselines.
Facial
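For readers new to the multilinear tradition that the global-local entry above builds on, the baseline model is a tensor contracted with an identity weight vector and an expression weight vector. The sketch below evaluates such a standard multilinear model; it does not implement the paper's global-local combination, and the tensor sizes and random `core` values are placeholders.

```python
import numpy as np

def evaluate_multilinear(core, w_id, w_exp):
    """Contract a core tensor with identity and expression weights.

    core:  (3V, n_id, n_exp) stacked vertex coordinates per identity/expression
    w_id:  (n_id,) identity weights
    w_exp: (n_exp,) expression weights
    """
    return np.einsum("vie,i,e->v", core, w_id, w_exp)

rng = np.random.default_rng(3)
core = rng.normal(size=(9, 4, 5))        # hypothetical: 3 vertices, 4 ids, 5 expressions

w_id = np.zeros(4); w_id[2] = 1.0        # select identity 2
w_exp = np.zeros(5); w_exp[1] = 1.0      # select expression 1
face = evaluate_multilinear(core, w_id, w_exp)
```

One-hot weights simply read back a stored sample, while fractional weights blend between samples. The extrapolation weakness the entry mentions arises when a target subject requires weights far outside what the samples span.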
-
Fast neural network deformer for production facial rigs approximating complex corrective shapes with real-time performance.
Abstract
Film-quality characters typically display highly complex and expressive facial deformation. The underlying rigs used to animate the deformations of a character's face are often computationally expensive, requiring high-end hardware to deform the mesh at interactive rates. In this paper, we present a method using convolutional neural networks for approximating the mesh deformations of characters' faces. For the models we tested, our approximation runs up to 17 times faster than the original facial rig while still maintaining a high level of fidelity to the original rig. We also propose an extension to the approximation for handling high-frequency deformations such as fine skin wrinkles. While the implementation of the original animation rig depends on an extensive set of proprietary libraries making it difficult to install outside of an in-house development environment, our fast approximation relies on the widely available and easily deployed TensorFlow libraries. In addition to allowing high frame rate evaluation on modest hardware and in a wide range of computing environments, the large speed increase also enables interactive inverse kinematics on the animation rig. We demonstrate our approach and its applicability through interactive character posing and real-time facial performance capture.
Related Fast and Deep Deformation Approximations · Accurate Face Rig Approximation with Deep Differential Subspace Reconstruction · Production-Level Facial Performance Capture Using Deep Convolutional Neural Networks · Direct Manipulation Blendshapes
how to read this
- Category
- Method: neural deformer for production facial rigs
- Contributions
- A convolutional-network approximation of facial mesh deformations running far faster than the original rig at high fidelity
- An extension that recovers high-frequency deformations such as fine skin wrinkles
- A TensorFlow-based, easily deployed approximation that enables interactive inverse kinematics on the animation rig
- Context
- Extends Bailey et al.'s Fast and Deep Deformation Approximations to faces and is informed by production deformation systems such as the DreamWorks facial motion and deformation work. Builds on: Fast and Deep Deformation Approximations · DreamWorks Animation Facial Motion and Deformation System
- Correctness
- Speedups and fidelity are demonstrated on the specific rigs tested, so the reported up-to-17x figure and quality are per-model; as with any learned approximation, behavior outside the sampled rig-control space and on unseen characters is not guaranteed.
- Clarity
- Accessible; a first pass conveys the speed/portability payoff, a second pass covers the CNN architecture and the wrinkle extension.
- How to read it
- First pass for the motivation (interactive evaluation off proprietary libraries); second pass on the network, the high-frequency extension, and the IK use case if you need real-time rig evaluation.
Facial / ML Deformation
-
Framestore's highly parallel dynamics solver for fur and feathers that reduced manual post-simulation fixing by 80% in creature productions.
abstract
Framestore has been producing award winning creature effects for over 20 years, with hair, fur and feathers being crucial elements of these creatures’ visual fidelity. Simulating how these elements interact with other geometry, wind, cloth and media of varying viscosity across many hundreds of shots in a film is a time consuming and laborious process, typically requiring many refinement iterations to achieve the desired result. In this talk, we present Fibre, a stable, robust and highly parallel dynamics solver designed to help maximize production efficiency. With Fibre integrated into its proprietary fur pipeline, Framestore has been able to reduce manual post-simulation fixing by 80% and reduced the simulation time for fur and feathers by up to 50% and 80%, respectively.
Related Nonlinear Cloth Simulation with Isogeometric Analysis · Creatures in Houdini | Ahmed Gharraph | FMX 2019 · Framestore Creatures & Houdini | Framestore | Character FX & Crowds Production Talks · Untangling Cloth
how to read this
- Category
- Production talk: parallel hair/fur dynamics solver
- Contributions
- Demonstrates Fibre, Framestore's stable, robust, highly parallel dynamics solver for fur and feathers
- Shows integration into a proprietary fur pipeline that cut manual post-simulation fixing by 80%
- Reports simulation-time reductions of up to 50% for fur and 80% for feathers
- Context
- Sits in the creature-effects hair-and-fur dynamics tradition, conceptually downstream of mass-spring hair models such as Selle et al.'s hair simulation work. Builds on: A Mass Spring Model for Hair Simulation
- Correctness
- Studio practice, not peer-reviewed; the efficiency gains are reported from Framestore creature shows, and the quoted percentages are internal pipeline metrics rather than controlled benchmarks.
- Clarity
- Accessible; written as a talk, a single pass conveys the design goals and the production wins.
- How to read it
- Read once for the pipeline-integration lessons and the parallelization strategy; revisit only if you are designing a comparable in-house groom-dynamics solver.
CFX
-
DNEG documents how Avengers: Endgame and Togo drove a full redesign of their Furball fur authoring pipeline across five standardized stages.
abstract
DNEG’s in-house fur software, Furball, has been in continuous production use since 2012. During this time it has undergone significant evolution to adapt to the changing needs from production. We discuss how recent work on films such as Avengers: Endgame and Togo has led to a complete shift in the focus of our fur tools. This has helped us scale up to meet the requirements of ever more fur-intensive shows, while also opening up exciting opportunities for future development.
Related XGen: Arbitrary Primitive Generator · Pseudo-Collisions: A Method for Preventing Fur-Skin Intersections Without Physical Simulation · Embroidery and Cloth Fiber Workflows on Disney's Encanto · Wig Refitting in Pixar's Inside Out 2
how to read this
- Category
- Production talk: fur pipeline evolution
- Contributions
- Documents how productions such as Avengers: Endgame and Togo drove a redesign of DNEG's Furball fur software
- Describes a shift in the focus of the fur tools toward more fur-intensive shows
- Frames standardized stages that let the pipeline scale and open future development opportunities
- Context
- Continues DNEG's long-running in-production fur tooling (Furball in use since 2012) within the broader creature-grooming lineage, akin to earlier grooming-pipeline talks such as the Hercules lion work. Builds on: Grooming a Lion for Hercules
- Correctness
- Studio practice, not peer-reviewed; lessons are production-proven on specific shows and reflect one studio's tooling choices rather than a generalizable or measured method.
- Clarity
- Accessible; a talk-style retrospective readable in one pass.
- How to read it
- Read once for the pipeline-evolution narrative and the rationale behind the redesigned stages; useful as context for grooming-tool architecture, no deep second pass needed.
CFX
-
Production techniques for a wide range of passive and active hair effects across diverse musical troll tribes in Trolls World Tour.
abstract
The world of hair in DreamWorks’ film Trolls World Tour got much bigger than in the first film Trolls [Missey et al. 2017]. The distinct musical genre that each Troll tribe was devoted to influenced their hair design and movement. The wide variety of hair effects, both passive and active, exhibited by the Trolls in various environments and situations, provided interesting challenges. This talk presents the techniques used to bring that expansive world of hair to life.
Related Skunk: DreamWorks Fur Motion System · Hair Emoting with Style Guides in Turning Red · XGen: Arbitrary Primitive Generator · Hummingbird: DreamWorks Feather System
how to read this
- Category
- Production talk: hair effects breakdown
- Contributions
- Presents techniques for a wide variety of passive and active hair effects in Trolls World Tour
- Shows how each Troll tribe's musical genre shaped its hair design and movement
- Discusses meeting the challenges of diverse hair behavior across many environments and situations
- Context
- A sequel-driven escalation of the hair work from the first Trolls film, connected to DreamWorks fur-motion tooling such as the Skunk fur motion system. Builds on: Skunk: DreamWorks Fur Motion System
- Correctness
- Studio practice, not peer-reviewed; results are production-proven and art-directed for a stylized film, so methods are tuned to that look rather than offered as general algorithms.
- Clarity
- Accessible; a single pass conveys the effects and the design-driven approach.
- How to read it
- Read once for the catalogue of stylized hair effects and the genre-to-motion mapping; revisit specific effects only if you face a similar art-directed hair challenge.
CFX
-
Homogenization approach capturing yarn-level cloth mechanics in a continuum shell model, balancing fidelity and computational cost.
abstract
We present a method for animating yarn-level cloth effects using a thin-shell solver. We accomplish this through numerical homogenization: we first use a large number of yarn-level simulations to build a model of the potential energy density of the cloth, and then use this energy density function to compute forces in a thin shell simulator. We model several yarn-based materials, including both woven and knitted fabrics. Our model faithfully reproduces expected effects like the stiffness of woven fabrics, and the highly deformable nature and anisotropy of knitted fabrics. Our approach does not require any real-world experiments nor measurements; because the method is based entirely on simulations, it can generate entirely new material models quickly, without the need for testing apparatuses or human intervention. We provide data-driven models of several woven and knitted fabrics, which can be used for efficient simulation with an off-the-shelf cloth solver.
Related Mechanics-Aware Deformation of Yarn Pattern Geometry · Simulating Cloth Using Bilinear Elements · Dynamic Deformables: Implementation and Production Practicalities · Mixing Yarns and Triangles in Cloth Simulation
how to read this
- Category
- Method: numerical homogenization for cloth simulation
- Contributions
- Animates yarn-level cloth effects using a thin-shell solver via numerical homogenization
- Builds a potential-energy-density model from many yarn-level simulations, then computes shell forces from it
- Provides data-driven models of several woven and knitted fabrics, derived purely from simulation with no real-world measurements
- Context
- Bridges yarn-level cloth simulation (in the tradition of Kaldor et al.'s yarn-level knitted cloth) and continuum thin-shell solvers through homogenization. Builds on: Simulating Knitted Cloth at the Yarn Level
- Correctness
- The fidelity rests on the assumption that an energy-density function fit to yarn-level simulations captures the relevant mechanics in a shell model; because it is calibrated to simulations rather than physical experiments, accuracy is relative to the underlying yarn model and effects beyond that model's scope may not transfer.
- Clarity
- Conceptually clear on a first pass; the homogenization and energy-density fitting need a careful second pass.
- How to read it
- First pass for the homogenization idea (yarn sims to a shell energy); second and possibly third pass on the energy-density construction and force derivation if implementing or extending the material model.
CFX
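The homogenization idea in the yarn-level cloth entry above (run many yarn-level simulations, fit an energy density, then differentiate it to get shell forces) can be illustrated in one dimension. This is a minimal sketch under stated assumptions, not the authors' model: `yarn_level_energy` is a hypothetical analytic stand-in for an expensive yarn simulation, the tabulate-and-interpolate fit and every function name are illustrative choices, and the real method works over multi-dimensional stretching and bending strains.

```python
from bisect import bisect_right


def yarn_level_energy(strain):
    """Hypothetical stand-in for one yarn-level simulation.

    Returns an energy density for a given 1D strain. A real pipeline
    would run a yarn simulator here; this analytic form is an assumption.
    """
    return 50.0 * strain ** 2 + 2000.0 * strain ** 4


def build_energy_table(simulate, lo, hi, n):
    """Sample the (expensive) simulation on a regular strain grid."""
    strains = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    energies = [simulate(s) for s in strains]
    return strains, energies


def energy_density(table, strain):
    """Piecewise-linear interpolation of the tabulated energy density."""
    strains, energies = table
    s = min(max(strain, strains[0]), strains[-1])  # clamp to sampled range
    i = min(bisect_right(strains, s) - 1, len(strains) - 2)
    t = (s - strains[i]) / (strains[i + 1] - strains[i])
    return (1.0 - t) * energies[i] + t * energies[i + 1]


def stress(table, strain, h=1e-3):
    """Stress as the strain-derivative of the fitted energy density.

    Uses a central finite difference on the interpolant; a shell solver
    would use this quantity (and its multi-dimensional analogue) for forces.
    """
    upper = energy_density(table, strain + h)
    lower = energy_density(table, strain - h)
    return (upper - lower) / (2.0 * h)


table = build_energy_table(yarn_level_energy, -0.2, 0.2, 201)
print(stress(table, 0.1))
```

Once the table is built, the solver never calls the yarn simulation again, which is where the cost saving in this kind of scheme comes from. Accuracy is bounded by the sampled strain range, which the sketch makes explicit by clamping.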
- Incremental Potential Contact: Intersection- and Inversion-free Large-Deformation Dynamics SIGGRAPH Academic 230 cites
Guaranteed intersection- and inversion-free frictional contact for deformables via barrier-augmented incremental potential, independent of time step or material stiffness.
abstract
Contacts weave through every aspect of our physical world, from daily household chores to acts of nature. Modeling and predictive computation of these phenomena for solid mechanics is important to every discipline concerned with the motion of mechanical systems, including engineering and animation. Nevertheless, efficiently time-stepping accurate and consistent simulations of real-world contacting elastica remains an outstanding computational challenge. To model the complex interaction of deforming solids in contact we propose Incremental Potential Contact (IPC) - a new model and algorithm for variationally solving implicitly time-stepped nonlinear elastodynamics. IPC maintains an intersection- and inversion-free trajectory regardless of material parameters, time step sizes, impact velocities, severity of deformation, or boundary conditions enforced. Constructed with a custom nonlinear solver, IPC enables efficient resolution of time-stepping problems with separate, user-exposed accuracy tolerances that allow independent specification of the physical accuracy of the dynamics and the geometric accuracy of surface-to-surface conformation. This enables users to decouple, as needed per application, desired accuracies for a simulation's dynamics and geometry.
Related Adaptive Nonlinearity for Collisions in Complex Rod Assemblies · Codimensional Incremental Potential Contact · Dynamic Deformables: Implementation and Production Practicalities · Implicit Multibody Penalty-based Distributed Contact
how to read this
- Category
- Method: contact model for elastodynamics
- Contributions
- Incremental Potential Contact (IPC), a model and algorithm for implicitly time-stepped nonlinear frictional elastodynamics
- Guarantees intersection- and inversion-free trajectories regardless of material, time step, impact velocity, deformation, or boundary conditions
- A custom nonlinear solver that decouples physical-accuracy and geometric-conformation tolerances via user-exposed parameters
- Context
- Advances implicit frictional contact for deformables via a barrier-augmented incremental potential, building on adaptive-cloth contact solvers such as Li et al.'s implicit frictional contact solver for cloth. Builds on: An Implicit Frictional Contact Solver for Adaptive Cloth Simulation
- Correctness
- The non-penetration and non-inversion guarantees follow from a smooth barrier formulation enforced by a line-searched nonlinear solver; the practical caveat is computational cost, since the robustness comes from continuous collision detection and tight tolerances that can be expensive on large or stiff scenes.
- Clarity
- Dense; a first pass conveys the guarantees and intent, but the formulation demands a careful second and likely third pass.
- How to read it
- First pass for the guarantees and where they come from (barrier potential, decoupled tolerances); plan a second and third pass on the energy formulation, the friction model, and the nonlinear-solver/CCD machinery before implementing.
CFX
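The smooth barrier mentioned in the Incremental Potential Contact entry above can be sketched with a clamped log barrier on the distance `d` between two surface primitives. The entry itself only states that a barrier is used; the specific form below, `-(d - d_hat)^2 * ln(d / d_hat)`, is the one commonly quoted for IPC and should be treated as an assumption here, as should the function names and the activation distance `d_hat`.

```python
import math


def ipc_barrier(d, d_hat):
    """Clamped log barrier on an unsigned distance d (assumed form).

    Zero when primitives are farther apart than d_hat, smoothly rising
    as d shrinks, and unbounded as d approaches zero.
    """
    if d <= 0.0:
        raise ValueError("distance must be positive (no intersection allowed)")
    if d >= d_hat:
        return 0.0
    return -((d - d_hat) ** 2) * math.log(d / d_hat)


def ipc_barrier_gradient(d, d_hat):
    """Derivative of the barrier with respect to d.

    Negative inside the active range, so the resulting force pushes
    the primitives apart.
    """
    if d <= 0.0:
        raise ValueError("distance must be positive (no intersection allowed)")
    if d >= d_hat:
        return 0.0
    return -2.0 * (d - d_hat) * math.log(d / d_hat) - (d - d_hat) ** 2 / d


# The barrier is inactive beyond d_hat and grows steeply near contact.
for distance in (2.0, 1.0, 0.5, 0.01):
    print(distance, ipc_barrier(distance, 1.0))
```

Because the barrier value and its first derivative both vanish at `d_hat`, the contact energy switches on smoothly, which is what lets a line-searched Newton-type solver handle it. The line search must still reject any step that would drive `d` to zero or below, which is the role continuous collision detection plays in the entry's description.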
- Interactive Sculpting of Digital Faces Using an Anatomical Modeling Paradigm Eurographics Disney Research 9 cites
Interactive face sculpting tool guided by an anatomical model of skull, fat, and tissue, producing physiologically plausible digital faces.
abstract
Digitally sculpting 3D human faces is a very challenging task. It typically requires either 1) highly‐skilled artists using complex software packages for high quality results, or 2) highly‐constrained simple interfaces for consumer‐level avatar creation, such as in game engines. We propose a novel interactive method for the creation of digital faces that is simple and intuitive to use, even for novice users, while consistently producing plausible 3D face geometry, and allowing editing freedom beyond traditional video game avatar creation. At the core of our system lies a specialized anatomical local face model (ALM), which is constructed from a dataset of several hundred 3D face scans. User edits are propagated to constraints for an optimization of our data‐driven ALM model, ensuring the resulting face remains plausible even for simple edits like clicking and dragging surface points. We show how several natural interaction methods can be implemented in our framework, including direct control of the surface, indirect control of semantic features like age, ethnicity, gender, and BMI, as well as indirect control through manipulating the underlying bony structures. The result is a simple new method for creating digital human faces, for artists and novice users alike.
Related It's a UVN Face Rig, Charlie Brown: Facial Techniques for Peanuts · Direct Manipulation Blendshapes · FaceBaker: Baking Character Facial Rigs with Machine Learning · An Empirical Rig for Jaw Animation
how to read this
- Category
- Method: an interactive, anatomically guided face-sculpting system
- Contributions
- An anatomical local face model (ALM) of skull, fat, and tissue, learned from several hundred 3D face scans.
- An interactive editing loop where user edits become constraints for a data-driven optimization that keeps results plausible.
- Several interaction modes: direct surface control, indirect semantic control (age, ethnicity, gender, BMI), and control via the underlying bony structures.
- Context
- Builds on anatomically constrained face modeling, notably Wu et al.'s 'An Anatomically Constrained Local Deformation Model for Monocular Face Capture', repurposing that representation from capture toward interactive authoring. Builds on: An Anatomically Constrained Local Deformation Model for Monocular Face Capture
- Correctness
- The plausibility guarantee rests on the scan dataset behind the ALM, so results should be trusted within that population's coverage, and edits far outside the learned distribution may not behave as expected.
- Clarity
- Accessible; a first pass conveys the interaction idea, a second pass is needed for the ALM optimization formulation.
- How to read it
- Focus first on how edits map to ALM constraints and how the optimization stays plausible; a second pass on the anatomical model and solver pays off if you want to reimplement or extend the editing modes.
Facial / Rigging
- Investigating Perceptually Based Models to Predict Importance of Facial Blendshapes MIG Academic 11 cites
Perceptual study and predictive model ranking facial blendshape importance for games to enable efficient rig compression; won MIG 2020 Best Short Paper.
abstract
Blendshape facial rigs are used extensively in the industry for facial animation of virtual humans. However, storing and manipulating large numbers of facial meshes is costly in terms of memory and computation for gaming applications, yet the relative perceptual importance of blendshapes has not yet been investigated. Research in Psychology and Neuroscience has shown that our brains process faces differently than other objects, so we postulate that the perception of facial expressions will be feature-dependent rather than based purely on the amount of movement required to make the expression. In this paper, we explore the noticeability of blendshapes under different activation levels, and present new perceptually based models to predict perceptual importance of blendshapes. The models predict visibility based on commonly-used geometry and image-based metrics.
Related Animating Facial Expressions · A Muscle Model for Animating Three-Dimensional Facial Expression · FaceWarehouse: A 3D Facial Expression Database for Visual Computing · Expression Packing: As-Few-As-Possible Training Expressions for Blendshape Transfer
how to read this
- Category
- Perceptual study plus a predictive model for blendshape importance
- Contributions
- A perceptual study of how noticeable individual facial blendshapes are at different activation levels.
- New perceptually based models that predict the importance (visibility) of blendshapes from common geometry and image-based metrics.
- A basis for rig compression in games, ranking blendshapes by perceptual rather than purely geometric significance.
- Context
- Relates to direct-manipulation blendshape rigs (Lewis and Anjyo's 'Direct Manipulation Blendshapes') and to psychology/neuroscience findings that faces are processed differently from other objects. Builds on: Direct Manipulation Blendshapes
- Correctness
- Models are grounded in a human perceptual study, so conclusions are tied to the stimuli, expressions, and viewing conditions tested; predictions may not transfer to faces or rigs far from those used in the study (it is a short paper, so scope is limited).
- Clarity
- Accessible; a first pass conveys the perceptual premise and the use case, a second pass clarifies the metrics behind the model.
- How to read it
- Read for the motivation and the chosen perceptual metrics; one careful pass is usually enough, with a second pass only if you intend to apply the ranking to compress a specific rig.
Facial
-
Introduction to KineFX character tools for game development, covering geometry-level rigging, motion editing, and animation retargeting to refine motion libraries.
Rigging / Retargeting
-
Compresses motion matching into neural networks, keeping its controllability with a fraction of the memory.
abstract
This paper introduces Learned Motion Matching, a neural-network-based alternative to the Motion Matching algorithm that preserves its quality, control, and quick iteration time while achieving the scalability and low memory usage of generative models. The Motion Matching algorithm is broken into three stages, Projection, Stepping, and Decompression, each replaced by a specialized neural network, the Projector, Stepper, and Decompressor, together with an autoencoder-like Compressor that discovers latent variables. The resulting model removes the need to store the matching and animation databases in memory, so memory no longer scales linearly with the amount of animation data. The method is demonstrated on locomotion, rough terrain, chair interaction, character interactions, and quadruped characters, and a user study in a AAA production found participants could barely distinguish it from basic Motion Matching.
Related Neural State Machine for Character-Scene Interactions · Phase-Functioned Neural Networks for Character Control · DReCon: Data-Driven Responsive Control of Physics-Based Characters · Learning Robust and Scalable Motion Matching with Lipschitz Continuity and Sparse Mixture of Experts
how to read this
- Category
- Method: a neural reformulation of Motion Matching
- Contributions
- Learned Motion Matching, replacing Motion Matching's Projection, Stepping, and Decompression stages with a Projector, Stepper, and Decompressor network.
- An autoencoder-like Compressor that discovers latent variables so the matching and animation databases need not be stored in memory.
- Memory that no longer scales linearly with the animation data while preserving the quality, control, and iteration speed of Motion Matching.
- Context
- Directly builds on Clavet's 'Motion Matching and The Road to Next-Gen Animation' and on neural character control such as Holden et al.'s Phase-Functioned Neural Networks. Builds on: Motion Matching and The Road to Next-Gen Animation · Phase-Functioned Neural Networks for Character Control
- Correctness
- Demonstrated across locomotion, rough terrain, chair interaction, character interactions, and quadrupeds, with a AAA-production user study reporting it was hard to distinguish from Motion Matching; as with learned models, behavior outside the training database is the main thing to watch.
- Clarity
- Accessible if you already know Motion Matching; a first pass conveys the three-network decomposition, a second pass clarifies training of each stage.
- How to read it
- Anchor on the original Motion Matching stages, then map each to its network; a second pass on the Compressor and training setup is worth it if you plan to deploy or reproduce it.
Motion Synthesis
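The three-network decomposition in the Learned Motion Matching entry above can be sketched as a per-frame control loop. This is a structural illustration only: the three callables stand in for trained networks, the toy lambdas are hypothetical, and the fixed `search_interval` is an assumed scheduling policy (a runtime might also re-project whenever user input changes sharply).

```python
def run_learned_motion_matching(projector, stepper, decompressor,
                                query_fn, num_frames, search_interval):
    """Drive a character for num_frames using three stand-in networks.

    projector:    query features -> (features x, latent z), replacing the
                  nearest-neighbour search of classic Motion Matching.
    stepper:      (x, z) -> next (x, z), replacing the database index advance.
    decompressor: (x, z) -> pose, replacing the animation database lookup.
    query_fn:     frame index -> query features (e.g. from gamepad input).
    """
    x, z = projector(query_fn(0))
    poses = []
    for frame in range(num_frames):
        if frame > 0:
            if frame % search_interval == 0:
                # Periodically re-project so the motion follows new input.
                x, z = projector(query_fn(frame))
            else:
                # Otherwise just advance the current state by one frame.
                x, z = stepper(x, z)
        poses.append(decompressor(x, z))
    return poses


# Toy stand-ins (hypothetical): scalars instead of feature vectors.
toy_projector = lambda query: (float(query), 0.0)
toy_stepper = lambda x, z: (x + 1.0, z + 0.5)
toy_decompressor = lambda x, z: x + z
toy_query = lambda frame: 100 * frame

print(run_learned_motion_matching(toy_projector, toy_stepper,
                                  toy_decompressor, toy_query,
                                  num_frames=5, search_interval=3))
```

Nothing in the loop touches a matching or animation database, which mirrors the entry's point that memory no longer scales with the amount of animation data: only the network weights need to be resident.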
-
CAPE trains a conditional Mesh-VAE-GAN to generate pose-dependent clothing deformations as an additive term on SMPL, introducing the first large-scale dynamic clothed human mesh dataset.
abstract
Three-dimensional human body models are widely used in the analysis of human pose and motion. Existing models, however, are learned from minimally-clothed humans and thus do not capture the complexity of dressed humans in common images and videos. To address this, we learn a generative 3D mesh model of clothing from 3D scans of people with varying pose. Going beyond previous work, our generative model is conditioned on different clothing types, giving the ability to dress different body shapes in a variety of clothing. To do so, we train a conditional Mesh-VAE-GAN on clothing displacements from a 3D SMPL body model. This generative clothing model enables us to sample various types of clothing, in novel poses, on top of SMPL. With a focus on clothing geometry, the model captures both global shape and local structure, effectively extending the SMPL model to add clothing. To our knowledge, this is the first conditional VAE-GAN that works on 3D meshes. For clothing specifically, it is the first such model that directly dresses 3D human body meshes and generalizes to different poses.
Related SCANimate: Weakly Supervised Learning of Skinned Clothed Avatar Networks · 3D Hair Synthesis Using Volumetric Variational Autoencoders · Learning-Based Animation of Clothing for Virtual Try-On · GarMatNet: A Learning-Based Method for Predicting 3D Garment Mesh with Parameterized Materials
how to read this
- Category
- Method plus dataset: a generative model of pose-dependent clothing on SMPL
- Contributions
- CAPE, a conditional Mesh-VAE-GAN that generates clothing displacements as an additive term on the SMPL body model.
- Conditioning on clothing type and body shape, so different bodies can be dressed in varied clothing and posed in novel poses.
- A large-scale dynamic clothed-human mesh dataset, and what the authors describe as the first conditional VAE-GAN operating directly on 3D meshes.
- Context
- Extends the SMPL body model (Loper et al., 'SMPL: A Skinned Multi-Person Linear Model') by adding learned clothing geometry on top of the minimally-clothed base. Builds on: SMPL: A Skinned Multi-Person Linear Model
- Correctness
- Clothing is modeled as displacements on SMPL, so it captures global shape and local structure but inherits SMPL's topology and may not represent loose or highly dynamic garments that depart strongly from the body; quality is bounded by the captured dataset.
- Clarity
- Moderately technical; a first pass conveys the additive-displacement idea, a second pass is needed for the VAE-GAN architecture and losses.
- How to read it
- Focus first on how clothing is posed as a residual on SMPL and what the conditioning controls; a second pass on the Mesh-VAE-GAN and the dataset pays off if you want to train or sample from it.
CFX / ML Deformation
-
Local phase representation per body part enabling neural networks to learn multi-contact character movements without global phase ambiguity.
abstract
Training a bipedal character to play basketball and interact with objects, or a quadruped character to move in various locomotion modes, are difficult tasks due to the fast and complex contacts happening during the motion. In this paper, we propose a novel framework to learn fast and dynamic character interactions that involve multiple contacts between the body and an object, another character and the environment, from a rich, unstructured motion capture database. We use one-on-one basketball play and character interactions with the environment as examples. To achieve this task, we propose a novel feature called local motion phase, that can help neural networks to learn asynchronous movements of each bone and its interaction with external objects such as a ball or an environment. We also propose a novel generative scheme to reproduce a wide variation of movements from abstract control signals given by a gamepad, which can be useful for changing the style of the motion under the same context. Our scheme is useful for animating contact-rich, complex interactions for real-time applications such as computer games.
Related Neural State Machine for Character-Scene Interactions · A Deep Learning Framework for Character Motion Synthesis and Editing · Phase-Functioned Neural Networks for Character Control · Mode-Adaptive Neural Networks for Quadruped Motion Control
how to read this
- Category
- Method: a motion representation for learning multi-contact character movement
- Contributions
- Local motion phase, a per-bone phase feature that helps networks learn asynchronous limb movements and contacts with external objects.
- A framework that learns fast, contact-rich interactions (body-object, body-character, body-environment) from rich unstructured motion capture.
- A generative scheme that reproduces wide motion variation, including style changes under the same context, from gamepad control signals.
- Context
- Builds on Starke et al.'s Neural State Machine for Character-Scene Interactions, addressing the global-phase ambiguity that arises in fast multi-contact motion. Builds on: Neural State Machine for Character-Scene Interactions
- Correctness
- Validated on examples such as one-on-one basketball play and environment interaction; as a data-driven real-time method, fidelity depends on the motion database and the local-phase labeling, and very novel contact configurations are the natural stress test.
- Clarity
- Accessible in concept, dense in detail; a first pass conveys why local phases beat a global phase, a second pass is needed for the feature extraction and network.
- How to read it
- Concentrate on the definition and extraction of local motion phase and why it resolves asynchronous contacts; a second pass on the generative control scheme is worth it for real-time applications.
Motion Synthesis
-
Ubisoft Montreal trained a deep reinforcement learning agent to follow motion-matching output while self-balancing physically, advancing toward interactive ragdoll characters that recover from disturbances.
Motion Synthesis / Muscles
- talk Machine Learning: Physics Simulation, Kolmogorov Complexity, and Squishy Bunnies GDC Industrial
Shows neural network approximations of interactive cloth and physics simulations achieving 300-5000x speedup, enabling new simulation budgets without replacing conventional physics solvers.
ML Deformation / CFX
-
Describes the pipeline for Soul's volumetric characters, procedurally generating volumes and linework from standard surface geometry rigs on the renderfarm.
abstract
The soul characters in Disney/Pixar’s Soul have a stylized appearance that sets them into a unique world, which introduced many new challenges. Everyone from the art department and character modelers and shaders to the technical directors and developers in the effects, lighting, and software groups collaborated to bring this new visual style to screen. The soul world is abstract and ethereal; this needed to be balanced with visual clarity and design appeal. As our character rigging and animation tools use rigged surfaces, a key challenge was presenting a new representation derived from this data that meets our visual goals. To achieve softness of volumetric form and dynamically changing linework, we built a system to procedurally generate this data in Houdini. Significant numerical computation was required to create this data at the fidelity required. We developed an automated system for managing this computation in a configurable way, while keeping data for downstream renders in sync with changes to character performances.
Related USD and Scene Interoperability: Demystifying the State of the Art · Eyes Without a Face: Integrating Detached Facial Features into Pixar's Character Pipeline · USD in Production · Universal Scene Description: Open Source Release
how to read this
- Category
- Production talk: a volumetric-character pipeline breakdown
- Contributions
- Demonstrates a pipeline that procedurally derives volumetric form and dynamically changing linework from standard rigged surface geometry (in Houdini).
- Shows an automated, configurable system for managing the heavy numerical computation on the renderfarm.
- Shows how downstream render data is kept in sync with changes to character performances.
- Context
- Relates to studio procedural-geometry frameworks such as Hankins' 'Dataflow', applied here to the stylized soul characters of Pixar's Soul. Builds on: Dataflow: ILM's Framework for Procedural Geometry Generation, Simulation Authoring, Crowds, and More
- Correctness
- Studio practice, not peer-reviewed; the results are production-proven on a single film with a specific stylized look, so methods are reported as what worked for this show rather than as generally validated claims.
- Clarity
- Accessible and narrative; a first pass conveys the goals and the pipeline shape without needing a formal second pass.
- How to read it
- Read for the workflow and the rigged-surface-to-volume strategy and how compute was managed at scale; skim once unless you are building a similar procedural volumetric pipeline.
Rigging
-
This work combines triangle-based and yarn-based models within a single cloth simulation to leverage the strengths of both.
abstract
This work combines triangle-based and yarn-based models within a single cloth simulation to leverage the strengths of both. Most of a garment is represented with an efficient triangle model that lowers computational and memory cost, while key regions use a yarn-level model that captures rich nonlinear and plastic effects. An enriched kinematic representation with an optimal kinematic filter enables a smooth transition between the two models, and a dedicated preconditioner handles the disparate inertia of triangle and yarn nodes.
Related Directing Cloth Draping through Blended UVs · Art-Directed Costumes at Pixar: Design, Tailoring, and Simulation in Production · Cloth and Skin Deformation with a Triangle Mesh Based Convolutional Neural Network · Mechanics-Aware Deformation of Yarn Pattern Geometry
how to read this
- Category
- Method: a hybrid triangle/yarn cloth simulation
- Contributions
-
- A single cloth simulation that combines an efficient triangle model for most of a garment with a yarn-level model in key regions to capture nonlinear and plastic effects.
- An enriched kinematic representation with an optimal kinematic filter that enables a smooth transition between the two models.
- A dedicated preconditioner that handles the disparate inertia of triangle and yarn nodes.
- Context
- Builds on yarn-level cloth simulation (Cirio et al.'s yarn-level woven-cloth work) and on standard triangle-mesh cloth models, seeking the strengths of both. Builds on: Yarn-Level Simulation of Woven Cloth
- Correctness
- The benefit hinges on placing yarn-level regions where rich behavior matters while keeping the rest as triangles; the coupling correctness depends on the kinematic filter and preconditioner, so region choice and interface handling are the limits to keep in mind.
- Clarity
- Technical; a first pass conveys the hybrid motivation, but the kinematic filter and preconditioner require a careful second (and likely third) pass.
- How to read it
- Read first for where and why yarns are mixed into triangles; a second pass on the kinematic filter and preconditioner is essential if you intend to implement the coupling.
CFX
- Model Predictive Control with a Visuomotor System for Physics-based Character Animation TOG Weta FX 33 cites
Physics-based character animation using model predictive control paired with a visuomotor perception system for reactive locomotion.
abstract
This article presents a Model Predictive Control framework with a visuomotor system that synthesizes eye and head movements coupled with physics-based full-body motions while placing visual attention on objects of importance in the environment. As the engine of this framework, we propose a visuomotor system based on human visual perception and full-body dynamics with contacts. Relying on partial observations with uncertainty from a simulated visual sensor, an optimal control problem for this system leads to a Partially Observable Markov Decision Process, which is difficult to deal with. We approximate it as a deterministic belief Markov Decision Process for effective control. To obtain a solution for the problem efficiently, we adopt differential dynamic programming, which is a powerful scheme to find a locally optimal control policy for nonlinear system dynamics. Guided by a reference skeletal motion without any a priori gaze information, our system produces realistic eye and head movements together with full-body motions for various tasks such as catching a thrown ball, walking on stepping stones, balancing after being pushed, and avoiding moving obstacles.
Related ControlVAE: Model-Based Learning of Generative Controllers for Physics-Based Characters · DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills · UniCon: Universal Neural Controller for Physics-Based Character Motion · PDP: Physics-Based Character Animation via Diffusion Policy
how to read this
- Category
- Method: model predictive control with a visuomotor system for physics-based animation
- Contributions
-
- An MPC framework that synthesizes coupled eye, head, and physics-based full-body motion while attending to important objects.
- A visuomotor system based on human visual perception and full-body dynamics with contacts, using partial, uncertain observations from a simulated visual sensor.
- A tractable formulation that approximates the resulting POMDP as a deterministic belief MDP and solves it with differential dynamic programming.
- Context
- Sits in the physics-based control lineage exemplified by Peng et al.'s DeepMimic, adding a perception-driven gaze-and-body coupling rather than relying on a priori gaze data. Builds on: DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills
- Correctness
- Demonstrated on tasks such as catching a thrown ball, walking on stepping stones, balancing after a push, and avoiding moving obstacles; results rely on the belief-MDP approximation and DDP finding a locally optimal policy, so behavior is local and task-guided by a reference skeletal motion.
- Clarity
- Mathematically dense; a first pass conveys the perceive-then-control loop, but the POMDP approximation and DDP need a careful second pass.
- How to read it
- Focus first on how visual attention feeds the controller and what the reference motion provides; a second pass on the belief-MDP and DDP formulation is needed to follow or reproduce the control.
Motion Synthesis
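As a rough illustration of the deterministic-belief idea in the entry above: a belief over an object's state can be carried as a Gaussian mean and variance that are updated in closed form, so a planner sees an ordinary deterministic state. The sketch below is a generic one-dimensional Kalman filter, not the paper's visuomotor model; the function names, the constant-velocity motion model, and the noise values are all assumptions made for illustration.

```python
def predict(mean, var, velocity, dt, process_noise):
    """Propagate the belief forward in time (assumed constant-velocity model)."""
    return mean + velocity * dt, var + process_noise


def update(mean, var, observation, sensor_noise):
    """Fuse one noisy observation into the belief (standard Kalman update)."""
    gain = var / (var + sensor_noise)
    return mean + gain * (observation - mean), (1.0 - gain) * var


# Hypothetical usage: start uncertain about a ball's position, then observe it twice.
mean, var = 0.0, 1.0
for z in (2.0, 2.1):
    mean, var = predict(mean, var, velocity=0.0, dt=0.1, process_noise=0.0)
    mean, var = update(mean, var, z, sensor_noise=1.0)
```

Each observation shrinks the variance, which is the kind of quantity a gaze controller could trade off against other costs when deciding where to look.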
-
Hybrid SMPL-FEM avatar with custom nonlinear anisotropic material; skin thickness and mechanical properties optimized from 4D captures.
abstract
Data‐driven models of human avatars have shown very accurate representations of static poses with soft‐tissue deformations. However, they are not yet capable of precisely representing very nonlinear deformations and highly dynamic effects. Nonlinear skin mechanics are essential for a realistic depiction of animated avatars interacting with the environment, but controlling physics‐only solutions often results in a very complex parameterization task. In this work, we propose a hybrid model in which the soft‐tissue deformation of animated avatars is built as a combination of a data‐driven statistical model, which kinematically drives the animation, and an FEM mechanical simulation. Our key contribution is the definition of deformation mechanics in a reference pose space by inverse skinning of the statistical model. This way, we retain as much as possible of the accurate static data‐driven deformation and use a custom anisotropic nonlinear material to accurately represent skin dynamics. Model parameters including the heterogeneous distribution of skin thickness and material properties are automatically optimized from 4D captures of humans showing soft‐tissue deformations.
Related Data-driven Modeling of Skin and Muscle Deformation · How to Build a Human: Practical Physics-Based Character Animation · Data-Driven Physics for Human Soft Tissue Animation · Building Accurate Physics-based Face Models from Data
how to read this
- Category
- Method: a hybrid data-driven/FEM model of nonlinear skin mechanics for avatars
- Contributions
-
- A hybrid avatar combining a data-driven statistical model (kinematically driving the animation) with an FEM mechanical simulation for dynamics.
- Defining deformation mechanics in a reference pose space via inverse skinning of the statistical model, retaining accurate static deformation while adding a custom anisotropic nonlinear material.
- Automatic optimization of model parameters, including heterogeneous skin thickness and material properties, from 4D captures.
- Context
- Builds on data-driven soft-tissue animation such as Kim et al.'s 'Data-Driven Physics for Human Soft Tissue Animation', layered onto an SMPL-style statistical body. Builds on: Data-Driven Physics for Human Soft Tissue Animation
- Correctness
- Parameters are fit from 4D captures of humans showing soft-tissue deformation, so accuracy is tied to that capture data and the chosen material model; the hybrid design aims to keep static accuracy while improving dynamics, but generalization beyond captured subjects/motions is the caveat.
- Clarity
- Technical; a first pass conveys the statistical-plus-FEM split, but a second pass is needed for the reference-space mechanics and parameter estimation.
- How to read it
- Read first for how kinematic statistical deformation and FEM dynamics are combined in reference pose space; a second pass on the inverse skinning and material/parameter estimation pays off for implementation.
Skinning / Muscles
- MoGlow: Probabilistic and Controllable Motion Synthesis Using Normalising Flows SIGGRAPH Asia Academic 105 cites
Autoregressive normalising-flow model produces diverse, controllable locomotion sequences trained with exact maximum likelihood.
abstract
Data-driven modelling and synthesis of motion is an active research area with applications that include animation, games, and social robotics. This paper introduces a new class of probabilistic, generative, and controllable motion-data models based on normalising flows. Models of this kind can describe highly complex distributions, yet can be trained efficiently using exact maximum likelihood, unlike GANs or VAEs. Our proposed model is autoregressive and uses LSTMs to enable arbitrarily long time-dependencies. Importantly, it is also causal, meaning that each pose in the output sequence is generated without access to poses or control inputs from future time steps; this absence of algorithmic latency is important for interactive applications with real-time motion control. The approach can in principle be applied to any type of motion since it does not make restrictive, task-specific assumptions regarding the motion or the character morphology. We evaluate the models on motion-capture datasets of human and quadruped locomotion. Objective and subjective results show that randomly-sampled motion from the proposed method outperforms task-agnostic baselines and attains a motion quality close to recorded motion capture.
Related Local Motion Phases for Learning Multi-Contact Character Movements · Learned Motion Matching · Robust Motion In-Betweening · Multi-Objective Adversarial Gesture Generation
how to read this
- Category
- Method: a normalising-flow model for probabilistic, controllable motion synthesis
- Contributions
-
- MoGlow, a probabilistic generative motion model based on normalising flows, trainable by exact maximum likelihood (unlike GANs or VAEs).
- An autoregressive design using LSTMs for arbitrarily long time-dependencies, and a causal formulation with no algorithmic latency for real-time control.
- A task-agnostic approach making no restrictive assumptions about the motion or character morphology, evaluated on human and quadruped locomotion.
- Context
- Relates to neural character control such as Holden et al.'s Phase-Functioned Neural Networks, offering a flow-based probabilistic alternative to GAN/VAE motion models. Builds on: Phase-Functioned Neural Networks for Character Control
- Correctness
- Objective and subjective evaluations on motion-capture locomotion data report sampled motion outperforming task-agnostic baselines and approaching the quality of recorded motion capture; being task-agnostic and data-driven, quality and diversity remain bounded by the training motion.
- Clarity
- Accessible in framing but requires flow-model background; a first pass conveys the probabilistic, causal, controllable design, and a second pass clarifies the normalising-flow formulation.
- How to read it
- Focus first on why exact-likelihood flows and causality matter for interactive control; a second pass on the flow and autoregressive conditioning is worth it if you want to train or sample motions.
Motion Synthesis
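To make the "exact maximum likelihood" point in the entry above concrete, here is a minimal sketch of a single affine coupling layer, the basic invertible building block used by Glow-style flows. It is a toy two-dimensional example, not MoGlow itself: the scale and shift functions `s_fn` and `t_fn` are fixed, made-up stand-ins for what would be learned networks, and the LSTM conditioning on past poses and control signals is omitted.

```python
import math


def s_fn(x1):
    """Hypothetical log-scale function (a learned network in a real flow)."""
    return math.tanh(0.5 * x1)


def t_fn(x1):
    """Hypothetical shift function (a learned network in a real flow)."""
    return 2.0 * x1


def forward(x):
    """Map data x to latent z; return z and log|det Jacobian|."""
    x1, x2 = x
    s = s_fn(x1)
    return (x1, x2 * math.exp(s) + t_fn(x1)), s


def inverse(z):
    """Map latent z back to data exactly (used for sampling)."""
    z1, z2 = z
    return (z1, (z2 - t_fn(z1)) * math.exp(-s_fn(z1)))


def log_prob(x):
    """Exact log-likelihood under a standard-normal base distribution."""
    z, log_det = forward(x)
    log_base = -0.5 * (z[0] ** 2 + z[1] ** 2) - math.log(2.0 * math.pi)
    return log_base + log_det
```

Because the Jacobian is triangular, its log-determinant is just the scale term, so the likelihood needs no approximation or adversarial critic; sampling draws `z` from the base distribution and applies `inverse`.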
- Motion Retargetting based on Dilated Convolutions and Skeleton-Specific Loss Functions CGF Academic 11 cites
Unsupervised temporal dilated convolution network retargets motion across humanoids of different skeleton proportions while preserving high-frequency detail.
abstract
Motion retargetting refers to the process of adapting the motion of a source character to a target. This paper presents a motion retargetting model based on temporal dilated convolutions. In an unsupervised manner, the model generates realistic motions for various humanoid characters. The retargetted motions not only preserve the high‐frequency detail of the input motions but also produce natural and stable trajectories despite the skeleton size differences between the source and target. Extensive experiments are made using a 3D character motion dataset and a motion capture dataset. Both qualitative and quantitative comparisons against prior methods demonstrate the effectiveness and robustness of our method.
Related MarkerNet: A Divide-and-Conquer Solution to Motion Capture Solving From Raw Markers · Skeleton-Aware Networks for Deep Motion Retargeting · Motion Retargeting for Crowd Simulation · Sketch-based Motion Editing for Articulated Characters
how to read this
- Category
- Method: a deep motion retargeting network
- Contributions
-
- An unsupervised motion retargeting model built on temporal dilated convolutions
- Retargeting across humanoids of differing skeleton proportions while preserving high-frequency motion detail
- Natural, stable trajectories despite source-target skeleton size differences
- Context
- Sits in the deep-learning retargeting line alongside Aberman et al.'s Skeleton-Aware Networks, swapping graph-style architectures for temporal dilated convolutions over the motion sequence. Builds on: Skeleton-Aware Networks for Deep Motion Retargeting
- Correctness
- Evaluated qualitatively and quantitatively on a 3D character motion dataset and a motion capture dataset against prior methods; as with most learned retargeters, generalization is bounded by the training distribution of skeletons and motions, so out-of-distribution proportions warrant caution.
- Clarity
- Accessible at the concept level; a first pass conveys the dilated-convolution idea, but a second pass is needed for the skeleton-specific loss formulation.
- How to read it
- First pass for the unsupervised setup and why dilated convolutions capture temporal detail; do a second pass on the skeleton-specific loss functions and the comparison tables if you care about how proportion differences are handled.
Retargeting
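For readers unfamiliar with temporal dilated convolutions, the sketch below shows the generic operation on a one-dimensional signal and how stacking layers with growing dilation widens the temporal window each output frame sees. It is not the paper's network: the kernel, the dilation schedule, and the function names are assumptions chosen only to illustrate the mechanism.

```python
def dilated_conv1d(signal, kernel, dilation):
    """'Valid' 1D convolution whose kernel taps are spaced `dilation` frames apart."""
    span = (len(kernel) - 1) * dilation
    return [
        sum(w * signal[i + k * dilation] for k, w in enumerate(kernel))
        for i in range(len(signal) - span)
    ]


def receptive_field(kernel_size, dilations):
    """Number of input frames that influence one output frame of the stack."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Hypothetical example: a central-difference kernel with taps two frames apart.
frames = [0, 1, 2, 3, 4, 5, 6]
velocity_like = dilated_conv1d(frames, [1, 0, -1], dilation=2)
```

With kernel size 3 and dilations 1, 2, 4, 8, four layers already cover 31 frames, which is why such stacks can keep fine temporal detail while still seeing long-range context.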
-
Extends XPBD to muscle fibers and fascia constraints, enabling fast controllable character muscle and superficial fascia simulation without FEM complexity.
abstract
Recent research on muscle and fascia simulation for visual effects relies on numerical methods such as the finite element method or finite volume method. These approaches produce realistic results, but require high computational time and are complex to set up. On the other hand, position‐based dynamics offers a fast and controllable solution to simulate surfaces and volumes, but there is no literature on how to implement constraints that could be used to realistically simulate muscles and fascia for digital creatures with this method. In this paper, we extend the current state‐of‐the‐art in Position‐Based Dynamics to efficiently compute realistic skeletal muscle and superficial fascia simulation. In particular, we embed muscle fibres in the solver by adding an anisotropic component to the distance constraints between mesh points and apply overpressure to realistically model muscle volume changes under contraction. In addition, we also define a modified distance constraint for the fascia that allows compression and enables the user to scale the constraint's original distance to gain elastic potential at rest. Finally, we propose a modification of the extended position‐based dynamics algorithm to properly compute different sets of constraints and describe other details for proper simulation of character's muscle and fascia dynamics.
Related Smeat: ADMM Based Tools for Character Deformation · Art-Directed Muscle Simulation for High-End Facial Animation · Active Volumetric Musculoskeletal Systems · Flesh, Flab, and Fascia Simulation on Zootopia
how to read this
- Category
- Method: a muscle and fascia simulation technique
- Contributions
-
- Extends Position-Based Dynamics with anisotropic distance constraints to embed muscle fibres
- Adds overpressure to model muscle volume change under contraction, plus a modified compressible distance constraint for fascia
- Offers a fast, controllable alternative to FEM/FVM for digital-creature muscle and superficial fascia
- Context
- Builds directly on Macklin et al.'s XPBD (compliant constrained dynamics), bringing muscle/fascia behavior into the position-based-dynamics family rather than the finite-element/finite-volume tradition. Builds on: XPBD: Position-Based Simulation of Compliant Constrained Dynamics
- Correctness
- Trades the physical fidelity of FEM/FVM for speed and artist control; the constraints are engineered approximations of fibre anisotropy and volume preservation, so results are plausible and controllable rather than biomechanically exact.
- Clarity
- Accessible if you already know PBD/XPBD; the constraint derivations reward a focused second pass.
- How to read it
- First pass to grasp how fibre anisotropy and overpressure map onto distance constraints; second pass on the constraint math and the XPBD modification if you intend to implement it.
Muscles / CFX
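As background for the entry above, this is a minimal sketch of the standard XPBD update for a single distance constraint between two particles, the primitive the paper modifies. The `fibre_compliance` helper is purely hypothetical: it shows one simple way direction-dependent stiffness could be expressed (softer across the fibre than along it) and is not the anisotropic formulation from the paper.

```python
import math


def xpbd_distance_step(x1, x2, w1, w2, rest, compliance, lam, dt):
    """One XPBD iteration for C = |x1 - x2| - rest.

    w1, w2 are inverse masses; lam is the accumulated Lagrange multiplier.
    """
    d = [a - b for a, b in zip(x1, x2)]
    length = math.sqrt(sum(c * c for c in d))
    if length == 0.0 or w1 + w2 == 0.0:
        return x1, x2, lam                      # degenerate: nothing to correct
    n = [c / length for c in d]                 # constraint gradient w.r.t. x1
    alpha_tilde = compliance / (dt * dt)        # time-step-scaled compliance
    dlam = (-(length - rest) - alpha_tilde * lam) / (w1 + w2 + alpha_tilde)
    x1 = [p + w1 * dlam * g for p, g in zip(x1, n)]
    x2 = [p - w2 * dlam * g for p, g in zip(x2, n)]
    return x1, x2, lam + dlam


def fibre_compliance(edge_dir, fibre_dir, along, across):
    """Hypothetical blend of compliance by alignment of two unit vectors."""
    t = abs(sum(a * b for a, b in zip(edge_dir, fibre_dir)))
    return t * along + (1.0 - t) * across
```

With zero compliance the constraint is rigid and the update restores the rest length in one step for an isolated pair; larger compliance values leave part of the stretch in place.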
-
Audio-driven facial video synthesis via a latent 3D face model space, enabling video dubbing and cross-person reenactment with temporal stability.
abstract
We present Neural Voice Puppetry, a novel approach for audio-driven facial video synthesis. Given an audio sequence of a source person or digital assistant, we generate a photo-realistic output video of a target person that is in sync with the audio of the source input. This audio-driven facial reenactment is driven by a deep neural network that employs a latent 3D face model space. Through the underlying 3D representation, the model inherently learns temporal stability while we leverage neural rendering to generate photo-realistic output frames. Our approach generalizes across different people, allowing us to synthesize videos of a target actor with the voice of any unknown source actor or even synthetic voices that can be generated utilizing standard text-to-speech approaches. Neural Voice Puppetry has a variety of use-cases, including audio-driven video avatars, video dubbing, and text-driven video synthesis of a talking head. We demonstrate the capabilities of our method in a series of audio- and text-based puppetry examples, including comparisons to state-of-the-art techniques and a user study.
Related Codec Avatars: Photorealistic Telepresence at Scale · Realtime Performance-Based Facial Animation · Audiovisual Inputs for Learning Robust, Real-Time Facial Animation with Lip Sync · Displaced Dynamic Expression Regression for Real-Time Facial Tracking and Animation
how to read this
- Category
- Method: audio-driven facial reenactment
- Contributions
-
- Audio-driven facial video synthesis that lip-syncs a target person to arbitrary source audio
- A deep network operating in a latent 3D face model space, with neural rendering for photo-realistic frames
- Generalizes across people and supports video dubbing, audio-driven avatars, and text-driven talking heads
- Context
- Extends the audio-to-face line of Karras et al. (joint end-to-end learning of pose and emotion) by routing prediction through a latent 3D face representation and neural rendering rather than rendering directly. Builds on: Audio-Driven Facial Animation by Joint End-to-End Learning of Pose and Emotion
- Correctness
- The 3D intermediate is credited with inherent temporal stability, and claims are backed by comparisons to state-of-the-art methods and a user study; as audio-driven reenactment it carries the usual identity/expression-transfer and potential-misuse caveats, and quality still depends on target-actor footage.
- Clarity
- Accessible overview; the latent 3D space plus neural rendering pipeline benefits from a second pass.
- How to read it
- First pass for the use-cases and the role of the latent 3D model; second pass on the audio-to-expression mapping and neural renderer, and skim the user study for perceived quality.
Facial
-
Autodesk University class covering Maya 2020 rigging additions including Offset Parent Matrix, UV Pin, Proximity Pin, Rivet, Motion Library, and animation workflow enhancements for facial and character pipelines.
Rigging / Retargeting
-
Uses a pose-conditioned neural network to invert LBS deformations, enabling efficient canonical-space queries such as signed distance lookups for deformed characters.
abstract
In this technical report, we investigate efficient representations of articulated objects (e.g. human bodies), which is an important problem in computer vision and graphics. To deform articulated geometry, existing approaches represent objects as meshes and deform them using "skinning" techniques. The skinning operation allows a wide range of deformations to be achieved with a small number of control parameters. This paper introduces a method to invert the deformations undergone via traditional skinning techniques via a neural network parameterized by pose. The ability to invert these deformations allows values (e.g., distance function, signed distance function, occupancy) to be pre-computed at rest pose, and then efficiently queried when the character is deformed. We leave empirical evaluation of our approach to future work.
Related NeuroSkinning: Automatic Skin Binding for Production Characters with Deep Graph Networks · A Neural Network Model for Efficient Musculoskeletal-Driven Skin Deformation · SCANimate: Weakly Supervised Learning of Skinned Clothed Avatar Networks · A Statistical Model of Human Pose and Body Shape
how to read this
- Category
- Method: neural inversion of linear blend skinning
- Contributions
-
- A pose-conditioned neural network that inverts traditional LBS deformations
- Enables values such as distance, signed distance, or occupancy to be precomputed at rest pose and queried efficiently when the character deforms
- Context
- Frames articulated-object representation against SMPL-style skinned body models, proposing a learned inverse of the standard skinning operation rather than a new forward deformer. Builds on: SMPL: A Skinned Multi-Person Linear Model
- Correctness
- Presented as a technical report that explicitly leaves empirical evaluation to future work, so the inversion is proposed and motivated rather than validated; treat reported behavior as a design sketch, not a benchmarked result.
- Clarity
- Short and readable as a report; the LBS-inversion idea comes across in one pass.
- How to read it
- A single first pass is enough to capture the canonical-space query idea; only revisit for the parameterization details, and note the absence of empirical results when comparing to later work.
Skinning / ML Deformation
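To show what "inverting LBS" means in the entry above, here is a small two-dimensional sketch of forward linear blend skinning and its inverse for a point whose skinning weights are already known. That known-weights assumption is the easy case: for an arbitrary query point in deformed space the weights are not known, which is the gap the report proposes to fill with a pose-conditioned network. All names and the 2x3 transform layout are illustrative choices, not taken from the report.

```python
import math


def rigid(theta, tx=0.0, ty=0.0):
    """2x3 rigid bone transform: rotation by theta followed by a translation."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, tx], [s, c, ty]]


def blend(transforms, weights):
    """Weighted sum of bone transforms (the 'linear blend' in LBS)."""
    return [
        [sum(w * T[r][c] for T, w in zip(transforms, weights)) for c in range(3)]
        for r in range(2)
    ]


def lbs(p, transforms, weights):
    """Deform a rest-pose point p."""
    M = blend(transforms, weights)
    return (
        M[0][0] * p[0] + M[0][1] * p[1] + M[0][2],
        M[1][0] * p[0] + M[1][1] * p[1] + M[1][2],
    )


def inverse_lbs(q, transforms, weights):
    """Recover the rest-pose point from deformed q, given q's weights."""
    M = blend(transforms, weights)
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    x, y = q[0] - M[0][2], q[1] - M[1][2]
    return ((M[1][1] * x - M[0][1] * y) / det, (-M[1][0] * x + M[0][0] * y) / det)
```

Once a deformed query point can be mapped back to rest pose, any field precomputed there (for example a signed distance) can be looked up directly.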
-
This work extends projective dynamics to handle exact dry frictional contact governed by the Signorini-Coulomb law, integrating contact resolution into the local-global solver structure rather than relying on penalty forces.
abstract
This work extends projective dynamics to handle exact dry frictional contact governed by the Signorini-Coulomb law, integrating contact resolution into the local-global solver structure rather than relying on penalty forces. A semi-implicit time stepping scheme alternates between the inexpensive global elasticity solve and local contact and friction projections, preserving the efficiency that makes projective dynamics attractive. The method robustly simulates cloth and other deformable objects in collision-rich scenarios while respecting non-penetration and Coulomb friction constraints.
Related Nonlinear Cloth Simulation with Isogeometric Analysis · Interactive Hair Simulation on the GPU Using ADMM · Projective Dynamics: Fusing Constraint Projections for Fast Simulation · An Implicit Frictional Contact Solver for Adaptive Cloth Simulation
how to read this
- Category
- Method: a frictional contact solver for projective dynamics
- Contributions
-
- Extends projective dynamics to exact dry frictional contact under the Signorini-Coulomb law
- Integrates contact and friction projections into the local-global solver instead of using penalty forces
- A semi-implicit scheme alternating a global elasticity solve with local contact/friction projections
- Context
- Builds on Bouaziz et al.'s Projective Dynamics, grafting non-smooth Signorini-Coulomb contact onto its local-global structure rather than the penalty-based contact common in fast simulators. Builds on: Projective Dynamics: Fusing Constraint Projections for Fast Simulation
- Correctness
- Demonstrated on cloth and other deformables in collision-rich scenes, respecting non-penetration and Coulomb friction; the appeal hinges on preserving PD's efficiency, so the relevant question is how well exact friction scales as contact density grows.
- Clarity
- Dense in the contact formulation; the high-level alternation scheme is graspable on a first pass, but the projections are not.
- How to read it
- First pass for how friction is folded into the local-global loop; budget a careful second and likely third pass on the Signorini-Coulomb projections if you plan to implement or extend it.
CFX
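As a hedged illustration of what a local friction projection can look like, the sketch below computes the Euclidean projection of a contact force onto the Coulomb friction cone. This is a generic textbook operation and is not claimed to be the projection used in the paper; the function name and argument layout are assumptions.

```python
import math


def project_to_friction_cone(r_n, r_t, mu):
    """Closest point to (r_n, r_t) in the cone |r_t| <= mu * r_n.

    r_n: normal force component, r_t: tangential 2-vector, mu: friction coefficient.
    """
    t = math.hypot(r_t[0], r_t[1])
    if t <= mu * r_n:
        return r_n, (r_t[0], r_t[1])           # already admissible
    if mu * t <= -r_n:
        return 0.0, (0.0, 0.0)                 # in the polar cone: no contact force
    n_new = (r_n + mu * t) / (1.0 + mu * mu)   # land on the cone boundary
    scale = mu * n_new / t
    return n_new, (r_t[0] * scale, r_t[1] * scale)
```

A projected force on the boundary satisfies `|r_t| == mu * r_n`, the sliding case of Coulomb's law; forces strictly inside the cone correspond to sticking contact.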
-
Couples cage and skeleton deformation spaces into a unified real-time framework, accessing poses unreachable by either control structure alone.
abstract
Skeleton‐based and cage‐based deformation techniques represent the two most popular approaches to control real‐time deformations of digital shapes and are, to a vast extent, complementary to one another. Despite their complementary roles, high‐end modelling packages do not allow for seamless integration of such control structures, thus inducing a considerable burden on the user to keep them synchronized. In this paper, we propose a framework that seamlessly combines rigging skeletons and deformation cages, granting artists a real‐time deformation system that operates using any smooth combination of the two approaches. By coupling the deformation spaces of cages and skeletons, we access a much larger space, containing poses that are impossible to obtain by acting solely on a skeleton or a cage. Our method is oblivious to the specific techniques used to perform skinning and cage‐based deformation, keeping it compatible with pre‐existing tools. We demonstrate the usefulness of our hybrid approach on a variety of examples.
Related Harmonic Coordinates for Character Articulation · Learning Skeletal Articulations with Neural Blend Shapes · Real-Time Skeletal Skinning with Optimized Centers of Rotation · NeuroSkinning: Automatic Skin Binding for Production Characters with Deep Graph Networks
how to read this
- Category
- Method: a hybrid real-time deformation framework
- Contributions
-
- Couples skeleton-based and cage-based deformation into a unified real-time system
- Accesses a larger pose space, reaching poses impossible with a skeleton or a cage alone
- Oblivious to the specific skinning and cage techniques used, keeping it compatible with existing tools
- Context
- Bridges the two complementary control paradigms (rigging skeletons and deformation cages), in the lineage of cage-based skinning work such as Ju et al.'s reusable skinning templates. Builds on: Reusable Skinning Templates Using Cage-based Deformations
- Correctness
- Demonstrated qualitatively on a variety of examples; the central claim is expressive coverage and seamless artist control, so it is an interaction/representation contribution rather than a physically validated one, and benefits accrue mainly where both control structures are wanted at once.
- Clarity
- Accessible; the coupling concept reads clearly in a first pass.
- How to read it
- First pass for the motivation and the coupled deformation space; second pass only if you need the blending math between cage and skeleton coordinates.
Skinning / Rigging
-
GDC talk presenting DReCon-based production system combining motion matching and physics control for responsive game character animation.
Motion Synthesis
-
Neural network predicting skeleton topology and skinning weights from 3D character meshes, automating the character rigging process.
abstract
We present RigNet, an end-to-end automated method for producing animation rigs from input character models. Given an input 3D model representing an articulated character, RigNet predicts a skeleton that matches the animator expectations in joint placement and topology. It also estimates surface skin weights based on the predicted skeleton. Our method is based on a deep architecture that directly operates on the mesh representation without making assumptions on shape class and structure. The architecture is trained on a large and diverse collection of rigged models, including their mesh, skeletons and corresponding skin weights. Our evaluation is three-fold: we show better results than prior art when quantitatively compared to animator rigs; qualitatively we show that our rigs can be expressively posed and animated at multiple levels of detail; and finally, we evaluate the impact of various algorithm choices on our output rigs.
Related MoRig: Motion-Aware Rigging of Character Meshes from Point Clouds · A Statistical Model of Human Pose and Body Shape · Robust and Accurate Skeletal Rigging from Mesh Sequences · S3: Neural Shape, Skeleton, and Skinning Fields for 3D Human Modeling
how to read this
- Category
- Method: neural auto-rigging from meshes
- Contributions
-
- An end-to-end network that predicts a skeleton (joint placement and topology) from an input 3D character mesh
- Estimates surface skin weights based on the predicted skeleton
- Operates directly on the mesh with no assumptions about shape class or structure
- Context
- Advances learned rigging in the line of Liu et al.'s NeuroSkinning and the classic Baran-Popovic automatic rigging, predicting both skeleton and weights rather than weights alone. Builds on: NeuroSkinning: Automatic Skin Binding for Production Characters with Deep Graph Networks · Automatic Rigging and Animation of 3D Characters
- Correctness
- Validated three ways (quantitative comparison to animator rigs, qualitative posing/animation at multiple levels of detail, and ablations of algorithm choices); as a learned rigger trained on a rigged-model collection, output quality and topology fidelity remain bounded by that training data.
- Clarity
- Accessible at the pipeline level; the mesh-based architecture rewards a second pass.
- How to read it
- First pass for the two-stage skeleton-then-weights pipeline and the evaluation against animator rigs; second pass on the mesh architecture and the ablation study if you care about why design choices matter.
Rigging / ML Deformation
-
Recurrent neural network for robust keyframe-to-keyframe motion inbetweening producing natural transitions across long time spans.
abstract
In this work we present a novel, robust transition generation technique that can serve as a new tool for 3D animators, based on adversarial recurrent neural networks. The system synthesises high-quality motions that use temporally-sparse keyframes as animation constraints. This is reminiscent of the job of in-betweening in traditional animation pipelines, in which an animator draws motion frames between provided keyframes. We first show that a state-of-the-art motion prediction model cannot be easily converted into a robust transition generator when only adding conditioning information about future keyframes. To solve this problem, we then propose two novel additive embedding modifiers that are applied at each timestep to latent representations encoded inside the network's architecture. One modifier is a time-to-arrival embedding that allows variations of the transition length with a single model. The other is a scheduled target noise vector that allows the system to be robust to target distortions and to sample different transitions given fixed keyframes. To qualitatively evaluate our method, we present a custom MotionBuilder plugin that uses our trained model to perform in-betweening in production scenarios.
Related Motion Graphs · Learned Motion Matching · Taming Diffusion Probabilistic Models for Character Control · Multi-Objective Adversarial Gesture Generation
how to read this
- Category
- Method: a neural motion in-betweening tool
- Contributions
-
- An adversarial recurrent network that synthesizes transitions from temporally-sparse keyframes
- A time-to-arrival embedding that varies transition length with a single model
- A scheduled target-noise vector for robustness to target distortion and for sampling varied transitions from fixed keyframes
- Context
- Reframes the traditional keyframe in-betweening task (descended from Burtnyk-Wein keyframe animation) as learned transition generation, noting a state-of-the-art motion-prediction model does not trivially convert into a robust in-betweener. Builds on: Interactive Skeleton Techniques for Enhancing Motion Dynamics in Key Frame Animation
- Correctness
- The paper first shows that naive future-keyframe conditioning is insufficient, then addresses the gap with the two embedding modifiers. The method is demonstrated qualitatively, including through a MotionBuilder plugin, so robustness is shown in practice while generalization stays tied to the training motion distribution.
- Clarity
- Accessible narrative; the two embedding modifiers are the crux and merit a second pass.
- How to read it
- First pass for why plain conditioning fails and what the two embeddings fix; second pass on the time-to-arrival and scheduled-noise mechanisms, which are the transferable ideas.
Motion Synthesis
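The two additive modifiers described above can be sketched as follows. This is a simplified illustration, not the authors' implementation: the sinusoidal form of the embedding, the schedule endpoints (5 and 30 frames), the noise scale, and all function names are assumptions, and the sketch adds both modifiers to a single latent vector rather than to specific encoder outputs.

```python
import numpy as np

def time_to_arrival_embedding(tta, dim, basis=10000.0, tta_max=None):
    """Sinusoidal embedding of the frames remaining before the target keyframe.

    dim is assumed to be even. Clamping at tta_max is optional.
    """
    if tta_max is not None:
        tta = min(tta, tta_max)
    i = np.arange(0, dim, 2)
    freqs = 1.0 / basis ** (i / dim)
    emb = np.zeros(dim)
    emb[0::2] = np.sin(tta * freqs)
    emb[1::2] = np.cos(tta * freqs)
    return emb

def target_noise_scale(tta, start=5, end=30):
    """Linear schedule: no noise near the keyframe, full noise far from it."""
    return float(np.clip((tta - start) / (end - start), 0.0, 1.0))

def modify_latent(h, tta, target_noise, tta_max=None):
    """Add both modifiers to a latent vector h at the current timestep."""
    z_tta = time_to_arrival_embedding(tta, h.shape[0], tta_max=tta_max)
    return h + z_tta + target_noise_scale(tta) * target_noise

# Hypothetical usage: one noise vector per transition, reused at every timestep
rng = np.random.default_rng(0)
h = np.zeros(8)
noise = rng.normal(0.0, 0.5, size=8)
near = modify_latent(h, tta=2, target_noise=noise)   # noise switched off
far = modify_latent(h, tta=40, target_noise=noise)   # full noise
```

Because the embedding depends only on the frames remaining, one model can serve transitions of different lengths. The schedule lets sampled noise vary the early part of a transition while it fades out as the keyframe approaches.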
-
Suite of geometric tools for sculpt generation and reuse in character rigs: sculpt transfer, surface reconstruction, and relaxation, showcased on Onward and Soul.
abstract
Pose-space sculpting is a key component in character rigging workflows used by digital artists to create shape corrections that fire on top of deformation rigs. However, hand-crafting sculpts one pose at a time is notoriously laborious, involving multiple cleanup passes as well as repetitive manual edits. In this work, we present a suite of geometric tools that have significantly sped up the generation and reuse of production-quality sculpts in character rigs at Pixar. These tools include a transfer technique that refits sculpts from one model to another, a surface reconstruction method that resolves entangled regions, and a relaxation scheme that restores surface details. Importantly, our approach allows riggers to focus their time on making creative sculpt edits to meet stylistic goals, thus enabling quicker turnarounds and larger design changes with a reduced impact on production. We showcase the results generated by our tools with examples from Pixar’s feature films Onward and Soul.
Related Animation Setup Transfer for 3D Characters · Wires: A Geometric Deformation Technique · SkinMixer: Blending 3D Animated Models · Avatar Reshaping and Automatic Rigging Using a Deformable Model
how to read this
- Category
- Method: geometric tools for pose-space sculpting in rigs (production-driven)
- Contributions
-
- A transfer technique that refits sculpts from one model to another
- A surface reconstruction method that resolves entangled regions
- A relaxation scheme that restores surface details, reducing manual sculpt cleanup
- Context
- Extends de Goes' patch-based surface relaxation into a sculpt-processing suite for pose-space corrective shapes in character rigging at Pixar. Builds on: Patch-based Surface Relaxation
- Correctness
- Showcased on production characters from Onward and Soul; results are demonstrated as production-quality and labor-saving rather than benchmarked against alternatives, so the evidence is practical and example-driven, with effectiveness tied to the rigging workflows it targets.
- Clarity
- Clear and well-motivated by the artist workflow; the geometric methods reward a second pass.
- How to read it
- First pass for the three tools and the workflow pain they remove; second pass on the transfer and reconstruction math if you build rigging pipelines.
Rigging / Skinning
-
Constitutive model for active elasticity that drives simulated flesh and muscle toward target shapes with production-friendly artist controls.
abstract
The recent “Phace” facial modeling and animation framework [Ichim et al. 2017] introduced a specific formulation of an elastic energy potential that induces mesh elements to approach certain prescribed shapes, modulo rotations. This target shape is defined for each element as an input parameter, and is a multi-dimensional analogue of activation parameters in fiber-based anisotropic muscle models. We argue that the constitutive law suggested by this energy formulation warrants consideration as a highly versatile and practical model of active elastic materials, and could rightfully be regarded as a “baseline” parametric description of active elasticity, in the same fashion that corotational elasticity has largely established itself as the prototypical rotation-invariant model of isotropic elasticity. We present a formulation of this constitutive model in the spirit and style of Finite Element Methods for continuum mechanics, complete with closed form expressions for strain tensors and exact force derivatives for use in implicit and quasistatic schemes. We demonstrate the versatility of the model through various examples in which active elements are employed.
Related Computational Bodybuilding: Anatomically-Based Modeling of Human Bodies · Art-Directed Muscle Simulation for High-End Facial Animation · Steklov-Poincare Skinning · EMU: Efficient Muscle Simulation in Deformation Space
how to read this
- Category
- Method: an active-elasticity constitutive model
- Contributions
-
- Promotes the Phace-style shape-targeting energy into a general, baseline model of active elasticity
- A continuum-mechanics FEM formulation with closed-form strain tensors and exact force derivatives for implicit and quasistatic schemes
- Per-element target shapes act as a multi-dimensional analogue of muscle activation parameters
- Context
- Generalizes the energy potential from Ichim et al.'s Phace facial framework and positions it, against the fibre-based muscle tradition of Teran et al., as an active-elasticity counterpart to corotational isotropic elasticity. Builds on: Creating and Simulating Skeletal Muscle from the Visible Human Data Set
- Correctness
- The argument is that this constitutive law deserves baseline status for active materials; backed by closed-form derivatives and varied active-element examples, it is demonstrated for versatility rather than exhaustively benchmarked, so the practical reach across material regimes is the thing to watch.
- Clarity
- Mathematically dense; the motivation reads on a first pass, the constitutive derivation does not.
- How to read it
- First pass for the analogy to corotational elasticity and what shape targeting buys you; second and likely third pass on the strain tensors and force derivatives if you implement the model.
Muscles
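As a rough illustration of "approach a prescribed shape, modulo rotations", the sketch below evaluates one plausible form of such an energy for a single element: the squared distance between the deformation gradient and the best-fitting rotated copy of the target shape. The exact expression, its scaling (`mu`), and the function names are assumptions for illustration; consult the paper for the actual formulation and derivatives.

```python
import numpy as np

def closest_rotation(M):
    """Rotation factor of the polar decomposition of a 3x3 matrix, via SVD."""
    U, _, Vt = np.linalg.svd(M)
    # Flip the last singular direction if needed so that det(R) = +1
    s = 1.0 if np.linalg.det(U @ Vt) >= 0 else -1.0
    return U @ np.diag([1.0, 1.0, s]) @ Vt

def shape_targeting_energy(F, target, mu=1.0):
    """min over rotations R of mu * ||F - R @ target||_F^2 (assumed form).

    F:      (3, 3) deformation gradient of one element
    target: (3, 3) prescribed target shape for that element
    """
    R = closest_rotation(F @ target.T)   # maximizes trace(R.T @ F @ target.T)
    return mu * np.sum((F - R @ target) ** 2)

# Hypothetical target: stretch 1.5x along x, compress to 0.8x along y
target = np.diag([1.5, 0.8, 1.0])
c, s = np.cos(np.pi / 6), np.sin(np.pi / 6)
Q = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

energy_rotated = shape_targeting_energy(Q @ target, target)  # target reached, only rotated
energy_rest = shape_targeting_energy(np.eye(3), target)      # element still undeformed
```

A rotated copy of the target costs nothing, while an undeformed element is penalized. This parallels how corotational elasticity measures deviation from the nearest rotation, with the identity replaced by a per-element target.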
- Single-Shot High-Quality Facial Geometry and Skin Appearance Capture SIGGRAPH Disney Research 76 cites
Single-shot facial capture that adds cross- and parallel-polarized cameras to a passive photogrammetry rig, recovering geometry and skin appearance maps simultaneously without active illumination.
abstract
We propose a new light-weight face capture system capable of reconstructing both high-quality geometry and detailed appearance maps from a single exposure. Unlike currently employed appearance acquisition systems, the proposed technology does not require active illumination and hence can readily be integrated with passive photogrammetry solutions. These solutions are in widespread use for 3D scanning humans as they can be assembled from off-the-shelf hardware components, but lack the capability of estimating appearance. This paper proposes a solution to overcome this limitation, by adding appearance capture to photogrammetry systems. The only additional hardware requirement to these solutions is that a subset of the cameras are cross-polarized with respect to the illumination, and the remaining cameras are parallel-polarized. The proposed algorithm leverages the images with the two different polarization states to reconstruct the geometry and to recover appearance properties. We do so by means of an inverse rendering framework, which solves per texel diffuse albedo, specular intensity, and high-resolution normals, as well as global specular roughness considering the subsurface scattering nature of skin.
Related Creating an Actor-Specific Facial Rig from Performance Capture · Acquiring the Reflectance Field of a Human Face · FaceScape: A Large-Scale High Quality 3D Face Dataset and Detailed Riggable 3D Face Prediction · The Digital Emily Project: Achieving a Photorealistic Digital Actor
how to read this
- Category
- Method / capture system: single-shot facial geometry and appearance acquisition
- Contributions
-
- A light-weight face capture system that reconstructs both high-quality geometry and detailed appearance maps from a single exposure, without active illumination
- Adds appearance capture to passive photogrammetry rigs by cross-polarizing a subset of cameras and parallel-polarizing the rest
- An inverse-rendering framework solving per-texel diffuse albedo, specular intensity, high-resolution normals, and a global specular roughness
- Context
- Extends single-shot passive photogrammetry capture (in the lineage of Beeler et al.'s High-Quality Single-Shot Capture of Facial Geometry) by folding appearance estimation into the same off-the-shelf, passive rig. Builds on: High-Quality Single-Shot Capture of Facial Geometry
- Correctness
- Demonstrated as an inverse-rendering recovery that depends on the cross/parallel-polarization split and an assumed reflectance model (diffuse, specular intensity, global roughness, subsurface scattering); a reader should note that a single global roughness and the polarization assumptions bound how much spatial appearance variation can be recovered.
- Clarity
- Accessible at the system level; a first pass conveys the polarization-plus-stereo idea, a second pass is needed for the inverse-rendering formulation.
- How to read it
- First pass for the hardware setup (who is cross- vs parallel-polarized) and which maps come out; do a second pass on the inverse-rendering objective if you intend to reproduce the appearance solve.
Facial
-
Skeleton-aware graph network for motion retargeting that adapts motion to diverse skeleton proportions while preserving stylistic details.
abstract
We introduce a novel deep learning framework for data-driven motion retargeting between skeletons, which may have different structure, yet corresponding to homeomorphic graphs. Importantly, our approach learns how to retarget without requiring any explicit pairing between the motions in the training set. We leverage the fact that different homeomorphic skeletons may be reduced to a common primal skeleton by a sequence of edge merging operations, which we refer to as skeletal pooling. Thus, our main technical contribution is the introduction of novel differentiable convolution, pooling, and unpooling operators. These operators are skeleton-aware, meaning that they explicitly account for the skeleton's hierarchical structure and joint adjacency, and together they serve to transform the original motion into a collection of deep temporal features associated with the joints of the primal skeleton. In other words, our operators form the building blocks of a new deep motion processing framework that embeds the motion into a common latent space, shared by a collection of homeomorphic skeletons. Thus, retargeting can be achieved simply by encoding to, and decoding from this latent space. Our experiments show the effectiveness of our framework for motion retargeting, as well as motion processing in general, compared to existing approaches.
Related Motion Retargetting based on Dilated Convolutions and Skeleton-Specific Loss Functions · Learning Character-Agnostic Motion for Motion Retargeting in 2D · Contact-Aware Retargeting of Skinned Motion · Adult2Child: Motion Style Transfer Using CycleGANs
how to read this
- Category
- Method: deep motion retargeting via skeleton-aware networks
- Contributions
-
- A deep framework that retargets motion between skeletons of different structure (homeomorphic graphs) without requiring paired training motions
- Novel skeleton-aware differentiable convolution, pooling, and unpooling operators that respect joint hierarchy and adjacency
- A shared latent space reached by skeletal pooling to a common primal skeleton, so retargeting becomes encode-then-decode
- Context
- Builds on the authors' earlier character-agnostic 2D motion retargeting work, moving it into a graph-based, skeleton-aware deep operator framework for 3D. Builds on: Learning Character-Agnostic Motion for Motion Retargeting in 2D
- Correctness
- Trained without explicit motion pairing and relying on the assumption that target skeletons reduce to a common primal skeleton via edge merging; a reader should keep in mind this homeomorphism requirement limits applicability to topologically compatible skeletons.
- Clarity
- Moderately technical; a first pass conveys the pooling-to-primal-skeleton idea, a second pass is needed to understand the operator definitions.
- How to read it
- First pass on the skeletal pooling concept and the encode/decode latent-space picture; second pass on the convolution/pooling/unpooling operators if you plan to implement or extend them.
Retargeting / ML Deformation
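The skeletal pooling idea, merging edges so that different skeletons shrink toward a shared structure, can be sketched on a toy representation. Here a skeleton is given as chains of edge indices (runs of edges joined through degree-2 joints), and consecutive edge pairs are merged by averaging their features. The chain representation, the averaging rule, and the function name are assumptions for illustration, not the paper's operator definitions.

```python
import numpy as np

def skeletal_pool(chains, features):
    """Merge consecutive edge pairs along each chain by averaging features.

    chains:   list of lists of edge indices; each list is one run of edges
              joined through degree-2 joints (for example, one limb)
    features: (E, C) array with one feature row per edge
    Returns the pooled chains and the pooled (E', C) feature array.
    """
    pooled_chains, pooled_rows = [], []
    for chain in chains:
        new_chain = []
        for k in range(0, len(chain), 2):
            group = chain[k:k + 2]                 # a pair, or a lone trailing edge
            pooled_rows.append(features[group].mean(axis=0))
            new_chain.append(len(pooled_rows) - 1)
        pooled_chains.append(new_chain)
    return pooled_chains, np.stack(pooled_rows)

# Hypothetical skeleton: two limbs of three edges each, one scalar feature per edge
chains = [[0, 1, 2], [3, 4, 5]]
features = np.arange(6, dtype=float).reshape(6, 1)
pooled_chains, pooled = skeletal_pool(chains, features)
```

Repeating the merge shortens every chain, so skeletons with different numbers of joints along a limb can end up with the same edge layout, which is the property the shared latent space relies on.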
- SoftSMPL: Data-driven Modeling of Nonlinear Soft-tissue Dynamics for Parametric Humans CGF Academic 52 cites
Recurrent network regresses real-time soft-tissue dynamics as a function of body shape and motion encoded in a nonlinear deformation subspace.
abstract
We present SoftSMPL, a learning-based method to model realistic soft-tissue dynamics as a function of body shape and motion. Datasets to learn such a task are scarce and expensive to generate, which makes training models prone to overfitting. At the core of our method there are three key contributions that enable us to model highly realistic dynamics and better generalization capabilities than state-of-the-art methods, while training on the same data. First, a novel motion descriptor that disentangles the standard pose representation by removing subject-specific features; second, a neural-network-based recurrent regressor that generalizes to unseen shapes and motions; and third, a highly efficient nonlinear deformation subspace capable of representing soft-tissue deformations of arbitrary shapes. We demonstrate qualitative and quantitative improvements over existing methods and, additionally, we show the robustness of our method on a variety of motion capture databases.
Related SNARF: Differentiable Forward Skinning for Animating Non-Rigid Neural Implicit Shapes · SNUG: Self-Supervised Neural Dynamic Garments · Animatable Neural Radiance Fields for Modeling Dynamic Human Bodies · Learning Skeletal Articulations with Neural Blend Shapes
how to read this
- Category
- Method: learning-based soft-tissue dynamics for parametric human bodies
- Contributions
-
- A motion descriptor that disentangles the standard pose representation by removing subject-specific features
- A recurrent neural regressor that generalizes soft-tissue dynamics to unseen body shapes and motions
- An efficient nonlinear deformation subspace that represents soft-tissue deformation for arbitrary shapes
- Context
- Adds learned, motion-dependent soft-tissue dynamics on top of the SMPL parametric body model (Loper et al.), targeting realism beyond static pose-corrective deformation. Builds on: SMPL: A Skinned Multi-Person Linear Model
- Correctness
- Addresses the scarcity and cost of dynamics training data, with the disentangled descriptor and nonlinear subspace aimed at avoiding overfitting; a reader should keep in mind that generalization claims rest on the chosen mocap databases and on how well the subspace captures unseen shapes.
- Clarity
- Reasonably accessible if you already know SMPL; a first pass conveys the three-part design, a second pass clarifies the recurrent regressor and subspace.
- How to read it
- First pass on the three contributions and why each fights overfitting; second pass on the motion descriptor disentanglement and the recurrent dynamics regressor if you work with body dynamics.
Skinning / ML Deformation
-
STAR learns spatially local pose-corrective blend shapes with sparse joint influence, reducing SMPL parameters by 80% while improving deformation realism.
abstract
The SMPL body model is widely used for the estimation, synthesis, and analysis of 3D human pose and shape. While popular, we show that SMPL has several limitations and introduce STAR, which is quantitatively and qualitatively superior to SMPL. First, SMPL has a huge number of parameters resulting from its use of global blend shapes. These dense pose-corrective offsets relate every vertex on the mesh to all the joints in the kinematic tree, capturing spurious long-range correlations. To address this, we define per-joint pose correctives and learn the subset of mesh vertices that are influenced by each joint movement. This sparse formulation results in more realistic deformations and significantly reduces the number of model parameters to 20% of SMPL. When trained on the same data as SMPL, STAR generalizes better despite having many fewer parameters. Second, SMPL factors pose-dependent deformations from body shape while, in reality, people with different shapes deform differently. Consequently, we learn shape-dependent pose-corrective blend shapes that depend on both body pose and BMI. Third, we show that the shape space of SMPL is not rich enough to capture the variation in the human population. We address this by training STAR with an additional 10,000 scans of male and female subjects, and show that this results in better model generalization.
Related SUPR: A Sparse Unified Part-Based Human Representation · SCANimate: Weakly Supervised Learning of Skinned Clothed Avatar Networks · Dyna: A Model of Dynamic Human Shape in Motion · Animatable Neural Radiance Fields for Modeling Dynamic Human Bodies
how to read this
- Category
- Method: a sparse, articulated parametric human body model
- Contributions
-
- STAR, a body model with per-joint pose-corrective blend shapes and learned sparse joint-to-vertex influence, replacing SMPL's dense global blend shapes and cutting parameters substantially
- Shape-dependent pose-corrective blend shapes that depend on both pose and BMI, since differently shaped people deform differently
- A richer shape space than SMPL to better capture human body variation
- Context
- A direct re-formulation and critique of SMPL (Loper et al.), keeping the skinned-linear paradigm but localizing the corrective deformations. Builds on: SMPL: A Skinned Multi-Person Linear Model
- Correctness
- Reports being quantitatively and qualitatively superior to SMPL when trained on the same data, attributing it to removing spurious long-range vertex-to-joint correlations; a reader should note that the improvements are measured relative to SMPL on shared training data, so they indicate a better model than SMPL rather than an absolute level of accuracy.
- Clarity
- Accessible, especially with SMPL background; a first pass conveys the sparsity argument, a second pass covers the corrective-blendshape formulation.
- How to read it
- First pass on the three SMPL limitations and how STAR addresses each; second pass on the sparse per-joint correctives and the pose-plus-BMI shape dependence if you build on parametric bodies.
Skinning / ML Deformation
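The sparsity argument can be made concrete with a toy version of per-joint correctives restricted by per-vertex influence masks. The sizes, the hard 0/1 masks, and the function names below are hypothetical and chosen only to show the mechanism; STAR learns its influence weights and uses its own pose features.

```python
import numpy as np

def sparse_pose_correctives(pose_feats, bases, masks):
    """Sum per-joint corrective offsets, each limited to that joint's vertices.

    pose_feats: list of J arrays, pose_feats[j] has shape (P,)
    bases:      list of J arrays, bases[j] has shape (V, 3, P)
    masks:      (J, V) non-negative per-vertex influence of each joint
    """
    num_verts = masks.shape[1]
    offsets = np.zeros((num_verts, 3))
    for q, B, m in zip(pose_feats, bases, masks):
        offsets += m[:, None] * (B @ q)   # (V, 3) corrective, masked per vertex
    return offsets

def effective_parameters(masks, feats_per_joint):
    """Count basis entries that can still move a vertex (mask > 0)."""
    return int((masks > 0).sum()) * 3 * feats_per_joint

# Hypothetical sizes: 4 joints, 10 vertices, 2 pose features per joint
J, V, P = 4, 10, 2
rng = np.random.default_rng(0)
pose_feats = [rng.normal(size=P) for _ in range(J)]
bases = [rng.normal(size=(V, 3, P)) for _ in range(J)]
masks = np.zeros((J, V))
for j in range(J):
    masks[j, 2 * j:2 * j + 3] = 1.0      # each joint touches 3 nearby vertices

offsets = sparse_pose_correctives(pose_feats, bases, masks)
dense_count = J * V * 3 * P
sparse_count = effective_parameters(masks, P)
```

In this toy setup the dense formulation carries 240 basis entries while only 72 can affect the mesh once the masks are applied, and a vertex no joint claims receives no corrective at all. That is the mechanism by which localizing correctives removes long-range vertex-to-joint coupling.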
- talk Technical Artist Summit: Freeform Animation Rigging: Evolving the Animation Pipeline GDC Industrial
Unity presented Freeform Animation, a non-destructive technique to preserve motion content while restructuring control rigs, reducing counter-animation work and bottlenecks in standard rigging workflows.
Rigging
-
Single physics-based neural controller masters thousands of motions in different styles by training on large-scale mocap, showing strong transfer to unseen motions and characters.
abstract
The field of physics-based animation is gaining importance due to the increasing demand for realism in video games and films, and has recently seen wide adoption of data-driven techniques, such as deep reinforcement learning (RL), which learn control from (human) demonstrations. While RL has shown impressive results at reproducing individual motions and interactive locomotion, existing methods are limited in their ability to generalize to new motions and their ability to compose a complex motion sequence interactively. In this paper, we propose a physics-based universal neural controller (UniCon) that learns to master thousands of motions with different styles by learning on large-scale motion datasets. UniCon is a two-level framework that consists of a high-level motion scheduler and an RL-powered low-level motion executor, which is our key innovation. By systematically analyzing existing multi-motion RL frameworks, we introduce a novel objective function and training techniques which make a significant leap in performance. Once trained, our motion executor can be combined with different high-level schedulers without the need for retraining, enabling a variety of real-time interactive applications.
Related QuestSim: Human Motion Tracking from Sparse Sensors with Simulated Avatars · DReCon: Data-Driven Responsive Control of Physics-Based Characters · ReGAIL: Toward Agile Character Control From a Single Reference Motion · AMP: Adversarial Motion Priors for Stylized Physics-Based Character Control
how to read this
- Category
- Method: physics-based universal neural controller for character motion
- Contributions
-
- UniCon, a single physics-based controller that learns to master thousands of motions of different styles by training on large-scale motion datasets
- A two-level design pairing a high-level motion scheduler with an RL-powered low-level motion executor
- A new objective function and training techniques that improve over existing multi-motion RL frameworks, with the executor reusable across schedulers without retraining
- Context
- Builds on example-guided deep RL for physics-based skills (in the lineage of DeepMimic by Peng et al.), scaling from individual motions toward a universal multi-motion controller. Builds on: DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills
- Correctness
- Claims strong transfer to unseen motions and characters and interactive composition via swappable schedulers; a reader should keep in mind that physics-based RL results depend heavily on the simulator, reward design, and the breadth of the training mocap, so generalization is empirical.
- Clarity
- Conceptually accessible at the scheduler/executor level; a first pass conveys the two-level idea, deeper passes are needed for the objective and training details.
- How to read it
- First pass on the scheduler/executor split and the reusability claim; second pass on the new objective and training techniques if you care about why it scales past single-motion RL.
Motion Synthesis
-
Encodes motion into disentangled content and style latent codes, applying temporally invariant AdaIN to transfer style extracted directly from RGB video onto 3D animation.
abstract
Transferring the motion style from one animation clip to another, while preserving the motion content of the latter, has been a long-standing problem in character animation. Most existing data-driven approaches are supervised and rely on paired data, where motions with the same content are performed in different styles. In addition, these approaches are limited to transfer of styles that were seen during training. In this paper, we present a novel data-driven framework for motion style transfer, which learns from an unpaired collection of motions with style labels, and enables transferring motion styles not observed during training. Furthermore, our framework is able to extract motion styles directly from videos, bypassing 3D reconstruction, and apply them to the 3D input motion. Our style transfer network encodes motions into two latent codes, for content and for style, each of which plays a different role in the decoding (synthesis) process. While the content code is decoded into the output motion by several temporal convolutional layers, the style code modifies deep features via temporally invariant adaptive instance normalization (AdaIN). Moreover, while the content code is encoded from 3D joint rotations, we learn a common embedding for style from either 3D or 2D joint positions, enabling style extraction from videos.
Related Adult2Child: Motion Style Transfer Using CycleGANs · How to Train Your Dog: Neural Enhancement of Quadruped Animations · Robust Motion In-Betweening · Sketch-based Motion Editing for Articulated Characters
how to read this
- Category
- Method: unpaired motion style transfer from video to 3D animation
- Contributions
-
- A data-driven framework that learns motion style transfer from an unpaired collection of style-labeled motions, with no paired same-content data required
- The ability to extract motion style directly from RGB video, bypassing 3D reconstruction, and apply it to 3D input motion
- An encoder that splits motion into content and style codes, decoding content via temporal convolutions while style modulates features through temporally invariant AdaIN; the same mechanism applies to styles unseen in training
- Context
- Builds on the authors' skeleton-aware retargeting work and adapts the content/style + AdaIN disentanglement idea from image style transfer to character motion. Builds on: Skeleton-Aware Networks for Deep Motion Retargeting
- Correctness
- Demonstrated as unpaired, label-driven transfer that can generalize to unseen styles and to styles read from video; a reader should keep in mind that style quality from video depends on the underlying motion extraction and that disentanglement is learned rather than guaranteed.
- Clarity
- Accessible if you know AdaIN-style transfer; a first pass conveys the content/style split, a second pass covers the temporally invariant AdaIN mechanism.
- How to read it
- First pass on what 'content' vs 'style' mean here and the video-to-animation pathway; second pass on the temporally invariant AdaIN and unpaired training setup if you intend to reproduce it.
Retargeting / Motion Synthesis
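The AdaIN step can be sketched in a few lines: content features are normalized per channel over the time axis, then rescaled and shifted by parameters derived from the style code that stay constant over time. This is a generic adaptive instance normalization sketch, not the authors' network; the function name, shapes, and how `gamma` and `beta` are obtained are assumptions.

```python
import numpy as np

def temporal_adain(content_feats, gamma, beta, eps=1e-5):
    """Instance-normalize over time, then apply style-derived scale and shift.

    content_feats: (C, T) deep features of the content motion
    gamma, beta:   (C,) per-channel scale and shift predicted from the style
                   code; they do not vary with time, which is what makes the
                   operation temporally invariant
    """
    mean = content_feats.mean(axis=1, keepdims=True)
    std = content_feats.std(axis=1, keepdims=True)
    normalized = (content_feats - mean) / (std + eps)
    return gamma[:, None] * normalized + beta[:, None]

# Hypothetical features: 4 channels over 60 frames
rng = np.random.default_rng(0)
feats = rng.normal(loc=3.0, scale=2.0, size=(4, 60))
gamma = np.array([0.5, 1.0, 1.5, 2.0])
beta = np.array([-1.0, 0.0, 1.0, 2.0])
styled = temporal_adain(feats, gamma, beta)
```

After the operation each channel's mean and spread over time are set by the style parameters, while the frame-to-frame pattern of the content features is preserved up to that rescaling.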
-
A 3D convolutional neural network places skeleton joints in bipedal characters from mesh volume alone, automating a step most auto-rigging tools leave manual.
abstract
Character rigging for 3D media production still depends on a manual, time-consuming skeleton setup step, even though many existing tools focus on building the control rig on top of that skeleton. This thesis proposes using a 3D convolutional neural network to automatically place joints in bipedal characters based on mesh volume. While prior work has explored automating character deformation, far less attention has gone to better automatic joint placement. The study demonstrates that 3D-CNNs can effectively handle joint placement for bipedal characters with standardized joint configurations, and identifies refinement of the method and expanded training data as the main avenues for improvement.
Related Stable and Efficient Differential IK · A.C.M.E. Multilimb System · Wires: A Geometric Deformation Technique · LibEE: A Multithreaded Dependency Graph for Character Animation
how to read this
- Category
- Master of Science thesis (Drexel University, 2020) on machine-learning auto-rigging: automatic skeleton joint placement for bipedal characters.
- Contributions
-
- Frames automatic joint placement as a learnable task distinct from the better-studied problem of automating skin deformation.
- Trains a 3D convolutional neural network that consumes a voxelized mesh volume and predicts joint positions for a standardized bipedal skeleton.
- Shows a 3D-CNN can produce usable joint placements for bipeds, and reports where it falls short so the approach can be extended.
- Context
- Part of the 2020 wave that pushed deep learning into the rigging pipeline. It sits alongside the contemporaneous RigNet, which predicts both skeleton topology and skinning weights from a mesh; this thesis narrows the scope to joint placement only and to bipeds with a fixed joint layout. The motivation echoes the long line of automatic-rigging work descending from Pinocchio-style automatic rigging: the control rig has many tools, but the underlying skeleton is still placed by hand.
- Correctness
- Findings rest on a single network trained on a limited dataset of bipedal characters with one standardized joint configuration, so the claims are scoped to that setting rather than general articulated shapes. The author is explicit that results are promising but preliminary, and that accuracy depends heavily on a larger and more varied training set. Treat it as a feasibility study, not a production-ready method.
- Clarity
- A readable graduate thesis: it spends real space motivating why joint placement is the neglected half of auto-rigging, then walks through the volumetric representation and network. Longer and more tutorial in tone than a conference paper, which helps if the 3D-CNN framing is new to you.
- How to read it
- First pass: read the abstract and the problem-motivation section to see the manual-skeleton-setup gap it targets, then skip to the results figures to judge placement quality. Second pass: study how the mesh is voxelized into the network input and how the standardized joint set is defined, and read the limitations and future-work section, then compare its scope against RigNet to see what a full topology-plus-weights system adds.
Rigging
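For readers new to the volumetric framing, the sketch below shows one simple way a mesh could be turned into a grid for a 3D-CNN and how a grid cell maps back to model space. It marks surface occupancy from sampled points only; the thesis's actual voxelization, resolution, and network input are not specified in this entry, so the function names and choices here are assumptions.

```python
import numpy as np

def voxelize_points(points, resolution=32):
    """Turn surface sample points into a binary occupancy grid.

    points: (N, 3) points sampled on the mesh surface
    Returns a (resolution, resolution, resolution) float32 grid plus the
    origin and voxel size needed to map predictions back to model space.
    """
    lo = points.min(axis=0)
    extent = (points.max(axis=0) - lo).max()
    voxel = extent / resolution if extent > 0 else 1.0
    idx = np.floor((points - lo) / voxel).astype(int)
    idx = np.clip(idx, 0, resolution - 1)   # points on the far faces land in the last cell
    grid = np.zeros((resolution,) * 3, dtype=np.float32)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0
    return grid, lo, voxel

def voxel_to_world(index, origin, voxel):
    """Centre of a voxel in model space, e.g. to read back a predicted joint."""
    return origin + (np.asarray(index) + 0.5) * voxel

# Hypothetical input: two points at opposite corners of a unit cube
pts = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
grid, origin, voxel = voxelize_points(pts, resolution=4)
centre = voxel_to_world((0, 0, 0), origin, voxel)
```

Using a single voxel size for all three axes keeps the character's proportions intact in the grid, which matters when a network is expected to infer joint positions from shape.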
2019
54
-
Insomniac Games described integrating approximately 150 vendor-supplied facial rigs for Spider-Man, with streamlined tooling to minimize overhead on character-specific revisions and delivery cycles.
Facial / Rigging
-
Comprehensive SIGGRAPH 2019 course on USD composition, authoring, and UsdSkel skinning schema, including pipeline case studies from multiple studios.
Rigging / Skinning
-
Ubisoft La Forge presented neural networks for automatic mocap cleanup, end-to-end facial animation from video, and a machine learning system generating facial animation directly from raw audio.
Facial / ML Deformation
-
Described Weta's FACS-plus-deep-learning performance capture pipeline used to translate Rosa Salazar's performance onto the hyper-stylised Alita character.
Facial / Retargeting
-
AMASS unifies 15 optical mocap datasets into a single archive of 40+ hours, 300+ subjects, using MoSh++ to fit SMPL meshes to marker data.
Abstract
Large datasets are the cornerstone of recent advances in computer vision using deep learning. In contrast, existing human motion capture (mocap) datasets are small and the motions limited, hampering progress on learning models of human motion. While there are many different datasets available, they each use a different parameterization of the body, making it difficult to integrate them into a single meta dataset. To address this, we introduce AMASS, a large and varied database of human motion that unifies 15 different optical marker-based mocap datasets by representing them within a common framework and parameterization. We achieve this using a new method, MoSh++, that converts mocap data into realistic 3D human meshes represented by a rigged body model. Here we use SMPL [Loper et al., 2015], which is widely used and provides a standard skeletal representation as well as a fully rigged surface mesh. The method works for arbitrary marker sets, while recovering soft-tissue dynamics and realistic hand motion. We evaluate MoSh++ and tune its hyperparameters using a new dataset of 4D body scans that are jointly recorded with marker-based mocap. The consistent representation of AMASS makes it readily useful for animation, visualization, and generating training data for deep learning.
How to read this
- Category
- Dataset plus a fitting method
- Contributions
-
- Introduces AMASS, unifying 15 optical marker-based mocap datasets into one large database under a common parameterization
- Proposes MoSh++, a method that converts arbitrary marker-set mocap into rigged SMPL meshes while recovering soft-tissue and hand motion
- Evaluates and tunes MoSh++ using a new dataset of 4D body scans jointly recorded with marker-based mocap
- Context
- Builds on the SMPL skinned body model (Loper et al., 2015), using it as the common skeletal and surface representation to make heterogeneous mocap datasets interoperable. Builds on: SMPL: A Skinned Multi-Person Linear Model
- Correctness
- The unification assumes SMPL can faithfully represent motions parameterized differently across source datasets; MoSh++ is validated against jointly captured 4D scans, but recovered soft-tissue and hand detail is a fit, not ground truth, so fidelity varies by marker set.
- Clarity
- Clearly motivated and accessible; a first pass conveys the dataset value, with a second pass needed for the MoSh++ formulation and hyperparameter tuning.
- How to read it
- If you want the data, a first pass suffices; if you intend to fit your own mocap, do a second pass on MoSh++ and the 4D-scan evaluation to judge marker-set generality.
Retargeting / Motion Synthesis
-
Santa Monica Studio detailed how they rebuilt Kratos's animation pipeline to convey humanity and narrative depth, transforming the franchise's approach to character performance.
Motion Synthesis / Rigging
-
Learns heterogeneous stiffness and prestrain for an MRI-based volumetric face model by solving an inverse physics problem on 3D scans captured under varying gravity directions and expressions.
Abstract
The human face is an anatomical system whose heterogeneous and anisotropic mechanical behavior produces complex deformations even in neutral expressions under external forces such as gravity. This work builds a volumetric model from magnetic resonance images of a neutral face and registers 3D scans captured under varying gravity directions and expressions, then solves an inverse physics problem that learns heterogeneous stiffness and prestrain from the training scans. The resulting physics-based model generalizes to new 3D scans and predicts facial deformations more accurately than prior physics-based techniques.
Related Fully Automatic Generation of Anatomical Face Simulation Models · Automatic Determination of Facial Muscle Activations from Sparse Motion Capture Marker Data · Phace: Physics-based Face Modeling and Animation · Volume Preserving Simulation of Soft Tissue with Skin
How to read this
- Category
- Method: data-driven physics-based face modeling
- Contributions
-
- Builds a volumetric face model from MRI of a neutral face and registers 3D scans captured under varying gravity directions and expressions
- Solves an inverse physics problem to learn heterogeneous stiffness and prestrain from the training scans
- Demonstrates a model that generalizes to new 3D scans and predicts deformations more accurately than prior physics-based techniques
- Context
- Extends physics-based face modeling such as Phace (Ichim et al., 2017) and the authors' prior personalized anatomical body models (Kadlecek and Kavan, 2016), shifting from hand-set to data-learned material parameters. Builds on: Phace: Physics-based Face Modeling and Animation · Reconstructing Personalized Anatomical Models for Physics-based Body Animation
- Correctness
- Relies on MRI-derived volumetric geometry and scans under controlled gravity and expression conditions. The accuracy claim is relative to prior physics-based methods, and generalization is shown on new scans that presumably come from the same capture regime, so per-subject capture cost and coverage are limits to keep in mind.
- Clarity
- Technically dense; a first pass conveys the inverse-physics idea, but the stiffness and prestrain estimation needs a careful second pass.
- How to read it
- Read for the inverse-physics formulation: on a second pass focus on how stiffness and prestrain are recovered from gravity-varied scans, and on the validation against prior physics-based baselines.
Facial / Muscles
-
VOCA: speech-driven 3D facial animation system trained on 29 minutes of 4D scans at 60 fps from 12 speakers, generalizing across identities.
Abstract
Audio-driven 3D facial animation has been widely explored, but achieving realistic, human-like performance is still unsolved. This is due to the lack of available 3D datasets, models, and standard evaluation metrics. To address this, we introduce a unique 4D face dataset with about 29 minutes of 4D scans captured at 60 fps and synchronized audio from 12 speakers. We then train a neural network on our dataset that factors identity from facial motion. The learned model, VOCA (Voice Operated Character Animation) takes any speech signal as input, even speech in languages other than English, and realistically animates a wide range of adult faces. Conditioning on subject labels during training allows the model to learn a variety of realistic speaking styles. VOCA also provides animator controls to alter speaking style, identity-dependent facial shape, and pose (i.e. head, jaw, and eyeball rotations) during animation. To our knowledge, VOCA is the only realistic 3D facial animation model that is readily applicable to unseen subjects without retargeting. This makes VOCA suitable for tasks like in-game video, virtual reality avatars, or any scenario in which the speaker, speech, or language is not known in advance. We make the dataset and model available for research purposes at http://voca.is.tue.mpg.de.
Related FaceFormer: Speech-Driven 3D Facial Animation with Transformers · FaceDiffuser: Speech-Driven 3D Facial Animation Synthesis Using Diffusion · MeshTalk: 3D Face Animation from Speech using Cross-Modality Disentanglement · ProbTalk3D: Non-Deterministic Emotion Controllable Speech-Driven 3D Facial Animation Synthesis Using VQ-VAE
How to read this
- Category
- Method plus dataset: speech-driven 3D facial animation
- Contributions
-
- Introduces a 4D face dataset of about 29 minutes of scans at 60 fps with synchronized audio from 12 speakers
- Trains VOCA, a neural model that factors identity from facial motion and animates unseen adult faces from any speech, including non-English
- Provides animator controls over speaking style, identity-dependent shape, and head, jaw, and eyeball pose
- Context
- Builds on audio-driven facial animation (Karras et al., 2017) and the FLAME face model from 4D scans (Li et al., 2017), aiming for a model that applies to unseen subjects without retargeting. Builds on: Audio-Driven Facial Animation by Joint End-to-End Learning of Pose and Emotion · Learning a Model of Facial Shape and Expression from 4D Scans
- Correctness
- Trained on 12 speakers, so identity and style coverage is bounded by that set; the cross-language and unseen-subject generalization is the headline claim, and the field still lacks standard metrics, so realism judgments are partly qualitative.
- Clarity
- Accessible and well-motivated; a first pass conveys the system and controls, with a second pass for the identity-from-motion factorization.
- How to read it
- Read for the dataset and the identity-versus-motion factoring. A first pass conveys the capability; do a second pass if you care about how the speaking-style conditioning and animator controls are built.
Facial / Motion Synthesis
-
Blue Sky Studios describes Conduit, a USD-centric open-source pipeline framework mapping legacy character and shot workflows into USD constructs for Nimona.
Abstract
We present our modern pipeline, Conduit, developed for Blue Sky's upcoming feature film, Nimona. Conduit refers to a set of tools and web services that allow artists to find, track, version and quality control their work. In addition to describing the system and implementation, we will discuss the challenges and opportunities of developing and deploying a pipeline with the intention of open sourcing the resulting toolset. We found that communicating concepts and progress updates both internally and externally throughout the development process ultimately resulted in a more robust solution.
Related USD in Production · Building Scalable and Evolutive USD Pipelines on Distributed Architecture at Ubisoft · A Deep Dive into Universal Scene Description and Hydra · Achieving and Maintaining Real-Time Rigs
How to read this
- Category
- Production talk / pipeline system description
- Contributions
-
- Presents Conduit, a USD-centric set of tools and web services for finding, tracking, versioning, and quality-controlling artist work on the film Nimona
- Describes the system and implementation, including mapping legacy character and shot workflows into USD constructs
- Reflects on the challenges and opportunities of building a pipeline intended to be open-sourced
- Context
- A Blue Sky Studios pipeline built around Pixar's Universal Scene Description, reframing legacy workflows in USD terms with open-source release as a design goal. Builds on: Universal Scene Description: Open Source Release
- Correctness
- Studio practice, not peer-reviewed; results are production-oriented and specific to Blue Sky's workflows, and the open-sourcing reflection is a process lesson rather than a measured outcome.
- Clarity
- Accessible system overview; one pass conveys the architecture and the open-source intent, with detail available for the USD mapping.
- How to read it
- Read for the pipeline architecture and the open-source-from-the-start lessons; one pass suffices unless you are mapping your own legacy workflows into USD.
Rigging
-
Framestore FMX talk on photo-realistic furry creature creation, exploring grooming techniques developed across multiple productions including advertising and episodic work.
Abstract
Ahmed Gharraph of Framestore details the studio's grooming and fur workflow in Houdini across commercials and television creatures including a wildebeest, an Ikea sheep, McDonald's reindeer with a digi-double Santa beard, and a black bear. He explains a guide-based approach that starts with curve flows and hand-planted guides, mirrors one side, then builds clumps within clumps using stacked hairgen nodes whose density dictates clump size, all driven by painted maps and masks. The talk covers procedural dirt setups using an occlusion-shop technique from Matt Estela's CG Wiki, color and melanin driven by point attributes rather than textures, and Framestore's custom HairGen and HairDeformer that interpolate over the surface to cut guide-capture time on the reindeer from twenty hours to about five minutes. He closes with an in-development feather system that piggybacks on the Houdini groom tools and uses Vellum to solve feather-on-feather stacking.
Related Hair, Feathers and Fur | Axis Studios | Character FX & Crowds Production Talks · Framestore Creatures & Houdini | Framestore | Character FX & Crowds Production Talks · Feathers: From Model to Groom to Render | nineteentwenty | Character FX & Crowds Production Talks · Simulating the Perfect Groom for a Bovine Biker | Untold Studios | FMX HIVE 2023
How to read this
- Category
- Production talk / grooming and fur breakdown
- Contributions
-
- Demonstrates a guide-based Houdini fur workflow (curve flows, hand-planted guides, mirroring, stacked hairgen clumps-within-clumps driven by painted maps) across creatures including a wildebeest, an Ikea sheep, McDonald's reindeer, and a black bear
- Shows procedural dirt via an occlusion-shop technique and color and melanin driven by point attributes rather than textures
- Presents Framestore's custom HairGen and HairDeformer (cutting reindeer guide capture from twenty hours to about five minutes) and an in-development Vellum-based feather system
- Context
- A Framestore creature-FX talk continuing the studio's grooming and Houdini practice (related to its London Creatures and Houdini production talks). Builds on: Framestore Creatures & Houdini | Framestore | Character FX & Crowds Production Talks
- Correctness
- Studio practice, not peer-reviewed; results are production-proven on specific commercials and TV creatures, and figures such as the twenty-hours-to-five-minutes speedup are reported gains rather than controlled measurements.
- Clarity
- Hands-on and tool-specific; accessible to Houdini groom artists, with concrete node-level technique that rewards a careful watch.
- How to read it
- Watch for the clumps-within-clumps guide method and the HairGen/HairDeformer interpolation; focus on the attribute-driven color and the in-progress Vellum feather system if grooming is your area.
CFX
-
Folds the iterative delta mush smoothing into a single precomputed direct skinning operator, making it real-time and game-ready.
Abstract
A significant fraction of the world's population have experienced virtual characters through games and movies, and the possibility of online VR social experiences may greatly extend this audience. At present, the skin deformation for interactive and real-time characters is typically computed using geometric skinning methods. These methods are efficient and simple to implement, but obtaining quality results requires considerable manual "rigging" effort involving trial-and-error weight painting, the addition of virtual helper bones, etc. The recently introduced Delta Mush algorithm largely solves this rig authoring problem, but its iterative computational approach has prevented direct adoption in real-time engines. This paper introduces Direct Delta Mush, a new algorithm that simultaneously improves on the efficiency and control of Delta Mush while generalizing previous algorithms. Specifically, we derive a direct rather than iterative algorithm that has the same ballpark computational form as some previous geometric weight blending algorithms. Straightforward variants of the algorithm are then proposed to further optimize computational and storage cost with insignificant quality losses. These variants are equivalent to special cases of several previous skinning algorithms. Our algorithm simultaneously satisfies the goals of reasonable efficiency, quality, and ease of authoring.
Related Delta Mush: Smoothing Deformations While Preserving Detail · Real-Time Skeletal Skinning with Optimized Centers of Rotation · Direct Delta Mush Skinning Compression with Continuous Examples · Elasticity-Inspired Deformers for Character Articulation
How to read this
- Category
- Method: a skinning algorithm
- Contributions
-
- Derives Direct Delta Mush, a non-iterative skinning operator that folds Delta Mush smoothing into a single precomputed form
- Achieves a computational form comparable to prior geometric weight-blending methods, making it suitable for real-time engines
- Proposes variants that further reduce computation and storage cost with insignificant quality loss, and generalizes previous algorithms
- Context
- Builds directly on Delta Mush (Mancewicz et al., 2014), removing its iterative bottleneck so the rig-authoring benefit can be used in interactive and game contexts. Builds on: Delta Mush: Smoothing Deformations While Preserving Detail
- Correctness
- The core claim is that an iterative smoother can be reformulated as a precomputed direct operator at little quality cost; the precompute and per-vertex storage are the practical tradeoffs a reader should weigh against the runtime savings.
- Clarity
- Well-motivated and accessible in intent; a first pass conveys the direct-versus-iterative idea, but the derivation warrants a careful second pass.
- How to read it
- Read the abstract and intuition first, then do a second pass on the derivation that turns iterative Delta Mush into the direct operator, plus the variants if you care about storage and runtime budgets.
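The direct-versus-iterative idea can be previewed with a toy example before tackling the derivation. The sketch below is not the Direct Delta Mush operator, which also folds in skinning weights and bone transforms; it only shows the underlying linear-algebra fact that a fixed number of linear smoothing passes equals one precomputed matrix. The 1D chain, the uniform neighbour averaging, and all function names are assumptions made for illustration.

```python
# Toy illustration only: a 1D chain of points with uniform neighbour averaging.
# This is NOT the Direct Delta Mush operator; every name here is hypothetical.

def smoothing_matrix(n, lam):
    """One smoothing pass: x_i <- (1 - lam) * x_i + lam * mean(neighbours of i)."""
    s = [[0.0] * n for _ in range(n)]
    for i in range(n):
        neighbours = [j for j in (i - 1, i + 1) if 0 <= j < n]
        s[i][i] = 1.0 - lam
        for j in neighbours:
            s[i][j] = lam / len(neighbours)
    return s

def apply_matrix(matrix, values):
    """Multiply a square matrix by a column of scalar values."""
    return [sum(m * v for m, v in zip(row, values)) for row in matrix]

def matmul(a, b):
    """Plain matrix product of two square matrices stored as nested lists."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def precompute_operator(s, passes):
    """Collapse `passes` smoothing passes into a single matrix (the offline step)."""
    n = len(s)
    op = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(passes):
        op = matmul(s, op)
    return op

PASSES = 10
S = smoothing_matrix(5, 0.5)
rest = [0.0, 1.0, 0.0, 1.0, 0.0]

# Iterative route: run every smoothing pass at evaluation time.
iterative = rest
for _ in range(PASSES):
    iterative = apply_matrix(S, iterative)

# Direct route: pay for the passes once, then apply a single matrix.
OPERATOR = precompute_operator(S, PASSES)
direct = apply_matrix(OPERATOR, rest)

max_difference = max(abs(a - b) for a, b in zip(iterative, direct))
```

The precomputed matrix is dense even though each smoothing pass is sparse, which is a small-scale view of the precompute and storage tradeoff noted under Correctness.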
Skinning
-
Physics-based character controller combining motion matching for reference generation with tracking control for responsive, physically plausible animation.
Abstract
Interactive control of self-balancing, physically simulated humanoids is a long standing problem in the field of real-time character animation. While physical simulation guarantees realistic interactions in the virtual world, simulated characters can appear unnatural if they perform unusual movements in order to maintain balance. Therefore, obtaining a high level of responsiveness to user control, runtime performance, and diversity has often been overlooked in exchange for motion quality. Recent work in the field of deep reinforcement learning has shown that training physically simulated characters to follow motion capture clips can yield high quality tracking results. We propose a two-step approach for building responsive simulated character controllers from unstructured motion capture data. First, meaningful features from the data such as movement direction, heading direction, speed, and locomotion style, are interactively specified and drive a kinematic character controller implemented using motion matching. Second, reinforcement learning is used to train a simulated character controller that is general enough to track the entire distribution of motion that can be generated by the kinematic controller. Our design emphasizes responsiveness to user input, visual quality, and low runtime cost for application in video-games.
Related SuperTrack: Motion Tracking for Physically Simulated Characters Using Supervisory Signals · Character Controllers Using Motion VAEs · AMP: Adversarial Motion Priors for Stylized Physics-Based Character Control · Near-Optimal Character Animation with Continuous Control
How to read this
- Category
- Method: data-driven physics-based character control
- Contributions
-
- Proposes a two-step controller that builds responsive simulated characters from unstructured motion capture data
- Uses motion matching as an interactively specified kinematic controller (movement direction, heading, speed, locomotion style) to generate reference motion
- Trains a reinforcement-learning tracking controller general enough to follow the full distribution of kinematically generated motion
- Context
- Combines DeepMimic-style example-guided RL tracking (Peng et al., 2018) with Motion Matching (Clavet, 2016) to balance responsiveness against physical plausibility. Builds on: DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills · Motion Matching and The Road to Next-Gen Animation
- Correctness
- Assumes a kinematic motion-matching layer can produce a trackable reference distribution that the RL policy then follows under simulation; results target self-balancing humanoids and real-time control, so behavior outside the captured motion distribution and tuning effort remain considerations.
- Clarity
- Clearly structured around the two stages; a first pass conveys the pipeline, with a second pass for the RL tracking and reward details.
- How to read it
- Read for the two-stage design: understand how motion matching feeds the RL tracker on a first pass, then do a second pass on the policy training if you intend to reproduce or extend the controller.
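For readers new to the kinematic half of the pipeline, motion matching can be read as a weighted nearest-neighbour search over per-frame feature vectors. The sketch below is a minimal illustration under that reading, not the paper's controller: the feature layout (velocity and heading in the ground plane), the four-frame database, and the uniform weights are all hypothetical.

```python
import math

# Hypothetical feature database, one row per mocap frame.
# Assumed feature layout: [velocity_x, velocity_z, heading_x, heading_z].
DATABASE = [
    [0.0, 0.0, 0.0, 1.0],  # frame 0: idle, facing +z
    [0.0, 1.5, 0.0, 1.0],  # frame 1: walking forward
    [0.0, 4.0, 0.0, 1.0],  # frame 2: running forward
    [1.5, 0.0, 1.0, 0.0],  # frame 3: walking to the right
]

def feature_distance(query, candidate, weights):
    """Weighted Euclidean distance between two feature vectors."""
    return math.sqrt(
        sum(w * (q - c) ** 2 for w, q, c in zip(weights, query, candidate))
    )

def best_matching_frame(query, database, weights):
    """Return the index of the database frame closest to the query features."""
    return min(
        range(len(database)),
        key=lambda i: feature_distance(query, database[i], weights),
    )

# The user asks for fast forward motion; the search picks the running frame.
WEIGHTS = [1.0, 1.0, 1.0, 1.0]
desired = [0.0, 3.5, 0.0, 1.0]
matched_frame = best_matching_frame(desired, DATABASE, WEIGHTS)
```

In the two-stage design described above, the frames selected this way would serve as the reference motion that the reinforcement-learning tracker is trained to follow.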
Motion Synthesis
- Dynamic Hair Modeling from Monocular Videos Using Deep Neural Networks SIGGRAPH Asia Academic 50 cites
Two-network framework (HairSpatNet + HairTempNet) inferring 3D occupancy and orientation fields from monocular video to model moving hairstyles dynamically.
Abstract
We introduce a deep learning based framework for modeling dynamic hairs from monocular videos, which could be captured by a commodity video camera or downloaded from Internet. The framework mainly consists of two neural networks, i.e., HairSpatNet for inferring 3D spatial features of hair geometry from 2D image features, and HairTempNet for extracting temporal features of hair motions from video frames. The spatial features are represented as 3D occupancy fields depicting the hair volume shapes and 3D orientation fields indicating the hair growing directions. The temporal features are represented as bidirectional 3D warping fields, describing the forward and backward motions of hair strands across adjacent frames. Both HairSpatNet and HairTempNet are trained with synthetic hair data. The spatial and temporal features predicted by the networks are subsequently used for growing hair strands with both spatial and temporal consistency. Experiments demonstrate that our method is capable of constructing plausible dynamic hair models that closely resemble the input video, and compares favorably to previous single-view techniques.
Related Neural Haircut: Prior-Guided Strand-Based Hair Reconstruction · BlendSim: Simulation on Parametric Blendshapes using Spacetime Projective Dynamics · Learning Motion Manifolds with Convolutional Autoencoders · 3DGS-Avatar: Animatable Avatars via Deformable 3D Gaussian Splatting
How to read this
- Category
- Method: deep learning for dynamic hair capture
- Contributions
-
- Introduces a two-network framework for modeling dynamic hair from monocular video
- HairSpatNet infers 3D spatial features (occupancy fields for volume shape and orientation fields for growing direction) from 2D image features, and HairTempNet extracts temporal features as bidirectional 3D warping fields
- Grows hair strands with spatial and temporal consistency, comparing favorably to prior single-view techniques
- Context
- Extends single-view hair reconstruction such as HairNet (Zhou et al., 2018) from static stills to temporally consistent dynamic hair from video. Builds on: HairNet: Single-View Hair Reconstruction Using Convolutional Neural Networks
- Correctness
- Both networks are trained on synthetic hair data, so real-video performance depends on the synthetic-to-real gap; results are plausible reconstructions that resemble the input rather than measured ground-truth strand geometry, and monocular ambiguity persists.
- Clarity
- Accessible framing of a two-network split; a first pass conveys the spatial-plus-temporal decomposition, with a second pass for the field representations and warping.
- How to read it
- Read for the spatial/temporal network split and the occupancy-plus-orientation-plus-warping representation; do a second pass on training data and the field formulations to gauge how it transfers to real footage.
CFX / ML Deformation
-
Demonstrates the Offset Parent Matrix attribute in Maya 2020, which drives transforms via a single matrix connection, eliminating offset group nodes and reducing scene complexity for lighter rigs.
Abstract
This Autodesk talk explains the offset parent matrix attribute introduced in Maya 2020, walking through how transform matrices represent position and orientation in 3D and how the attribute applies a matrix to a node before its own translate, rotate and scale channels. It demonstrates replacing parent and scale constraints by plugging a cylinder's world matrix into a cube's offset parent matrix in the node editor, and using a pick matrix node to selectively carry over transforms. The practical example reworks finger controls on a character hand by feeding compound control joints into the FK controls via the matrix attribute, distinguishing matrix from world matrix to avoid inheriting parent transforms, and cutting 20 joints from a single hand while keeping compound and FK posing intact.
Related Maya 2022: New Features for Rigging · Premo: Powerful Character Rigging, Fast Animation · Group Based Rigging of Realistically Feathered Wings · ChopRig System
How to read this
- Category
- Production talk / tool walkthrough (rigging)
- Contributions
-
- Explains the Offset Parent Matrix attribute in Maya 2020, which applies a matrix to a node before its own translate, rotate and scale channels
- Shows matrix-driven constraints (feeding a world matrix into offset parent matrix, with pick matrix for selective transforms) as a lighter alternative to parent and scale constraints
- Reworks finger controls by driving FK from compound control joints via the matrix attribute, cutting 20 joints from a single hand
- Context
- A vendor tutorial on Maya's matrix-based transform workflow, building on the broader move in rigging toward matrix node networks instead of constraint and offset-group hierarchies.
- Correctness
- Studio/vendor practice rather than peer-reviewed work. The joint-reduction and posing claims are demonstrated on a single hand example, so verify in your own scene both how well the approach generalizes to other rig topologies and the matrix-versus-world-matrix pitfall around inherited parent transforms.
- Clarity
- Accessible; a single viewing conveys the concept, and the node-editor example makes it concrete.
- How to read it
- Watch once for the mental model of pre-multiplying a matrix before local TRS, then keep it open as a reference while replicating the node graph; no deep second pass needed beyond reproducing the hand setup yourself.
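The mental model can be checked numerically without opening Maya. The sketch below uses plain Python rather than the Maya API, and it assumes a row-vector convention (translation in the last row) with the composition order local matrix, then offset parent matrix, then the parent's world matrix. That order is an assumption to confirm against the Maya documentation, and the helper names are hypothetical.

```python
# Plain-Python sketch of the transform stack, not Maya API code.
# Assumption: row vectors, so world = local * offset_parent * parent_world.

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def translate(x, y, z):
    """Row-vector translation matrix: the offset lives in the last row."""
    return [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [x, y, z, 1.0],
    ]

def uniform_scale(s):
    return [
        [s, 0.0, 0.0, 0.0],
        [0.0, s, 0.0, 0.0],
        [0.0, 0.0, s, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

IDENTITY = uniform_scale(1.0)

def world_matrix(local, offset_parent, parent_world):
    """Compose a node's world matrix with the offset applied ahead of its own TRS."""
    return matmul(matmul(local, offset_parent), parent_world)

# Driver (the cylinder in the talk): scaled by 2, then moved 10 units along x.
driver_world = matmul(uniform_scale(2.0), translate(10.0, 0.0, 0.0))

# Driven node (the cube): its own translate channel reads 1 unit along x.
cube_local = translate(1.0, 0.0, 0.0)

# Plugging the driver's world matrix into the cube's offset parent matrix.
cube_world = world_matrix(cube_local, driver_world, IDENTITY)
cube_world_position = cube_world[3][:3]
```

The cube's local translate is interpreted inside the driver's space (scaled, then offset), which matches what an offset group node would have provided, while the cube's own channels stay clean.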
Rigging
-
SMPL-X unifies body, face, and fully articulated hands into a single 10,475-vertex parametric model with 54 joints and a learned variational pose prior.
Abstract
To facilitate the analysis of human actions, interactions and emotions, we compute a 3D model of human body pose, hand pose, and facial expression from a single monocular image. To achieve this, we use thousands of 3D scans to train a new, unified, 3D model of the human body, SMPL-X, that extends SMPL with fully articulated hands and an expressive face. Learning to regress the parameters of SMPL-X directly from images is challenging without paired images and 3D ground truth. Consequently, we follow the approach of SMPLify, which estimates 2D features and then optimizes model parameters to fit the features. We improve on SMPLify in several significant ways: (1) we detect 2D features corresponding to the face, hands, and feet and fit the full SMPL-X model to these; (2) we train a new neural network pose prior using a large MoCap dataset; (3) we define a new interpenetration penalty that is both fast and accurate; (4) we automatically detect gender and the appropriate body models (male, female, or neutral); (5) our PyTorch implementation achieves a speedup of more than 8x over Chumpy. We use the new method, SMPLify-X, to fit SMPL-X to both controlled images and images in the wild. We evaluate 3D accuracy on a new curated dataset comprising 100 images with pseudo ground-truth. This is a step towards automatic expressive human capture from monocular RGB data.
Related SUPR: A Sparse Unified Part-Based Human Representation · SMPL: A Skinned Multi-Person Linear Model · ATLAS: Decoupling Skeletal and Shape Parameters for Expressive Parametric Human Modeling · NIMBLE: A Non-rigid Hand Model with Bones and Muscles
How to read this
- Category
- Method + model: a unified parametric body model and single-image fitting pipeline
- Contributions
-
- SMPL-X, a unified parametric model extending the body with fully articulated hands and an expressive face
- SMPLify-X, which fits SMPL-X to detected 2D face, hand and foot features from a single monocular image
- Supporting pieces: a neural-network pose prior trained on MoCap, a fast interpenetration penalty, automatic gender detection, and a faster PyTorch implementation
- Context
- Extends SMPL (Loper et al.) toward whole-body expressiveness, drawing the face component from the FLAME line (Li et al.) and relating to single-image pose estimation work such as VNect (Mehta et al.). Builds on: SMPL: A Skinned Multi-Person Linear Model · Learning a Model of Facial Shape and Expression from 4D Scans · VNect: Real-time 3D Human Pose Estimation with a Single RGB Camera at over 30fps
- Correctness
- An optimization-based fit to 2D features, so results depend on detector quality and the learned pose prior; demonstrated on controlled and in-the-wild images, but single-image monocular fitting carries inherent depth and occlusion ambiguity that a reader should keep in mind.
- Clarity
- Readable structure; a first pass conveys the model and pipeline, while the energy terms and pose prior reward a second pass.
- How to read it
- First pass for what SMPL-X adds over SMPL and how SMPLify-X differs from SMPLify; do a second pass on the objective terms (pose prior, interpenetration penalty) if you intend to fit or extend the model.
Facial / Retargeting / Skinning
-
Presented real-time facial performance capture and face reenactment using commodity RGB cameras, demonstrating practical applications for character animation.
Facial / Retargeting
- Fast Simulation of Deformable Characters with Articulated Skeletons in Projective Dynamics SCA Academic 18 cites
Projective dynamics method for simulating deformable character bodies coupled to articulated skeletons with robust contact handling.
Abstract
We propose a fast and robust solver to simulate continuum-based deformable models with constraints, in particular, rigid-body and joint constraints useful for soft articulated characters. Our method embeds degrees of freedom of both articulated rigid bodies and deformable bodies in one unified optimization problem, thus coupling the deformable and rigid bodies. Our method can efficiently simulate character models, with rigid-body parts (bones) being correctly coupled with deformable parts (flesh). Our method is stable because backward Euler time integration is applied to rigid as well as deformable degrees of freedom. Our method is rigorously derived from constrained Newtonian mechanics. In an example simulation with rigid bodies only, we demonstrate that our method converges to the same motion as a classical explicitly integrated rigid body simulator.
Related EMU: Efficient Muscle Simulation in Deformation Space · Computational Bodybuilding: Anatomically-Based Modeling of Human Bodies · Interactive Skeleton-Driven Dynamic Deformations · Physically Based Rigging for Deformable Characters
How to read this
- Category
- Method: a coupled rigid-skeleton and deformable-body simulation solver
- Contributions
-
- A unified optimization that embeds articulated rigid-body (and joint) constraints together with deformable degrees of freedom, coupling flesh to bones
- A stable formulation using backward Euler integration on both rigid and deformable DOFs, derived from constrained Newtonian mechanics
- Robust contact and constraint handling for soft articulated characters
- Context
- Builds on Projective Dynamics (Bouaziz et al.), extending its fast constraint-projection framework to handle articulated rigid bodies and joints alongside continuum deformables. Builds on: Projective Dynamics: Fusing Constraint Projections for Fast Simulation
- Correctness
- Rigorously derived and shown to converge to a classical rigid-body simulator in a rigid-only case; as with Projective Dynamics, the speed comes from a particular constraint formulation, so material accuracy and stiffness behavior under that approximation are worth checking for a given use case.
- Clarity
- Technical; a first pass gives the coupling idea, but the unified optimization and integration scheme need a careful second pass.
- How to read it
- First pass for the coupling concept and why backward Euler on both DOF types gives stability; second pass on the optimization derivation and constraint formulation if implementing or comparing solvers.
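To make the stability point concrete, here is a minimal sketch of backward Euler written as an energy minimization, which is the view Projective Dynamics takes. It is a toy under stated assumptions: a single 1D mass on a linear spring, with hypothetical names (`implicit_step`, `explicit_step`, `simulate`). It is not the paper's coupled rigid/deformable solver.

```python
"""Backward Euler for one spring, written as an energy minimization.

Toy illustration only: a single 1D mass on a linear spring, not the
paper's coupled rigid/deformable solver.
"""


def implicit_step(x, v, h, m, k):
    """One backward Euler step, posed as a minimization.

    x_next = argmin_z  m / (2 h^2) * (z - y)^2 + k / 2 * z^2,
    where y = x + h * v is the inertial prediction. Setting the
    derivative to zero gives the closed form used below.
    """
    y = x + h * v
    x_next = (m / h**2) * y / (m / h**2 + k)
    v_next = (x_next - x) / h
    return x_next, v_next


def explicit_step(x, v, h, m, k):
    """One forward (explicit) Euler step, for comparison."""
    return x + h * v, v - h * (k / m) * x


def simulate(step, steps=50, h=0.1, m=1.0, k=1000.0):
    """Return the positions visited when starting from x=1, v=0."""
    x, v = 1.0, 0.0
    xs = [x]
    for _ in range(steps):
        x, v = step(x, v, h, m, k)
        xs.append(x)
    return xs


# A deliberately stiff spring with a large time step.
implicit_xs = simulate(implicit_step)
explicit_xs = simulate(explicit_step)
```

With this stiff spring and large step, the implicit positions never exceed the starting amplitude while the explicit ones blow up. The price is numerical damping, which is part of why material behavior under the approximation is worth checking.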
Skinning / Muscles
- talk Feathers: From Model to Groom to Render | nineteentwenty | Character FX & Crowds Production Talks Houdini Industrial
Head of 3D at nineteentwenty walks the full feather pipeline from modelling through grooming to rendering in Houdini for Royal Marines and KFC commercial creature work.
abstract
Chris King, Head of 3D at nineteentwenty, walks through the studio's first full-geometry feather pipeline built in Houdini and rendered in Mantra for the KFC Christmas campaign, covering a Rhode Island red chicken and a black turkey. He explains building poly planes along each hair using a stable orient, generating feathers with the Houdini feather builder, and breaking each feather into barb segments instanced as packed disc primitives along hair points to keep render times and memory low instead of using textured planes. Dynamics are handled with the wire solver using wide hair and skin collisions, with intersections addressed via post-sim interpolation between neighboring feathers and a per-feather wire-mesh simulation for extreme close-ups. He also covers paintable per-feather and per-barb shader pickups, delta mush smoothing in the neck, rigging primary and secondary flight feathers, and notes Vellum as a future option.
Related Hair, Feathers and Fur | Axis Studios | Character FX & Crowds Production Talks · Creatures in Houdini | Ahmed Gharraph | FMX 2019 · Creating a Photorealistic Hyena · Simulating the Perfect Groom for a Bovine Biker | Untold Studios | FMX HIVE 2023
how to read this
- Category
- Production talk (feather pipeline, model to groom to render)
- Contributions
-
- Demonstrates a full-geometry feather pipeline in Houdini and Mantra, building poly planes along hair with a stable orient and the feather builder
- Shows instancing barb segments as packed disc primitives along hair points to keep render time and memory low instead of textured planes
- Covers wire-solver dynamics with hair and skin collisions, post-sim interpolation and per-feather wire-mesh sims for close-ups, plus paintable per-feather/per-barb shader pickups and neck delta mush
- Context
- A studio breakdown of creature feather work for commercial productions, situated in the Houdini grooming/wire-solver ecosystem and the broader move from textured-plane feathers to full geometry.
- Correctness
- Studio practice, not peer-reviewed; the techniques are production-proven on the described chicken and turkey shots, but the memory and render-time savings are reported from that one production rather than benchmarked, and Vellum is mentioned only as a future option.
- Clarity
- Accessible to FX artists; a single viewing conveys the pipeline shape, though some Houdini node specifics assume familiarity.
- How to read it
- Watch once end-to-end for the modelling-to-render flow, then revisit the barb-instancing and intersection-fix segments if you are actually building a feather setup.
CFX
- talk Framestore Creatures & Houdini | Framestore | Character FX & Crowds Production Talks Houdini Industrial
Framestore's head of CG digs into their creature archive spanning standout productions, discussing grooming and solver challenges overcome with Houdini at the London HUG workshop.
abstract
Ahmed Gharraph of Framestore walks through the studio's creature work in Houdini, centering on the blubbery cat Mog, a Mercedes hair piece, a Sony wildebeest herd, and McDonald's reindeer. For Mog he describes driving fleshy deformation with Houdini's FEM solver, embedding an animated skeleton inside an FEM volume to get volume-preserving secondary motion instead of relying on blendshapes and skin weights, and exposing a squishiness attribute in the Maya rig plus an automated farm pipeline that simulated CFX and fur from each animation cache. The reindeer combined an FBX skeleton, a Houdini muscle rig with jiggle controls embedded in an FEM solid, and a hybrid object for skin sliding and wrinkles, with grooming built from manually combed guide curves and layered sub-clumps. He also covers using the Houdini fur tools and wire solver for hair-through-fingers collisions on the Mercedes job, time-stretched fur simulation for the 4000-frames-per-second wildebeest, and driving fur color and the Arnold melanin shader from painted point attributes rather than Mari color maps.
Related Creatures in Houdini | Ahmed Gharraph | FMX 2019 · Simulating the Perfect Groom for a Bovine Biker | Untold Studios | FMX HIVE 2023 · Creating a Photorealistic Hyena · Automation of Creature FX in a Small Studio Pipeline
how to read this
- Category
- Production talk (creature FX, grooming and solver case studies)
- Contributions
-
- Demonstrates driving fleshy deformation with Houdini's FEM solver, embedding an animated skeleton in an FEM volume for volume-preserving secondary motion (the cat Mog)
- Shows a reindeer pipeline combining an FBX skeleton, a Houdini muscle rig with jiggle in an FEM solid, and a hybrid object for skin sliding and wrinkles
- Covers fur tools and the wire solver for hair-through-fingers collisions, time-stretched fur for a very-high-fps wildebeest, and driving fur color via painted point attributes into the Arnold melanin shader
- Context
- A studio survey of creature work across multiple productions, grounded in Houdini's FEM and fur/wire solvers and a Maya-rig-plus-farm CFX pipeline.
- Correctness
- Studio practice, not peer-reviewed; results are production-proven on the named shots, and the FEM-embedding choices over blendshape/skin-weight approaches reflect art-direction and pipeline tradeoffs rather than measured comparisons.
- Clarity
- Accessible to FX TDs; a single viewing conveys the approaches, with solver specifics assuming Houdini familiarity.
- How to read it
- Watch once for the catalog of techniques, then revisit the Mog FEM-embedding and reindeer hybrid-object segments if you need volume-preserving flesh or skin-sliding setups.
CFX
-
SideFX TD presents H17 grooming toolset with worked hair examples, covering guide-curve challenges, detailing techniques, and rendering workflows at the London Character FX workshop.
abstract
A SideFX developer presents the Houdini 17 fur and grooming toolset, focusing first on performance: rewriting the most-used operators (hair generate, hair clump, guide mask, parts of guide process) from VEX-based HDAs into single C++ core operators, giving roughly 4x faster cooks and up to 10x in interactive sessions. He explains attribute data IDs as the caching mechanism that lets hair generate reuse topology and neighbor lookups when only point positions change, stressing the importance of keeping the rest attribute static and avoiding older operators that bump all IDs. The grooming portion demonstrates non-obvious workflows: building eyelashes by snapping a curve to the mesh with the curve SOP and growing hair from it, creating long-hair ponytails with hair generate and a tightness attribute for taper, using clump IDs as cluster attributes in Vellum glue constraints to keep nearby clumps separate, attaching rigid spheres to hair via guide deform and extract transform, and shaping mushroom-like clumps with hair clump attribute copying plus guide process bend.
Related Creating a Photorealistic Hyena · Creatures in Houdini | Ahmed Gharraph | FMX 2019 · Simulating the Perfect Groom for a Bovine Biker | Untold Studios | FMX HIVE 2023 · Stags & Stripes, Creating Photoreal Characters | Framestore | FMX HIVE Europe 2021
how to read this
- Category
- Production talk / tool walkthrough (H17 grooming toolset)
- Contributions
-
- Presents performance gains from rewriting key grooming operators from VEX HDAs into C++ core operators (roughly 4x faster cooks, up to 10x interactive)
- Explains attribute data IDs as the caching mechanism that lets hair generate reuse topology and neighbor lookups when only point positions change
- Demonstrates non-obvious grooming workflows: snapped-curve eyelashes, taper via a tightness attribute, clump IDs as Vellum glue cluster attributes, attaching rigid spheres via guide deform, and mushroom clumps with hair clump attribute copy plus guide process bend
- Context
- A vendor walkthrough of the Houdini 17 fur/grooming system, situated in the procedural guide-curve grooming paradigm and the introduction of Vellum constraints.
- Correctness
- Vendor practice, not peer-reviewed; the speedups are the developer's own figures, and the caching benefit holds only if the rest attribute stays static and older operators that bump all data IDs are avoided.
- Clarity
- Accessible to grooming artists; a single viewing conveys both the performance story and the recipes.
- How to read it
- Watch once for the data-ID caching mental model (it explains why grooms recook fast), then return to specific recipe segments as reference when building eyelashes, ponytails or clumps.
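The data-ID mental model can be sketched in a few lines. This is a toy model, not the Houdini API: the `Attribute` and `HairGenerator` classes and the 1D "points" are hypothetical stand-ins. It only shows why a recook is cheap when positions change but the rest attribute's ID does not.

```python
"""Toy model of data-ID-based caching (not the Houdini API)."""

import itertools

_next_id = itertools.count(1)


class Attribute:
    """A named array whose data ID changes whenever it is written."""

    def __init__(self, values):
        self.values = list(values)
        self.data_id = next(_next_id)

    def write(self, values):
        self.values = list(values)
        self.data_id = next(_next_id)  # any write bumps the ID


class HairGenerator:
    """Rebuilds the expensive neighbor lookup only when 'rest' changes."""

    def __init__(self):
        self.rebuilds = 0
        self._cached_rest_id = None
        self._nearest = None

    def _build_nearest(self, rest):
        # Stand-in for an expensive neighbor search: for each point,
        # the index of the closest other point in the rest shape.
        self.rebuilds += 1
        pts = rest.values
        return [
            min((j for j in range(len(pts)) if j != i),
                key=lambda j: abs(pts[j] - pts[i]))
            for i in range(len(pts))
        ]

    def cook(self, rest, position):
        if rest.data_id != self._cached_rest_id:
            self._nearest = self._build_nearest(rest)
            self._cached_rest_id = rest.data_id
        # Cheap per-frame work: reuse cached neighbors with new positions.
        return [(p + position.values[n]) / 2.0
                for p, n in zip(position.values, self._nearest)]


rest = Attribute([0.0, 1.0, 2.5])
pos = Attribute([0.0, 1.0, 2.5])
gen = HairGenerator()

gen.cook(rest, pos)
pos.write([0.1, 1.2, 2.4])        # animation moves the points
gen.cook(rest, pos)               # rest ID unchanged: cache is reused
rebuilds_after_move = gen.rebuilds

rest.write(rest.values)           # same values, but the ID is bumped
gen.cook(rest, pos)               # forces a full rebuild
rebuilds_after_bump = gen.rebuilds
```

The last three lines mirror the talk's caveat: an operator that rewrites the rest attribute, even with identical values, invalidates the cache.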
CFX
- talk Hair, Feathers and Fur | Axis Studios | Character FX & Crowds Production Talks Houdini Industrial
Axis Studios team presents their move to Houdini for hair simulation, feathers, and fur, detailing technical solutions and challenges across AAA game trailer creature productions.
abstract
Axis Studios' Hudson, Camille and Philip describe moving their character FX pipeline from Maya nCloth hair to a full Houdini workflow for the Elder Scrolls game cinematic trailer, built in roughly 12 weeks on Houdini 16.5 before Vellum existed. Camille covers the grooming approach, including a two-tier guide system where low-resolution guides are exported as ribbons for animators to place hero hair, a custom Axis Hair Deformer that caches Hair Gen output and deforms it with a less averaged falloff than the native point deformer, and a procedural feather system that scatters barbs along guide curves for the griffin. Philip details a grain-based constraint hair solver built in SOPs with local and global constraints to preserve groom shapes, a post-simulation tool in OpenCL that keeps strand length to add noise, blend shapes and wind without full simulation, and feathers and strings scattered and solved together with the hair. They close on switching to Vellum in Houdini 17, which replaces their custom solver, removes jitter and instability, and lets the faster hair generator run live to skip guide-deformation interpolation errors, with rendering in Mantra and comp moving from Fusion toward Nuke.
Related Creatures in Houdini | Ahmed Gharraph | FMX 2019 · Feathers: From Model to Groom to Render | nineteentwenty | Character FX & Crowds Production Talks · Grooming and Simulation Methods for Different Hair Types | Andriy Bilichenko | Paris HIVE 2023 · Creating a Photorealistic Hyena
how to read this
- Category
- Production talk (character FX pipeline move to Houdini)
- Contributions
-
- Describes moving from Maya nCloth hair to a full Houdini character-FX workflow for a game cinematic trailer, including a two-tier guide system and a custom Axis Hair Deformer with a less-averaged falloff
- Details a grain-based constraint hair solver in SOPs with local and global constraints to preserve groom shape, plus a procedural feather system scattering barbs along guides
- Shows an OpenCL post-simulation tool that preserves strand length while adding noise, blendshapes and wind without full simulation, and the later switch to Vellum in H17 to remove jitter and instability
- Context
- A studio breakdown of a creature-FX pipeline transition, situated against Maya nCloth hair and the emergence of Houdini Vellum as a replacement for custom grain/constraint solvers.
- Correctness
- Studio practice, not peer-reviewed; results are production-proven on the named trailer under a tight ~12-week schedule, and the custom solver and post-sim choices reflect a pre-Vellum era now partly superseded, which is worth keeping in mind.
- Clarity
- Accessible to FX TDs; multi-speaker but a single viewing conveys the grooming, solver and post-sim threads.
- How to read it
- Watch once for the pipeline arc from custom solver to Vellum; revisit the grain-solver constraints and OpenCL post-sim segments if building groom-preserving hair, but note Vellum may now replace the custom pieces.
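As a rough illustration of length-preserving post-simulation edits, the sketch below perturbs a strand and then restores each segment to its original length with a root-to-tip re-projection. The re-projection scheme and all names are assumptions made for illustration. The talk's tool is written in OpenCL and its actual algorithm is not described here.

```python
"""Add noise to a hair strand, then restore its segment lengths."""

import math
import random


def segment_lengths(points):
    """Distances between consecutive points along the strand."""
    return [math.dist(a, b) for a, b in zip(points, points[1:])]


def restore_lengths(points, lengths):
    """Walk root to tip, re-placing each point at its original distance
    from the already corrected previous point, along the new direction."""
    out = [points[0]]  # the root stays pinned
    for target, length in zip(points[1:], lengths):
        prev = out[-1]
        direction = [t - p for t, p in zip(target, prev)]
        norm = math.sqrt(sum(d * d for d in direction))
        if norm == 0.0:
            raise ValueError("degenerate segment: cannot pick a direction")
        out.append(tuple(p + d / norm * length
                         for p, d in zip(prev, direction)))
    return out


def add_noise_keep_length(points, amplitude, seed=0):
    """Jitter every point except the root, then fix segment lengths."""
    rng = random.Random(seed)
    lengths = segment_lengths(points)
    noisy = [points[0]] + [
        tuple(c + rng.uniform(-amplitude, amplitude) for c in p)
        for p in points[1:]
    ]
    return restore_lengths(noisy, lengths)


strand = [(0.0, 0.0, 0.0), (0.0, -1.0, 0.0),
          (0.0, -2.0, 0.0), (0.0, -3.0, 0.0)]
styled = add_noise_keep_length(strand, amplitude=0.3)
```

Because the correction runs once per strand with no time stepping, the same pattern extends to wind or blendshape offsets without a full simulation.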
CFX
-
Acquires complete hand bone anatomy in multiple poses via stabilized MRI and builds animation-ready volumetric hand rigs from medical imaging data.
abstract
We demonstrate how to acquire complete human hand bone anatomy (meshes) in multiple poses using magnetic resonance imaging (MRI). Such acquisition was previously difficult because MRI scans must be long for high-precision results (over 10 minutes) and because humans cannot hold the hand perfectly still in non-trivial and badly supported poses. We invent a manufacturing process whereby we use lifecasting materials commonly employed in the film special effects industry to generate hand molds, personalized to the subject, and to each pose. These molds are both ergonomic and encasing, and they stabilize the hand during scanning. We also demonstrate how to efficiently segment the MRI scans into individual bone meshes in all poses, and how to correspond each bone's mesh to the same mesh connectivity across all poses. Next, we interpolate and extrapolate the MRI-acquired bone meshes to the entire range of motion of the hand, producing an accurate data-driven animation-ready rig for bone meshes. We also demonstrate how to acquire not just bone geometry (using MRI) in each pose, but also a matching highly accurate surface geometry (using optical scanners) in each pose, modeling skin pores and wrinkles.
Related Data-Driven Physics for Human Soft Tissue Animation · Steklov-Poincare Skinning · Simulation of Hand Anatomy Using Medical Imaging · Computational Bodybuilding: Anatomically-Based Modeling of Human Bodies
how to read this
- Category
- Capture system + method (MRI-based hand anatomy acquisition and rigging)
- Contributions
-
- A stabilization process using film-industry lifecasting materials to make personalized, encasing molds that hold the hand still across multiple poses during long MRI scans
- A workflow to segment MRI scans into individual bone meshes with consistent connectivity across all poses
- Interpolation and extrapolation of acquired bone meshes across the full range of motion to produce an animation-ready bone rig, plus matched optical surface scans capturing pores and wrinkles
- Context
- Acquires the anatomical bone data underlying biomechanical hand models, relating to simulation/control work on hands and tendinous systems (Sachdeva et al.). Builds on: Biomechanical Simulation and Control of Hands and Tendinous Systems
- Correctness
- A data-driven acquisition approach; accuracy hinges on the molds genuinely stabilizing the hand and on faithful cross-pose correspondence, and results are demonstrated per-subject, so generality and the labor of mold-making and segmentation are practical limits to note.
- Clarity
- Accessible in its motivation and pipeline; a first pass conveys the method, with segmentation and interpolation details for a second pass.
- How to read it
- First pass for the molding-and-acquisition idea and what data it yields; second pass on segmentation, correspondence and range-of-motion interpolation if you plan to build or use such a rig.
Muscles / Skinning
-
Makes a muscle-based facial system fully differentiable by coupling it with a blendshape basis, enabling both optimization and learning-based performance capture.
abstract
Muscle-based systems have the potential to provide both anatomical accuracy and semantic interpretability as compared to blendshape models; however, a lack of expressivity and differentiability has limited their impact. Thus, we propose modifying a recently developed rather expressive muscle-based system in order to make it fully-differentiable; in fact, our proposed modifications allow this physically robust and anatomically accurate muscle model to conveniently be driven by an underlying blendshape basis. Our formulation is intuitive, natural, as well as monolithically and fully coupled such that one can differentiate the model from end to end, which makes it viable for both optimization and learning-based approaches for a variety of applications. We illustrate this with a number of examples including both shape matching of three-dimensional geometry as well as the automatic determination of a three-dimensional facial pose from a single two-dimensional RGB image without using markers or depth information.
Related Art-Directed Muscle Simulation for High-End Facial Animation · Lessons from the Evolution of an Anatomical Facial Muscle Model · Fully Automatic Generation of Anatomical Face Simulation Models · Animatomy: An Animator-Centric, Anatomically Inspired System for 3D Facial Modeling, Animation and Transfer
how to read this
- Category
- Method: a differentiable anatomical-muscle facial model for capture
- Contributions
-
- Modifies an expressive muscle-based facial system to be fully differentiable so it can be driven by an underlying blendshape basis
- A monolithically and fully coupled formulation that is differentiable end to end, making it usable for both optimization and learning-based approaches
- Demonstrations including 3D shape matching and automatic 3D facial pose from a single 2D RGB image without markers or depth
- Context
- Builds on anatomical face simulation (Cong et al.), combining muscle-based physical accuracy with the differentiability and control of blendshape models. Builds on: Fully Automatic Generation of Anatomical Face Simulation Models
- Correctness
- Couples a physically robust muscle model to a blendshape basis; the single-image pose result is markerless and depth-free, so reconstruction quality depends on the muscle model's expressivity and the blendshape coupling, and monocular ambiguity remains a caveat.
- Clarity
- Technical; a first pass conveys why differentiability matters, while the coupled formulation needs a careful second pass.
- How to read it
- First pass for the motivation (muscle accuracy plus blendshape control plus differentiability); second pass on the coupling and differentiation if integrating into an optimization or learning pipeline.
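To see why end-to-end differentiability matters for shape matching, consider the simplest differentiable face model: a linear blendshape basis fitted to a target by gradient descent. This sketch is a stand-in under stated assumptions, not the paper's muscle model. The function names, the flattened coordinate lists and the toy data are all hypothetical.

```python
"""Fit blendshape weights to a target shape by gradient descent.

A linear blendshape model stands in for "any differentiable face
model"; the paper's muscle system is far richer than this.
"""


def evaluate(neutral, deltas, weights):
    """shape = neutral + sum_k weights[k] * deltas[k] (flattened coords)."""
    return [
        n + sum(w * d[i] for w, d in zip(weights, deltas))
        for i, n in enumerate(neutral)
    ]


def fit_weights(neutral, deltas, target, steps=500, lr=0.1):
    """Minimize E(w) = sum_i (shape_i(w) - target_i)^2 over w."""
    weights = [0.0] * len(deltas)
    for _ in range(steps):
        shape = evaluate(neutral, deltas, weights)
        residual = [s - t for s, t in zip(shape, target)]
        # Analytic gradient: dE/dw_k = 2 * sum_i residual_i * deltas[k][i]
        grads = [2.0 * sum(r * di for r, di in zip(residual, d))
                 for d in deltas]
        weights = [w - lr * g for w, g in zip(weights, grads)]
    return weights


# Toy data: two blendshape deltas over four flattened coordinates.
neutral = [0.0, 0.0, 1.0, 1.0]
deltas = [[1.0, 0.0, 0.0, 0.0],
          [0.0, 1.0, 0.0, 1.0]]
true_weights = [0.3, 0.7]
target = evaluate(neutral, deltas, true_weights)

fitted = fit_weights(neutral, deltas, target)
```

The gradient is written by hand here because the model is linear. For a model as complex as a coupled muscle system, the same loop only becomes practical once every stage is differentiable, which is the gap the paper addresses.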
Facial / Muscles
-
Practical force and constraint techniques for art-directing hair simulation shape across Brave, Inside Out, The Good Dinosaur, Coco, Incredibles 2, and Toy Story 4.
abstract
Hair simulation models are based on physics, but require additional controls to achieve certain looks or art directions. A common simulation control is to use hard or soft constraints on the kinematic points provided by the articulation of the scalp or explicit rigging of the hair [Kaur et al. 2018; Soares et al. 2012]. While following the rigged points adds explicit control during shot work, we want to author information during the setup phase to better follow the groomed shape automatically during simulation (Figure 1). We have found that there is no single approach that satisfies every artistic requirement, and have instead developed several practical force- and constraint-based techniques over the course of the making of Brave, Inside Out, The Good Dinosaur, Coco, Incredibles 2, and Toy Story 4. We have also discovered that kinematic constraints can sometimes be adversely affected by mesh deformation and discuss how to mitigate this effect for both articulated and simulated hair.
Related Artistic Simulation of Curly Hair · Gravity Preloading for Maintaining Hair Shape Using the Simulator as a Closed-Box Function · Scriptable Character FX Solution · Hair Emoting with Style Guides in Turning Red
how to read this
- Category
- Production-grounded technique paper (art-directable hair simulation)
- Contributions
-
- Several practical force- and constraint-based techniques that make hair simulation follow the groomed shape automatically, going beyond hard/soft constraints on rigged points
- Methods to author shape-holding information at setup time so simulation better follows the groom in shot work
- Mitigations for cases where kinematic constraints are adversely affected by mesh deformation, for both articulated and simulated hair
- Context
- Extends the studio's curly-hair simulation line (Iben et al.) and relates to prior kinematic-constraint hair control, developed across several feature productions. Builds on: Artistic Simulation of Curly Hair
- Correctness
- Explicitly notes there is no single approach satisfying every artistic requirement, so these are pragmatic, production-validated techniques rather than a unified theory; their suitability is art-direction dependent and demonstrated through film examples.
- Clarity
- Accessible and practitioner-oriented; a first pass conveys the techniques, with constraint and force details rewarding a second pass.
- How to read it
- First pass for the catalog of shape-holding controls and when each applies; second pass on the constraint formulations and the mesh-deformation mitigation if implementing in a hair pipeline.
CFX
-
Autodesk AREA tutorial on MotionBuilder character retargeting, covering characterization, merging source and target skeletons, IK Blend settings, fixing retargeting errors, and baking animation to a character.
abstract
This Autodesk tutorial walks through retargeting motion capture onto a target character in MotionBuilder. It covers loading a source mocap character, putting it in a T-stance, adding namespaces to avoid naming conflicts, and file-merging the target character before connecting them via input type and activating the characterization. The workflow demonstrates creating a reusable mapping file, importing mocap into a selected hierarchy, and fixing common errors such as drooping shoulders, feet below the floor, and stride mismatches using F-curve edits, retargeting offsets and the match source option. It closes with tuning reach IK, lower-back and realistic shoulder solving for crawling poses, then plotting the animation to either the skeleton or the control rig for export or further editing.
Related MotionBuilder: Essentials Characterization, Retargeting and Baking Animations · Autodesk MotionBuilder 2022 · Motion Builder Characterization and Retargeting Tutorial · Meet MotionMaker: New AI Animation Tool In Maya
how to read this
- Category
- Production talk / tool tutorial (MotionBuilder retargeting workflow)
- Contributions
-
- Walks through characterizing a source mocap skeleton and a target character, then connecting them via input type and an active characterization
- Shows practical fixes for common retargeting errors (drooping shoulders, feet below floor, stride mismatch) using F-curve edits, offsets and match-source
- Covers tuning reach IK, lower-back and shoulder solving, then plotting to skeleton or control rig for export
- Context
- A hands-on MotionBuilder application of the classic motion-retargeting problem framed by Gleicher's 'Retargeting Motion to New Characters' (1998), here realized through Autodesk's characterization and IK-blend pipeline. Builds on: Retargeting Motion to New Characters
- Correctness
- Vendor tutorial, not peer-reviewed; the workflow is production-proven inside MotionBuilder but assumes a clean T-stance, consistent characterization and tool-specific solver behavior, so results depend on careful per-rig setup rather than a general guarantee.
- Clarity
- Accessible and step-driven; a single watch-through conveys the workflow, with replays useful only when reproducing a specific fix.
- How to read it
- Treat it as a recipe: follow along in MotionBuilder while characterizing and merging, then revisit the error-fixing and IK-tuning sections only when your own retarget shows that exact artifact.
Retargeting / Motion Synthesis
-
Hummingbird is the DreamWorks feather system used to interactively groom body feathers and model scales in real time.
abstract
Hummingbird is the DreamWorks feather system used to interactively groom body feathers and model scales in real time. It also drives feather motion such as secondary movement and wind response, special effects like ruffling and puffing, and final feather refinement within the studio's animation pipeline.
Related A Modernization of the DreamWorks Feather System · Mesh-Driven Generation and Animation of Groomed Feathers · Apteryx: Procedural Generation, Sculpting and Grooming of Feathers · Animating Puss in Boots' Feather in Shrek 2
how to read this
- Category
- Production talk / system breakdown (feather and scale grooming system)
- Contributions
-
- Presents Hummingbird, DreamWorks' system for interactively grooming body feathers and modeling scales in real time
- Drives feather motion such as secondary movement and wind response within the studio pipeline
- Handles effects like ruffling and puffing plus final feather refinement
- Context
- A production feather pipeline that extends the lineage of structured feather modeling, relating to Weber et al.'s 'Collision-free Construction of Animated Feathers Using Implicit Constraint Surfaces' (2009). Builds on: Collision-free Construction of Animated Feathers Using Implicit Constraint Surfaces
- Correctness
- Studio practice, not peer-reviewed; the system is production-proven on DreamWorks features, but the description is high-level, so internal algorithms and trade-offs are not laid out for independent evaluation.
- Clarity
- Brief and accessible at a conceptual level; one pass conveys what the tool does, with little formal detail to revisit.
- How to read it
- Read for the capability map and where feather grooming sits in a production pipeline; do not expect reproducible method detail, so a single pass suffices unless you are designing a comparable tool.
CFX
-
Interactive system for editing performance-captured facial animation, allowing artists to intuitively adjust and correct captured data.
abstract
While performance-based facial animation efficiently produces realistic animation, it still needs additional editing after automatic solving and retargeting. We review why additional editing is required and present a set of interactive editing solutions for VFX studios. The presented solutions allow artists to enhance the result of the automatic solve-retarget with a few tweaks. The methods are integrated into our performance-based facial animation framework and have been actively used in high-quality movie production.
Related Animatomy: An Animator-Centric, Anatomically Inspired System for 3D Facial Modeling, Animation and Transfer · Dynamic 3D Avatar Creation from Hand-Held Video Input · Emotion Challenge: Building a New Photoreal Facial Performance Pipeline for Games · Creating an Actor-Specific Facial Rig from Performance Capture
how to read this
- Category
- Method / production system (interactive editing of captured facial animation)
- Contributions
-
- Analyzes why performance-based facial animation still needs editing after automatic solve and retarget
- Presents a set of interactive editing tools that let artists correct solved-retargeted results with a few tweaks
- Integrates the methods into a production facial-animation framework used in high-quality movie work
- Context
- Builds on performance-capture facial pipelines, specifically Seol and colleagues' 'Creating an Actor-Specific Facial Rig from Performance Capture' (2016), adding an artist-facing correction layer on top of the automatic solve. Builds on: Creating an Actor-Specific Facial Rig from Performance Capture
- Correctness
- Described as actively used in high-quality movie production, which is strong practical validation, but evidence is qualitative and studio-internal rather than benchmarked, so generality beyond their framework is not established.
- Clarity
- Accessible motivation and tool descriptions; a first pass conveys the editing ideas, a second pass helps if you want to map each tool to a specific solve artifact.
- How to read it
- Focus first on the taxonomy of why edits are needed, then skim each editing solution; a second pass pays off only if you are building an artist-correction layer for your own facial pipeline.
Facial
- talk Introducing the New Animation Rigging Features (Presented by Unity Technologies) GDC Industrial
Unity showcased predefined constraints for procedurally controlling character deformations, physics-based secondary motion simulation, and runtime blending within their Animation Rigging package.
Rigging / Skinning
-
Autoencoder neural network defines a nonlinear reduced space for deformable solid dynamics, solving implicit integration in latent space.
abstract
We propose the first reduced model simulation framework for deformable solid dynamics using autoencoder neural networks. We provide a data‐driven approach to generating nonlinear reduced spaces for deformation dynamics. In contrast to previous methods using machine learning which accelerate simulation by approximating the time‐stepping function, we solve the true equations of motion in the latent‐space using a variational formulation of implicit integration. Our approach produces drastically smaller reduced spaces than conventional linear model reduction, improving performance and robustness. Furthermore, our method works well with existing force‐approximation cubature methods.
Related Subspace Neural Physics: Fast Data-Driven Interactive Simulation · A Unified Approach for Subspace Simulation of Deformable Bodies in Multiple Domains · FEM Simulation of 3D Deformable Solids: A Practitioner's Guide to Theory, Discretization and Model Reduction · Pose-Space Subspace Dynamics
how to read this
- Category
- Method: neural reduced-order model for deformable simulation
- Contributions
-
- Proposes an autoencoder-defined nonlinear reduced space for deformable solid dynamics
- Solves the true equations of motion in latent space via a variational formulation of implicit integration, rather than approximating the time-stepping function
- Yields much smaller reduced spaces than linear model reduction and stays compatible with existing force-approximation cubature
- Context
- Sits between data-driven dimensionality reduction and physics-based simulation, relating to the implicit-integration-as-optimization view popularized by Bouaziz et al.'s 'Projective Dynamics' (2014). Builds on: Projective Dynamics: Fusing Constraint Projections for Fast Simulation
- Correctness
- The central claim is that learning a nonlinear latent space and integrating physics within it beats linear reduction on size and robustness; this depends on training data covering the deformation regime, and like all reduced models it can struggle with motions outside that training distribution.
- Clarity
- Conceptually clear if you know model reduction; a first pass conveys the idea, while the variational implicit-integration formulation rewards a careful second pass.
- How to read it
- First pass for the autoencoder-replaces-linear-basis idea; do a focused second pass on the latent-space variational integrator and the cubature coupling if you intend to implement or extend it.
ML Deformation
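The latent-space integration idea in the entry above can be sketched on a toy problem. The sketch assumes a hand-written decoder (a pendulum whose single latent coordinate is its angle) as a stand-in for a trained autoencoder, and takes each implicit-Euler step by minimizing the variational objective over the latent coordinate with Gauss-Newton iterations. The names `decode`, `decode_jac`, `step` and `simulate`, the constants, and the choice of solver are illustrative assumptions, not the paper's implementation.

```python
import math

# Illustrative constants (assumptions): pendulum length, gravity, mass, time step.
R, G, MASS, H = 1.0, 9.81, 1.0, 0.01


def decode(z):
    """Stand-in for a trained decoder: latent angle z -> full-space 2D position."""
    return (R * math.sin(z), -R * math.cos(z))


def decode_jac(z):
    """Derivative of decode(z) with respect to the latent coordinate z."""
    return (R * math.cos(z), R * math.sin(z))


def step(z_prev, z_curr, iters=30):
    """One implicit-Euler step, posed as a minimization over the latent z.

    Objective: MASS / (2 H^2) * |decode(z) - x_tilde|^2 + MASS * G * y(z),
    where x_tilde is the inertial prediction in full space.
    """
    x_prev, x_curr = decode(z_prev), decode(z_curr)
    x_tilde = (2.0 * x_curr[0] - x_prev[0], 2.0 * x_curr[1] - x_prev[1])
    z = z_curr
    for _ in range(iters):
        x, j = decode(z), decode_jac(z)
        inertia = (x[0] - x_tilde[0]) * j[0] + (x[1] - x_tilde[1]) * j[1]
        # Gradient of the objective: inertia term plus gravity (dy/dz = j[1]).
        grad = MASS / H**2 * inertia + MASS * G * j[1]
        # Gauss-Newton curvature: ignores the decoder's second derivative
        # in the inertia term, keeps the exact gravity term.
        hess = MASS / H**2 * (j[0] ** 2 + j[1] ** 2) + MASS * G * R * math.cos(z)
        z -= grad / hess
    return z


def simulate(z0, steps):
    """Release the pendulum from rest at latent angle z0; return the latent trajectory."""
    zs = [z0, z0]
    for _ in range(steps):
        zs.append(step(zs[-2], zs[-1]))
    return zs[1:]
```

Because the unknown is the latent coordinate rather than the full-space position, every state the solver visits lies on the decoder's manifold by construction; here that means the pendulum length is preserved without an explicit constraint.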
- Learning an Intrinsic Garment Space for Interactive Authoring of Garment Animation SIGGRAPH Asia Academic 77 cites
Encodes garment deformations in a low-dimensional intrinsic space learned from simulation, enabling interactive authoring and blending of garment animations.
abstract
Authoring dynamic garment shapes for character animation on body motion is one of the fundamental steps in the CG industry. Established workflows are either time and labor consuming (i.e., manual editing on dense frames with controllers), or lack keyframe-level control (i.e., physically-based simulation). Not surprisingly, garment authoring remains a bottleneck in many production pipelines. Instead, we present a deep-learning-based approach for semi-automatic authoring of garment animation, wherein the user provides the desired garment shape in a selection of keyframes, while our system infers a latent representation for its motion-independent intrinsic parameters (e.g., gravity, cloth materials, etc.). Given new character motions, the latent representation allows us to automatically generate a plausible garment animation at interactive rates. Having factored out character motion, the learned intrinsic garment space enables smooth transitions between keyframes on a new motion sequence. Technically, we learn an intrinsic garment space with a motion-driven autoencoder network, where the encoder maps the garment shapes to the intrinsic space under the condition of body motions, while the decoder acts as a differentiable simulator to generate garment shapes according to changes in character body motion and intrinsic parameters.
Related PBNS: Physically Based Neural Simulation for Unsupervised Garment Pose Space Deformation · NeuroSkinning: Automatic Skin Binding for Production Characters with Deep Graph Networks · Physics-Inspired Upsampling for Cloth Simulation in Games · Strain Based Dynamics
how to read this
- Category
- Method: learning-based garment animation authoring
- Contributions
- Learns an intrinsic garment space with a motion-driven autoencoder, factoring out body motion from motion-independent parameters such as gravity and cloth material
- Lets artists specify garment shape on a few keyframes and infers a latent intrinsic representation for the rest
- Generates plausible garment animation on new character motion at interactive rates with smooth keyframe-to-keyframe transitions
- Context
- Addresses the keyframe-control-versus-simulation tradeoff in garment authoring, in the same learning-based cloth lineage as Santesteban et al.'s 'Learning-Based Animation of Clothing for Virtual Try-On' (2019). Builds on: Learning-Based Animation of Clothing for Virtual Try-On
- Correctness
- The method assumes garment behavior can be disentangled into body-motion-conditioned and motion-independent intrinsic factors learned from simulation; this enables interactive authoring but ties output quality and plausibility to the coverage and realism of the training simulations.
- Clarity
- Accessible framing of the authoring problem; a first pass conveys the keyframe-plus-latent idea, with the encoder conditioning details warranting a second pass.
- How to read it
- Read first for the authoring workflow and the motion-conditioned autoencoder concept; revisit the network design and the intrinsic-versus-motion factorization if you care about controllability or reproducing the system.
CFX
-
Parametric physics-based controller generalizes locomotion across characters with different heights, weights, and body proportions.
abstract
Recently, deep reinforcement learning (DRL) has attracted great attention in designing controllers for physics-based characters. Despite the recent success of DRL, the learned controller is viable for a single character. Changes in body size and proportions require learning controllers from scratch. In this paper, we present a new method of learning parametric controllers for body shape variation. A single parametric controller enables us to simulate and control various characters having different heights, weights, and body proportions. The users are allowed to create new characters through body shape parameters, and they can control the characters immediately. Our characters can also change their body shapes on the fly during simulation. The key to the success of our approach includes the adaptive sampling of body shapes that tackles the challenges in learning parametric controllers, which relies on the marginal value function that measures control capabilities of body shapes. We demonstrate parametric controllers for various physically simulated characters such as bipeds, quadrupeds, and underwater animals.
Related DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills · Physics-Based Motion Retargeting from Sparse Inputs · Physics-Based Character Controllers Using Conditional VAEs · Character Controllers Using Motion VAEs
how to read this
- Category
- Method: parametric DRL controller for body-shape variation
- Contributions
- Learns a single parametric physics-based controller that generalizes locomotion across heights, weights and body proportions
- Lets users author new characters via body-shape parameters and control them immediately, including changing shape on the fly during simulation
- Introduces adaptive sampling of body shapes guided by a marginal value function to make parametric control learnable
- Context
- Extends deep-reinforcement-learning character control, building directly on Peng et al.'s 'DeepMimic' (2018), generalizing from a single-character controller to a body-shape-parametric one. Builds on: DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills
- Correctness
- The key assumption is that one controller can span a continuous body-shape space if shapes are sampled adaptively by control capability; it is demonstrated on bipeds, quadrupeds and underwater animals, though performance still hinges on the sampling strategy and the chosen shape parameterization.
- Clarity
- Clear motivation, with depth in the DRL machinery; a first pass conveys the goal, while the adaptive sampling and marginal value function need a second pass.
- How to read it
- First pass for the parametric-controller idea and why naive training fails; second pass on the adaptive body-shape sampling and marginal value function, which are the technical heart.
Motion Synthesis
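The adaptive body-shape sampling described in the entry above can be illustrated with a small sketch. The abstract says only that sampling relies on a marginal value function measuring control capability; the exponential weighting, the sharpness parameter `k`, and the function names `sampling_probs` and `sample_shape` are assumptions made here for illustration, not the authors' formulation.

```python
import math
import random


def sampling_probs(marginal_values, k=5.0):
    """Turn per-shape marginal values into sampling probabilities.

    Shapes with a below-average value (harder to control) receive more
    probability mass. Assumes all values are positive.
    """
    mean_v = sum(marginal_values) / len(marginal_values)
    weights = [math.exp(-k * (v / mean_v - 1.0)) for v in marginal_values]
    total = sum(weights)
    return [w / total for w in weights]


def sample_shape(shapes, marginal_values, rng=random, k=5.0):
    """Draw one body shape for the next training episode."""
    probs = sampling_probs(marginal_values, k)
    return rng.choices(shapes, weights=probs, k=1)[0]
```

For example, `sample_shape(["short", "average", "tall"], [0.9, 0.5, 0.2])` would pick the hypothetical "tall" shape most often, since its marginal value is lowest. Setting `k` to 0 recovers uniform sampling.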
-
Learned latent representation that disentangles motion from skeleton and camera view, enabling video-to-video motion retargeting without 3D reconstruction.
abstract
Analyzing human motion is a challenging task with a wide variety of applications in computer vision and in graphics. One such application, of particular importance in computer animation, is the retargeting of motion from one performer to another. While humans move in three dimensions, the vast majority of human motions are captured using video, requiring 2D-to-3D pose and camera recovery, before existing retargeting approaches may be applied. In this paper, we present a new method for retargeting video-captured motion between different human performers, without the need to explicitly reconstruct 3D poses and/or camera parameters. In order to achieve our goal, we learn to extract, directly from a video, a high-level latent motion representation, which is invariant to the skeleton geometry and the camera view. Our key idea is to train a deep neural network to decompose temporal sequences of 2D poses into three components: motion, skeleton, and camera view-angle. Having extracted such a representation, we are able to re-combine motion with novel skeletons and camera views, and decode a retargeted temporal sequence, which we compare to a ground truth from a synthetic dataset. We demonstrate that our framework can be used to robustly extract human motion from videos, bypassing 3D reconstruction, and outperforming existing retargeting methods, when applied to videos in-the-wild.
Related Contact-Aware Retargeting of Skinned Motion · Normalized Euclidean Distance Matrices for Human Motion Retargeting · Dog Code: Human to Quadruped Embodiment Using Shared Codebooks · Skeleton-Aware Networks for Deep Motion Retargeting
how to read this
- Category
- Method: deep learning for 2D video motion retargeting
- Contributions
- Retargets video-captured motion between performers without explicit 3D pose or camera reconstruction
- Trains a network to decompose 2D pose sequences into three disentangled components: motion, skeleton, and camera view-angle
- Re-combines extracted motion with novel skeletons and camera views to decode a retargeted sequence, evaluated against synthetic ground truth
- Context
- Reframes the classic retargeting problem of Gleicher's 'Retargeting Motion to New Characters' (1998) in 2D video space via a learned, skeleton- and view-invariant latent motion representation. Builds on: Retargeting Motion to New Characters
- Correctness
- The core assumption is that 2D pose sequences cleanly factor into motion, skeleton, and view; quantitative evaluation uses synthetic data with known ground truth, so reported accuracy is most trustworthy in-distribution and real-video generalization should be read with caution.
- Clarity
- Well-motivated and accessible; a first pass conveys the three-way disentanglement, with the network architecture and training rewarding a second pass.
- How to read it
- First pass for the motion/skeleton/view decomposition idea and why it avoids 3D recovery; second pass on the architecture and synthetic-data evaluation to judge how far it transfers to real footage.
Retargeting / ML Deformation
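The recombination step in the entry above, where extracted motion is paired with a different skeleton and camera view, can be sketched with an analytic stand-in for the learned decoder. Everything here is an assumption for illustration: motion is represented as unit bone directions per frame, the skeleton as bone lengths of a single chain, the view as a rotation about the vertical axis, and the camera as orthographic. The paper learns these components with a neural network; the function name `decode_pose_sequence` is hypothetical.

```python
import math


def decode_pose_sequence(motion, bone_lengths, view_angle):
    """Combine motion, skeleton and view into a sequence of 2D joint positions.

    motion: list of frames, each a list of unit 3D directions (one per bone).
    bone_lengths: length of each bone in the chain, rooted at the origin.
    view_angle: camera rotation about the vertical (y) axis, in radians.
    """
    c, s = math.cos(view_angle), math.sin(view_angle)
    frames = []
    for directions in motion:
        joint = (0.0, 0.0, 0.0)
        joints_2d = [(0.0, 0.0)]
        for direction, length in zip(directions, bone_lengths):
            joint = tuple(j + length * d for j, d in zip(joint, direction))
            # Rotate about y, then drop depth (orthographic projection).
            joints_2d.append((c * joint[0] + s * joint[2], joint[1]))
        frames.append(joints_2d)
    return frames


# Retargeting in this toy setting: keep performer A's motion, swap in
# performer B's bone lengths and a new view angle.
motion_a = [[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]]
retargeted = decode_pose_sequence(motion_a, bone_lengths=[2.0, 1.0], view_angle=0.0)
```

The hard part the paper addresses is the inverse direction, recovering the three components from 2D poses alone; this sketch only shows why a clean factorization makes recombination trivial.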
-
Recurrent neural network predicts garment drape and wrinkles as a function of body shape and dynamics in a few milliseconds.
abstract
This paper presents a learning‐based clothing animation method for highly efficient virtual try‐on simulation. Given a garment, we preprocess a rich database of physically‐based dressed character simulations, for multiple body shapes and animations. Then, using this database, we train a learning‐based model of cloth drape and wrinkles, as a function of body shape and dynamics. We propose a model that separates global garment fit, due to body shape, from local garment wrinkles, due to both pose dynamics and body shape. We use a recurrent neural network to regress garment wrinkles, and we achieve highly plausible nonlinear effects, in contrast to the blending artifacts suffered by previous methods. At runtime, dynamic virtual try‐on animations are produced in just a few milliseconds for garments with thousands of triangles. We show qualitative and quantitative analysis of results.
Related GarMatNet: A Learning-Based Method for Predicting 3D Garment Mesh with Parameterized Materials · PBNS: Physically Based Neural Simulation for Unsupervised Garment Pose Space Deformation · SMPLicit: Topology-aware Generative Model for Clothed People · SNUG: Self-Supervised Neural Dynamic Garments
how to read this
- Category
- Method: learning-based clothing animation for virtual try-on
- Contributions
- Trains a learned cloth model from a database of physically based dressed-character simulations across many body shapes and animations
- Separates global garment fit (from body shape) from local wrinkles (from pose dynamics and body shape)
- Uses a recurrent network to regress nonlinear wrinkles and produces try-on animations in a few milliseconds, avoiding prior blending artifacts
- Context
- Combines a parametric body model with learned cloth deformation, building on Loper et al.'s 'SMPL' (2015) and the real-time clothing-space idea of de Aguiar et al.'s 'Stable Spaces for Real-time Clothing' (2010). Builds on: SMPL: A Skinned Multi-Person Linear Model · Stable Spaces for Real-time Clothing
- Correctness
- The fit-versus-wrinkle decomposition and RNN dynamics give plausible, fast results, but quality is bounded by the precomputed simulation database covering the relevant body shapes and motions, and it targets a fixed garment per trained model.
- Clarity
- Clear structure and well-motivated; a first pass conveys the fit/wrinkle split, while the RNN formulation and training setup reward a second pass.
- How to read it
- First pass for the global-fit-plus-local-wrinkle decomposition and the millisecond runtime claim; second pass on the RNN and the simulation-database construction if you plan to train or extend it.
CFX / ML Deformation
-
Will Telford introduces the Proximity Pin node in Maya 2020, which tracks a transform to the closest point on a deforming surface to support facial rigging and secondary attachment workflows.
abstract
This Autodesk demo introduces the Proximity Pin node in Maya 2020, which tracks a transform to the closest point on a deforming surface and, unlike UV pin, does not require binding in the rest pose because the pin node is aware of the original geometry. Using an animated character with a run cycle, the presenter pins a locator to the knee, then slides the rest relationship up the thigh or down the shin and scrubs independent translation and orientation offsets between zero and one to maintain distance or alignment through the deforming range. It also shows feeding multiple locator inputs into a single shared proximity pin node so that offset settings apply to all outputs simultaneously.
Related Group Based Rigging of Realistically Feathered Wings · Sparse Rig Parameter Optimization for Character Animation · ChopRig System · Maya 2017 Update 3: Tension Deformer and Bake Deformer Tool
how to read this
- Category
- Production talk / tool demo (Maya 2020 Proximity Pin node)
- Contributions
- Introduces the Proximity Pin node that tracks a transform to the closest point on a deforming surface without needing a rest-pose bind, since it is aware of the original geometry
- Demonstrates sliding the rest relationship along a limb and scrubbing independent translation/orientation offsets to maintain distance or alignment through deformation
- Shows feeding multiple locator inputs into one shared pin node so offset settings apply to all outputs at once
- Context
- A rigging-tool companion to Maya's proximity-based deformation features, presented alongside the 'Maya 2020 Proximity Wrap Deformer' (2019) for facial rigging and secondary attachment workflows. Builds on: Maya 2020 | Proximity Wrap Deformer
- Correctness
- Vendor demo, not peer-reviewed; behavior is product-proven but tied to Maya 2020 node semantics, and closest-point tracking assumes well-behaved surface geometry where the nearest point stays meaningful through deformation.
- Clarity
- Very accessible and concrete; a single watch conveys what the node does and how it differs from UV pin.
- How to read it
- Watch once to learn when to reach for Proximity Pin over UV pin; revisit only the offset-scrubbing and multi-locator portions when wiring it into a specific rig.
Rigging
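The closest-point tracking that the Proximity Pin entry describes can be approximated outside Maya with a short sketch. It binds a point to the nearest triangle of a reference mesh, stores barycentric coordinates, and re-evaluates them on the deformed mesh. This is not Maya's implementation and it omits the node's translation and orientation offsets; the function names are hypothetical, and the closest-point routine follows the standard region-based test for triangles.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def blend(bary, a, b, c):
    """Point with barycentric coordinates bary on triangle (a, b, c)."""
    return tuple(bary[0] * a[i] + bary[1] * b[i] + bary[2] * c[i] for i in range(3))


def closest_bary(p, a, b, c):
    """Barycentric coordinates of the point on triangle (a, b, c) closest to p."""
    ab, ac, ap = sub(b, a), sub(c, a), sub(p, a)
    d1, d2 = dot(ab, ap), dot(ac, ap)
    if d1 <= 0.0 and d2 <= 0.0:
        return (1.0, 0.0, 0.0)  # vertex a
    bp = sub(p, b)
    d3, d4 = dot(ab, bp), dot(ac, bp)
    if d3 >= 0.0 and d4 <= d3:
        return (0.0, 1.0, 0.0)  # vertex b
    vc = d1 * d4 - d3 * d2
    if vc <= 0.0 and d1 >= 0.0 and d3 <= 0.0:
        v = d1 / (d1 - d3)
        return (1.0 - v, v, 0.0)  # edge ab
    cp = sub(p, c)
    d5, d6 = dot(ab, cp), dot(ac, cp)
    if d6 >= 0.0 and d5 <= d6:
        return (0.0, 0.0, 1.0)  # vertex c
    vb = d5 * d2 - d1 * d6
    if vb <= 0.0 and d2 >= 0.0 and d6 <= 0.0:
        w = d2 / (d2 - d6)
        return (1.0 - w, 0.0, w)  # edge ac
    va = d3 * d6 - d5 * d4
    if va <= 0.0 and (d4 - d3) >= 0.0 and (d5 - d6) >= 0.0:
        w = (d4 - d3) / ((d4 - d3) + (d5 - d6))
        return (0.0, 1.0 - w, w)  # edge bc
    denom = 1.0 / (va + vb + vc)
    v, w = vb * denom, vc * denom
    return (1.0 - v - w, v, w)  # interior


def bind_pin(point, verts, tris):
    """Find the closest triangle on the reference mesh; return (triangle index, bary)."""
    best = None
    for t, (i, j, k) in enumerate(tris):
        bary = closest_bary(point, verts[i], verts[j], verts[k])
        delta = sub(point, blend(bary, verts[i], verts[j], verts[k]))
        dist_sq = dot(delta, delta)
        if best is None or dist_sq < best[0]:
            best = (dist_sq, t, bary)
    return best[1], best[2]


def eval_pin(binding, verts, tris):
    """Position of the pinned point on the (possibly deformed) mesh."""
    t, bary = binding
    i, j, k = tris[t]
    return blend(bary, verts[i], verts[j], verts[k])
```

A typical use would call `bind_pin` once against the reference geometry and `eval_pin` every frame against the deformed vertices, which mirrors the entry's point that the pin remembers its relationship to the original surface.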
-
Senior Maya Product Owner Will Telford demonstrates the GPU-accelerated Proximity Wrap deformer introduced in Maya 2020, which binds geometry by proximity without requiring a rest pose and is topology-independent.
abstract
This Autodesk demo shows the Proximity Wrap deformer in Maya 2020 driving a character head with a modeled cage, highlighting that it is topology independent and can bind geometry in a non-rest state. The presenter creates the proximity wrap from the deform menu, adds the face mesh as driven geometry and the cage as a driver, then tunes settings such as the bind distance, smoothing of incoming influences, normal smoothing and soft normalization to clean up the deformation. It also demonstrates adding and removing eyebrow geometry from the deformation on the fly and swapping between high and low resolution cages through a choice node in the node editor without rebinding.
Related Maya 2022: New Features for Rigging · Maya 2017 Update 3: Tension Deformer and Bake Deformer Tool · Rumba Rig: A Procedural Rigging Framework with Direct Graph-Based Control · Digital Humans: Inside Epic's MetaHuman Creator
how to read this
- Category
- Production talk / tool demo (Maya 2020 Proximity Wrap deformer)
- Contributions
- Demonstrates the GPU-accelerated, topology-independent Proximity Wrap deformer binding a head mesh to a modeled cage without requiring a rest pose
- Tunes bind distance, influence smoothing, normal smoothing and soft normalization to clean up the deformation
- Adds/removes eyebrow geometry on the fly and swaps high- and low-resolution cages via a choice node without rebinding
- Context
- A practical realization of cage- and proximity-driven deformation, conceptually rooted in pose- and example-driven deformation work such as Lewis et al.'s 'Pose Space Deformation' (2000). Builds on: Pose Space Deformation: A Unified Approach to Shape Interpolation and Skeleton-Driven Deformation
- Correctness
- Vendor demo, not peer-reviewed; the deformer is product-proven and GPU-accelerated, but quality depends on cage placement and the proximity/normalization settings, and non-rest binding assumes the driver cage adequately envelops the driven mesh.
- Clarity
- Accessible and hands-on; one watch conveys the workflow and the key parameters.
- How to read it
- Watch once for the bind-without-rest-pose capability and the parameter roles; revisit the smoothing/normalization and cage-swapping sections when tuning your own proximity-wrap setup.
Skinning / Rigging
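A proximity-based bind of the kind the entry above demonstrates can be sketched in a few lines. This is a simplified illustration, not Maya's Proximity Wrap algorithm: it weights driver points within a bind distance using an assumed quadratic falloff and moves each driven point by the weighted driver displacement. It ignores normals, smoothing, soft normalization and GPU evaluation, and the function names are hypothetical.

```python
import math


def bind_by_proximity(driven_pts, driver_pts, bind_distance):
    """For each driven point, weight every driver point within bind_distance."""
    bindings = []
    for p in driven_pts:
        weights = {}
        for idx, q in enumerate(driver_pts):
            d = math.dist(p, q)
            if d < bind_distance:
                # Assumed falloff: full influence at distance 0, none at bind_distance.
                weights[idx] = (1.0 - d / bind_distance) ** 2
        total = sum(weights.values())
        # Normalize so the influences on each driven point sum to 1.
        bindings.append({i: w / total for i, w in weights.items()} if total > 0.0 else {})
    return bindings


def deform(driven_pts, driver_at_bind, driver_now, bindings):
    """Move each driven point by the weighted displacement of its drivers."""
    result = []
    for p, weights in zip(driven_pts, bindings):
        offset = [0.0, 0.0, 0.0]
        for idx, w in weights.items():
            for axis in range(3):
                offset[axis] += w * (driver_now[idx][axis] - driver_at_bind[idx][axis])
        result.append(tuple(p[axis] + offset[axis] for axis in range(3)))
    return result
```

The bind uses whatever pose the driver is in when `bind_by_proximity` runs, which is the sense in which this sketch, like the demo, does not depend on a dedicated rest pose. Driven points with no driver inside the bind distance are left where they are.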
-
Extends MPC's proprietary Furtility grooming software with a geometry-based, mesh-driven feather system, cutting the time to finalize a hero feathered character from months to weeks.
abstract
MPC's proprietary grooming software Furtility has long created hair, fur and feathers for film characters, but feather grooms were time consuming and technically challenging. This work extends the toolset with a geometry-based, mesh-driven feather system that streamlines the workflow, reducing the time to finalize a hero feathered character from months to weeks.
Related Apteryx: Procedural Generation, Sculpting and Grooming of Feathers · Hummingbird: DreamWorks Feather System · A Modernization of the DreamWorks Feather System · Feathers for Mystical Creatures: Pegasus
how to read this
- Category
- Production talk / pipeline tool: a mesh-driven feather grooming system
- Contributions
- Extends MPC's Furtility grooming toolset with a geometry-based, mesh-driven feather system
- Streamlines the feather grooming workflow, reducing hero feathered-character finalization from months to weeks
- Context
- Builds on MPC's in-house Furtility grooming software (hair, fur, feathers) and on prior feather work for film creatures such as 'Feathers for Mystical Creatures: Pegasus' (2010). Builds on: Feathers for Mystical Creatures: Pegasus
- Correctness
- Studio practice rather than peer-reviewed; the time-saving claim is production-proven on hero characters but is specific to MPC's pipeline and tooling, so generalization to other studios is not established.
- Clarity
- Accessible workflow-oriented write-up; a single first pass conveys the approach and the artist-time payoff.
- How to read it
- Read once for the workflow and the mesh-driven idea, focusing on how geometry drives the groom and where artist time is saved; no deep second pass is needed unless you are building a similar grooming tool.
CFX
- talk ML Tutorial Day: From Motion Matching to Motion Synthesis, and All the Hurdles In Between GDC Industrial
EA covered the gap between motion matching and full synthesis, detailing Phase-Functioned and Mode-Adaptive Neural Networks as practical next steps for animation system design in games.
Motion Synthesis / ML Deformation
-
GAN framework maps speech to 3D conversational gesture motion using separate adversaries for dynamics, joint plausibility, and diversity.
abstract
Applications for conversational virtual agents are on the rise, but producing realistic non-verbal behavior for spoken utterances remains an unsolved problem. We explore the use of a generative adversarial training paradigm to map speech to 3D gesture motion. We define the gesture generation problem as a series of smaller sub-problems, including plausible gesture dynamics, realistic joint configurations, and diverse and smooth motion. Each sub-problem is monitored by separate adversaries. For the problem of enforcing realistic gesture dynamics in our output, we train a classifier to automatically detect gesture phases. We find adversarial training to be superior to the use of a standard regression loss and discuss the benefit of each of our training objectives. We recorded a dataset of over 6 hours of natural, unrehearsed speech with high-quality motion capture, as well as audio and video recording.
Related Robust Motion In-Betweening · ZeroEGGS: Zero-Shot Example-Based Gesture Generation from Speech · MoGlow: Probabilistic and Controllable Motion Synthesis Using Normalising Flows · C·ASE: Learning Conditional Adversarial Skill Embeddings for Physics-based Characters
how to read this
- Category
- Method: a GAN-based speech-to-gesture motion generation framework
- Contributions
- Maps speech to 3D conversational gesture motion using a multi-objective adversarial training paradigm
- Decomposes generation into sub-problems (gesture dynamics, joint plausibility, diversity) each monitored by a separate adversary, including a learned gesture-phase classifier
- Records over 6 hours of natural unrehearsed speech with high-quality motion capture plus audio and video
- Context
- Relates to data-driven non-verbal behavior synthesis for conversational virtual agents, framing speech-to-gesture as an adversarial (GAN) alternative to standard regression losses.
- Correctness
- Authors report adversarial training as superior to a regression baseline and discuss each objective's benefit; assessment of gesture realism is inherently subjective and tied to their own captured corpus, so claims should be read as comparative within that setup.
- Clarity
- Reasonably accessible if familiar with GANs; a first pass conveys the multi-adversary decomposition, while a second pass is needed for the per-adversary losses and phase classifier.
- How to read it
- First pass for the problem decomposition and why multiple adversaries; do a second pass on the adversary definitions and ablation discussion if you care about why adversarial beats regression here.
Motion Synthesis / Facial
-
Neural state machine for character-scene interaction synthesis, generating contextual motions such as sitting, picking up objects, and navigating environments.
abstract
We propose Neural State Machine, a novel data-driven framework to guide characters to achieve goal-driven actions with precise scene interactions. Even a seemingly simple task such as sitting on a chair is notoriously hard to model with supervised learning. This difficulty is because such a task involves complex planning with periodic and non-periodic motions reacting to the scene geometry to precisely position and orient the character. Our proposed deep auto-regressive framework enables modeling of multi-modal scene interaction behaviors purely from data. Given high-level instructions such as the goal location and the action to be launched there, our system computes a series of movements and transitions to reach the goal in the desired state. To allow characters to adapt to a wide range of geometry such as different shapes of furniture and obstacles, we incorporate an efficient data augmentation scheme to randomly switch the 3D geometry while maintaining the context of the original motion. To increase the precision to reach the goal during runtime, we introduce a control scheme that combines egocentric inference and goal-centric inference.
Related Local Motion Phases for Learning Multi-Contact Character Movements · Phase-Functioned Neural Networks for Character Control · Learned Motion Matching · Mode-Adaptive Neural Networks for Quadruped Motion Control
how to read this
- Category
- Method: a data-driven neural framework for character-scene interaction synthesis
- Contributions
- Proposes the Neural State Machine, a deep auto-regressive framework that synthesizes goal-driven motions with precise scene interactions (sitting, picking up objects, navigation)
- Introduces a data augmentation scheme that randomly switches 3D geometry while preserving motion context, so characters adapt to varied furniture and obstacles
- Adds a runtime control scheme combining egocentric and goal-centric inference to improve goal-reaching precision
- Context
- Builds on data-driven neural motion controllers, specifically the Mode-Adaptive Neural Networks for quadruped motion control (Zhang et al. 2018), extending that lineage to scene-aware human interactions. Builds on: Mode-Adaptive Neural Networks for Quadruped Motion Control
- Correctness
- Demonstrated on multi-modal interaction behaviors learned purely from data; precise contact and goal-state placement is the hard part the authors target, and quality depends on captured interaction data and the augmentation covering the target scene geometry.
- Clarity
- Dense but well-motivated; a first pass conveys the goal-driven interaction idea, while a second pass is needed for the auto-regressive architecture and the dual-inference control.
- How to read it
- First pass for the interaction-synthesis idea and the geometry-augmentation trick; second pass on the network structure and egocentric/goal-centric blending if you intend to reimplement or build scene-aware controllers.
Motion Synthesis
- NeuroSkinning: Automatic Skin Binding for Production Characters with Deep Graph Networks SIGGRAPH Academic 53 cites
Deep graph network for automatic skinning weight prediction generalizing to production characters from learned geometric and topological features.
abstract
We present a deep-learning-based method to automatically compute skin weights for skeleton-based deformation of production characters. Given a character mesh and its associated skeleton hierarchy in rest pose, our method constructs a graph for the mesh, each node of which encodes the mesh-skeleton attributes of a vertex. An end-to-end deep graph convolution network is then introduced to learn the mesh-skeleton binding patterns from a set of character models with skin weights painted by artists. The network can be used to predict the skin weight map for a new character model, which describes how the skeleton hierarchy influences the mesh vertices during deformation. Our method is designed to work for non-manifold meshes with multiple disjoint or intersected components, which are common in game production and require complex skeleton hierarchies for animation control. We tested our method on the datasets of two commercial games. Experiments show that the predicted skin weight maps can be readily applied to characters in the production pipeline to generate high-quality deformations.
Related Geodesic Voxel Binding for Production Character Meshes · NiLBS: Neural Inverse Linear Blend Skinning · Learning Skeletal Articulations with Neural Blend Shapes · A Neural Network Model for Efficient Musculoskeletal-Driven Skin Deformation
how to read this
- Category
- Method: deep graph network for automatic skinning-weight prediction
- Contributions
- Builds a graph over a character mesh whose nodes encode per-vertex mesh-skeleton attributes
- Introduces an end-to-end graph convolution network that learns artist-painted skinning patterns and predicts skin-weight maps for new characters
- Designed to handle non-manifold meshes with disjoint or intersecting components and complex skeleton hierarchies common in game production
- Context
- Relates to skinning-weight computation in production rigging and to skinning decomposition work such as Smooth Skinning Decomposition with Rigid Bones (Le and Deng 2012), recasting weight assignment as a learned graph-convolution problem. Builds on: Smooth Skinning Decomposition with Rigid Bones
- Correctness
- Validated on the datasets of two commercial games with weights painted by artists; quality of predictions is bounded by the training characters' style and topology, so transfer to very different rigs or art styles is not guaranteed.
- Clarity
- Accessible if familiar with graph neural networks and skinning; a first pass conveys the formulation, while a second pass clarifies the graph construction and node features.
- How to read it
- First pass for the graph formulation and the production framing (non-manifold, multi-component meshes); second pass on node-feature design and the network if you plan to apply or extend it to your own rigs.
Skinning / ML Deformation
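The graph formulation in the NeuroSkinning entry above can be illustrated with a minimal, untrained sketch: one mean-aggregation graph convolution over mesh vertices followed by a per-vertex softmax, so the predicted weights for each vertex are non-negative and sum to 1. The node feature used here (negative distance to each joint) and all function names are assumptions; the paper's network is deep, trained end to end on artist-painted weights, and uses richer mesh-skeleton attributes.

```python
import math


def node_features(verts, joints):
    """Per-vertex feature vector: negative distance to each joint (assumed feature)."""
    return [[-math.dist(v, j) for j in joints] for v in verts]


def graph_conv(features, edges):
    """One parameter-free layer: average each node's features with its neighbours'."""
    n = len(features)
    neighbours = [{i} for i in range(n)]  # include a self-loop for every node
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    dim = len(features[0])
    return [
        [sum(features[j][d] for j in neighbours[i]) / len(neighbours[i]) for d in range(dim)]
        for i in range(n)
    ]


def softmax(row):
    """Numerically stable softmax over one vertex's per-joint scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def predict_skin_weights(verts, edges, joints):
    """Skin-weight map: one row per vertex, one column per joint."""
    hidden = graph_conv(node_features(verts, joints), edges)
    return [softmax(row) for row in hidden]
```

A trained model would replace the parameter-free aggregation with learned layers, but the output contract stays the same: one normalized weight vector per vertex, ready to feed a skeleton-driven deformer.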
-
Multi-view imaging system and parametric rig estimate person-specific eyeball shape, rotation center, and visual axis at submillimeter accuracy.
abstract
We present a novel parametric eye rig for eye animation, including a new multi‐view imaging system that can reconstruct eye poses at submillimeter accuracy to which we fit our new rig. This allows us to accurately estimate person‐specific eyeball shape, rotation center, interocular distance, visual axis, and other rig parameters resulting in an animation‐ready eye rig. We demonstrate the importance of several aspects of eye modeling that are often overlooked, for example that the visual axis is not identical to the optical axis, that it is important to model rotation about the optical axis, and that the rotation center of the eye should be measured accurately for each person. Since accurate rig fitting requires hand annotation of multi‐view imagery for several eye gazes, we additionally propose a more user‐friendly “lightweight” fitting approach, which leverages an average rig created from several pre‐captured accurate rigs. Our lightweight rig fitting method allows for the estimation of eyeball shape and eyeball position given only a single pose with a known look‐at point (e.g. looking into a camera) and few manual annotations.
how to read this
- Category
- Capture system and parametric model: a person-specific eye rig
- Contributions
- Presents a parametric eye rig plus a multi-view imaging system that reconstructs eye poses at submillimeter accuracy
- Accurately estimates person-specific eyeball shape, rotation center, interocular distance, and visual axis, stressing that visual and optical axes differ and that rotation about the optical axis matters
- Proposes a lightweight fitting variant using an average rig that needs only a single known look-at pose and few manual annotations
- Context
- Builds on production facial performance capture, notably the Medusa system (Beeler et al. 2012), extending high-accuracy capture and parametric fitting to the eyeball region. Builds on: Medusa: A Production-Ready Photoreal Facial Performance Capture System
- Correctness
- Demonstrated to submillimeter accuracy via multi-view capture; accurate fitting depends on hand annotation across several gazes, and the lightweight variant trades accuracy for an average-rig prior, a limitation the authors acknowledge.
- Clarity
- Clear and well-motivated, with the often-overlooked eye-anatomy points made explicit; a first pass conveys the model, a second pass is needed for the fitting math.
- How to read it
- First pass for the rig parameterization and the anatomical insights (visual vs optical axis, rotation center); second pass on the fitting procedure and the lightweight approximation if you build or evaluate eye rigs.
Rigging / Facial
-
This work presents a method for procedurally generating biologically driven geometry to model feathers for computer graphics.
abstract
This work presents a method for procedurally generating biologically driven geometry to model feathers for computer graphics. Parameters and structure are derived from a variety of real-world feather specimens so that the resulting models follow natural feather morphology. The approach produces varied, plausible feather geometry suitable for use across bird and non-avian dinosaur reconstructions.
Related A Biologically-Parameterized Feather Model · Modeling and Rendering of Realistic Feathers · Biological Modeling of Feathers by Morphogenesis Simulation · Microstructure-based Appearance Rendering for Feathers
how to read this
- Category
- Method: procedural, biologically driven feather geometry generation
- Contributions
- Presents a method for procedurally generating biologically driven feather geometry for computer graphics
- Derives parameters and structure from real-world feather specimens so models follow natural feather morphology
- Produces varied, plausible feathers suitable for bird and non-avian dinosaur reconstructions
- Context
- Builds on biologically parameterized feather modeling, specifically the lineage of A Biologically-Parameterized Feather Model (Streit and Heidrich 2002), grounding the geometry in real specimen data. Builds on: A Biologically-Parameterized Feather Model
- Correctness
- Grounded in real feather specimens to drive morphology; plausibility is the stated goal rather than measured biological fidelity, so results should be read as visually and morphologically plausible rather than validated against ground-truth anatomy.
- Clarity
- Accessible; a first pass conveys the procedural approach and the specimen-driven parameters without needing heavy formalism.
- How to read it
- First pass for the parameterization and how specimen data maps to geometry; revisit specific sections only if generating feathers for a particular species or reconstruction.
CFX
-
First real-time physics-based facial animation method driven by performance capture, using projective dynamics on volumetric skin.
abstract
We present the first realtime method for generating facial animations enhanced by physical simulation from realtime performance capture data. Unlike purely data‐based techniques, our method is able to produce physical effects on the fly through the simulation of volumetric skin behaviour, lip contacts and sticky lips. It remains however practical as it does not require any physical/medical data which are complex to acquire and process, and instead relies only on the input of a blendshapes model. We achieve realtime performance on the CPU by introducing an efficient progressive Projective Dynamics solver to efficiently solve the physical integration steps even when confronted to constantly changing constraints. Also key to our realtime performance is a new Taylor approximation and memoization scheme for the computation of the Singular Value Decompositions required for the simulation of volumetric skin. We demonstrate the applicability of our method by animating blendshape characters from a simple webcam feed.
Related BlendForces: A Dynamic Framework for Facial Animation · High Fidelity Facial Animation Capture and Retargeting with Contours · Physically-based Sticky Lips · Example-Based Facial Rigging
how to read this
- Category
- Method: real-time physics-based facial animation from performance capture
- Contributions
- Presents the first real-time method enhancing performance-driven facial animation with physical simulation of volumetric skin, lip contacts, and sticky lips
- Introduces an efficient progressive Projective Dynamics solver to handle constantly changing constraints in real time on the CPU
- Adds a Taylor approximation and memoization scheme to speed up the SVD computations needed for volumetric skin simulation, driving blendshape characters from a simple webcam feed
- Context
- Builds on the authors' BlendForces dynamic facial-animation framework (Barrielle et al. 2016) and on blendshape-based capture, adding real-time physical simulation on top of a blendshapes model without requiring physical or medical data. Builds on: BlendForces: A Dynamic Framework for Facial Animation
- Correctness
- Relies only on a blendshapes model (no medical/physical data) and is demonstrated driving characters from webcam input; real-time CPU performance hinges on the progressive solver and SVD approximations, whose accuracy tradeoffs a reader should keep in mind.
- Clarity
- Technically dense in the solver sections; a first pass conveys the contributions and pipeline, a second pass is required for the Projective Dynamics and SVD-approximation details.
- How to read it
- First pass for what physical effects are added and the practical webcam-driven setup; second pass on the progressive Projective Dynamics solver and SVD memoization if real-time performance or the numerics matter to you.
Facial
-
Ubisoft ALICE studios showed how neural networks can fully automate motion capture cleanup and delivery, covering implementation challenges and practical integration into capture studio workflows.
Motion Synthesis / ML Deformation
-
Automatic method for repairing missing and noisy marker trajectories in mocap data, using a kinematic skeleton as a reference.
abstract
Processing motion capture data from optical markers for use in computer animations presents numerous technical challenges. Artifacts caused by noise, marker swaps, and marker occlusions often require manual intervention of a professionally trained marker tracking artist that spends large amounts of time and effort fixing these issues. Existing automatic solutions that attempt to fix marker data lack robustness due to either failing to properly detect and fix marker paths, or generating solutions that are challenging to integrate within current animation pipelines. In this paper, we present a method that robustly identifies invalid marker paths, removes the associated segments and generates new kinematically correct paths. We start by comparing the kinematic solutions generated by commercial software against the one generated by the state-of-the-art methods, using this information to determine which animation keyframes are invalid. Subsequently, we regenerate marker paths from the neural network based method [Holden 2018] and use a sophisticated marker filling algorithm to combine them with the original marker paths at sections where we detect the original data to be invalid. Our method outperforms alternatives by generating solutions that are both closer to the ground truth and more robust, allowing for manual intervention if required.
Related Dog Code: Human to Quadruped Embodiment Using Shared Codebooks · Motion Retargeting for Crowd Simulation · MarkerNet: A Divide-and-Conquer Solution to Motion Capture Solving From Raw Markers · Aura Mesh: Motion Retargeting to Preserve the Spatial Relationships between Skinned Characters
how to read this
- Category
- Method: automatic repair of optical mocap marker trajectories
- Contributions
- Robustly identifies invalid marker paths (noise, marker swaps, occlusions), removes the bad segments, and regenerates kinematically correct paths
- Detects invalid keyframes by comparing the kinematic solution from commercial software against a state-of-the-art neural method
- Combines neural-regenerated marker paths (Holden 2018) with original marker data via a marker-filling algorithm at detected invalid sections
- Context
- Relates to optical mocap cleanup and builds directly on neural marker-solving work (Holden 2018), using a kinematic skeleton reference to gate where regeneration is applied.
- Correctness
- Targets robustness over prior automatic methods and pipeline integrability; validity detection depends on agreement between commercial and neural kinematic solvers, so failure modes where both agree on a wrong solution are a reader caveat.
- Clarity
- Accessible to anyone familiar with mocap pipelines; a first pass conveys the detect-remove-regenerate strategy, a second pass clarifies the filling algorithm.
- How to read it
- First pass for the detect-and-repair pipeline and where it slots into existing tools; second pass on the invalid-keyframe detection and marker-filling steps if you process optical mocap.
Retargeting
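A rough sketch of the detect-and-fill step described in the entry above. It simplifies heavily: the paper flags invalid keyframes by comparing kinematic solutions, whereas this toy compares two versions of a single marker path directly, and the linear cross-fade is only a stand-in for the paper's marker-filling algorithm, which the abstract does not spell out. `invalid_frames`, `fill_marker`, `tol` and `blend` are hypothetical names and values chosen for illustration.

```python
import math


def invalid_frames(original, regenerated, tol):
    """Flag frames where the two versions of one marker path disagree by more than tol."""
    return [math.dist(a, b) > tol for a, b in zip(original, regenerated)]


def fill_marker(original, regenerated, tol=0.02, blend=2):
    """Replace invalid frames with the regenerated path, cross-fading at the edges.

    Assumption: a linear cross-fade over `blend` frames on each side of an
    invalid frame; this is not the filling algorithm from the paper.
    """
    bad = invalid_frames(original, regenerated, tol)
    n = len(original)
    weight = [0.0] * n  # 0 keeps the original sample, 1 uses the regenerated one
    for f, is_bad in enumerate(bad):
        if not is_bad:
            continue
        for g in range(max(0, f - blend), min(n, f + blend + 1)):
            w = 1.0 - abs(g - f) / (blend + 1)
            weight[g] = max(weight[g], w)
    return [
        tuple((1.0 - w) * o + w * r for o, r in zip(a, b))
        for a, b, w in zip(original, regenerated, weight)
    ]
```

Frames far from any flagged frame keep their original samples untouched, which matches the entry's point that regeneration is only applied where the original data is detected to be invalid.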
-
Two-level imitation learning drives a full-body model with 346 muscles to reproduce diverse locomotion skills at interactive rates.
abstract
Many anatomical factors, such as bone geometry and muscle condition, interact to affect human movements. This work aims to build a comprehensive musculoskeletal model and its control system that reproduces realistic human movements driven by muscle contraction dynamics. The variations in the anatomic model generate a spectrum of human movements ranging from typical to highly stylistic movements. To do so, we discuss scalable and reliable simulation of anatomical features, robust control of under-actuated dynamical systems based on deep reinforcement learning, and modeling of pose-dependent joint limits. The key technical contribution is a scalable, two-level imitation learning algorithm that can deal with a comprehensive full-body musculoskeletal model with 346 muscles. We demonstrate the predictive simulation of dynamic motor skills under anatomical conditions including bone deformity, muscle weakness, contracture, and the use of a prosthesis. We also simulate various pathological gaits and predictively visualize how orthopedic surgeries improve post-operative gaits.
Related Functionality-Driven Musculature Retargeting · Physical Based Motion Reconstruction From Videos Using Musculoskeletal Model · Generative GaitNet · Physics-Based Character Controllers Using Conditional VAEs
how to read this
- Category
- Method: muscle-actuated full-body musculoskeletal simulation and control
- Contributions
- Builds a comprehensive full-body musculoskeletal model driven by muscle contraction dynamics, with pose-dependent joint limits
- Introduces a scalable two-level imitation learning algorithm that controls a model with 346 muscles to reproduce diverse locomotion skills at interactive rates
- Predictively simulates motor skills under anatomical conditions (bone deformity, muscle weakness, contracture, prosthesis) and visualizes how orthopedic surgery affects gait
- Context
- Builds on example-guided deep reinforcement learning for physics-based character control, specifically DeepMimic (Peng et al. 2018), scaling that imitation-learning approach to a detailed muscle-actuated model. Builds on: DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills
- Correctness
- Demonstrated as predictive simulation across a spectrum of typical to pathological gaits; results are physics-and-anatomy-based predictions, not clinically validated outcomes, so the surgical/pathological visualizations should be read as illustrative rather than medically conclusive.
- Clarity
- Conceptually heavy (musculoskeletal modeling plus RL); a first pass conveys the scope and two-level scheme, deeper passes are needed for the control formulation.
- How to read it
- First pass for the model scope and the two-level imitation-learning idea; second pass on the control algorithm and muscle dynamics if you work on physics-based or biomechanical character control.
Muscles / Motion Synthesis
-
Scriptable, expression-driven data-manipulation tool used across hair, cloth, and final cleanup tasks in Disney technical animation.
abstract
We would like to present a scriptable interactive data manipulation tool, heavily used on Disney's Moana and Ralph Breaks the Internet. Its expression-driven interface makes it a versatile "Swiss Army Knife" for Technical Animation: a single tool with many functions, which could be applied to hair, cloth, and final cleanup tasks. Consequently, this provides a tremendous benefit for both developers and end users. The single code base can be maintained and upgraded efficiently. The artists, familiar with the tool in the context of one task, can take full advantage of the flexible interface and easily apply the tool to another task.
Related Choreography of Hair and Cloth in Disney's Moana 2 · Simulating Wind Effects on Cloth and Hair in Disney's Frozen · Art-Directing Asha's Braids in Disney's Wish · Anisotropic Elastoplasticity for Cloth, Knit and Hair Frictional Contact
how to read this
- Category
- Production talk: a scriptable character-FX data-manipulation tool
- Contributions
- Demonstrates a scriptable, interactive, expression-driven tool used across hair, cloth, and final cleanup tasks in technical animation
- Shows a single 'Swiss Army knife' code base that serves many CFX functions, easing maintenance for developers and letting artists reuse one familiar interface across tasks
- Context
- Relates to unified procedural CFX/dataflow frameworks such as ILM's Dataflow (Hankins 2015), and was used in production on Disney's Moana and Ralph Breaks the Internet. Builds on: Dataflow: ILM's Framework for Procedural Geometry Generation, Simulation Authoring, Crowds, and More
- Correctness
- Studio practice, not peer-reviewed; benefits are production-proven on named films but are reported as workflow and maintenance advantages rather than measured comparisons.
- Clarity
- Accessible and practice-oriented; a single first pass conveys the tool's philosophy and use cases.
- How to read it
- Read once for the design philosophy (one expression-driven tool spanning hair, cloth, and cleanup) and the maintainability argument; no second pass needed unless designing a similar CFX framework.
CFX / Rigging
-
Extends Kelvinlet sculpting with a multi-scale convolution scheme enabling cusp-like elastic edits with sharp, localised falloff profiles.
abstract
In this work, we present an extension of the regularized Kelvinlet technique suited to non-smooth, cusp-like edits. Our approach is based on a novel multi-scale convolution scheme that layers Kelvinlet deformations into a finite but spiky solution, thus offering physically based volume sculpting with sharp falloff profiles. We also show that the Laplacian operator provides a simple and effective way to achieve elastic displacements with fast far-field decay, thereby avoiding the need for multi-scale extrapolation. Finally, we combine the multi-scale convolution and Laplacian machinery to produce Sharp Kelvinlets, a new family of analytic fundamental solutions of linear elasticity with control over both the locality and the spikiness of the brush profile. Closed-form expressions and reference implementation are also provided.
Related Interactive Skeleton-Driven Dynamic Deformations · Physically Based Rigging for Deformable Characters · Harmonic Coordinates for Character Articulation · Regularized Kelvinlets: Sculpting Brushes Based on Fundamental Solutions of Elasticity
how to read this
- Category
- Method: an analytic elastic-deformation brush for volume sculpting
- Contributions
- A multi-scale convolution scheme that layers Kelvinlet deformations into a spiky solution for cusp-like, non-smooth edits
- Use of the Laplacian operator to obtain fast far-field decay without multi-scale extrapolation
- Sharp Kelvinlets, a family of closed-form elasticity solutions with control over both locality and spikiness, plus a reference implementation
- Context
- A direct extension of Regularized Kelvinlets (de Goes and James 2017), which built sculpting brushes on fundamental solutions of linear elasticity; this work adds non-smooth, sharply-localized falloff profiles. Builds on: Regularized Kelvinlets: Sculpting Brushes Based on Fundamental Solutions of Elasticity
- Correctness
- Grounded in closed-form fundamental solutions of linear elasticity, so the deformations are physically based within that linear regime; the brushes are a controllable sculpting tool rather than a full physical simulation, so behavior under large or coupled deformations is outside the stated scope.
- Clarity
- Clear and self-contained with closed-form expressions and reference code; a first pass conveys the brush idea, a second pass is needed to follow the convolution and Laplacian derivations.
- How to read it
- Read Regularized Kelvinlets (de Goes and James 2017) first; on the first pass focus on what cusp/spikiness control buys you, then do a formula-level second pass on the multi-scale convolution and Laplacian construction if you intend to implement it.
Rigging / Skinning
-
DreamWorks production fur dynamics system driving secondary motion on characters through efficient simulation and procedural techniques.
abstract
This talk presents DreamWorks' fur motion system Skunk which is used to produce motion for fur on characters, garments, and props. Skunk's ease of use, speed, stability, interactive nature, flexible framework, layered simulation approach, on the fly fur setup capabilities, consistency, and artist controls pushed boundaries of fur motion and interaction, and expanded artist usage at DreamWorks. The system was widely used in the film How to Train Your Dragon: The Hidden World, the short Bilby, and is being used on current feature films and shorts at DreamWorks.
Related Hair Effects in Trolls World Tour · Creatures in Houdini | Ahmed Gharraph | FMX 2019 · Framestore Creatures & Houdini | Framestore | Character FX & Crowds Production Talks · XGen: Arbitrary Primitive Generator
how to read this
- Category
- Production talk: a studio fur motion system
- Contributions
- Skunk, DreamWorks' fur dynamics system for secondary motion on characters, garments, and props
- A layered, interactive simulation approach with on-the-fly fur setup, stability, and artist controls
- Production deployment across How to Train Your Dragon: The Hidden World, the short Bilby, and ongoing features
- Context
- Relates to production fur and secondary-motion dynamics pipelines; presented as an internal DreamWorks system rather than building on a single cited prior method.
- Correctness
- Studio practice, not peer-reviewed; the claims of speed, stability, and ease of use are production-proven on shipped films rather than benchmarked, so reported strengths reflect artist workflow rather than controlled comparison.
- Clarity
- Accessible and workflow-oriented; a single read conveys the system's scope and motivation without heavy formalism.
- How to read it
- Read once for the layered-simulation and artist-control design choices and how interactivity shaped adoption; no formal second pass is needed, so mine it for pipeline ideas rather than algorithms.
CFX
-
Head rig for Spies in Disguise pigeons where beak and facial features slide freely on curved head volumes, enabling 2D-inspired expressions on 3D characters.
abstract
The birds of Spies in Disguise required several technological advancements and techniques to achieve the simple graphic style of the film. One technology was a re-designed wing rig with unique mechanics that allowed for clean lines and graphic shapes rather than our previous anatomical based wing rig. The production style also required extreme posing involving sliding limbs, large open mouth ranges and jiggly eyes. These requirements were achieved with a combination of new workflow techniques, updates to the pipeline and the creation and updating of proprietary deformers.
Related The Versatile Rigging of Splat in Strange World · Building and Animating User-Specific Volumetric Face Rigs · Stable and Efficient Differential IK · Using Deep Learning to Approximate Joint Placement in 3D Bipedal Characters
how to read this
- Category
- Production talk: a character rigging and deformer breakdown
- Contributions
- A re-designed wing rig with mechanics for clean graphic shapes rather than anatomical wings
- Techniques for extreme posing including sliding limbs, large open-mouth ranges, and jiggly eyes
- New workflow, pipeline updates, and proprietary deformers to slide facial features over curved head volumes
- Context
- Builds on prior proprietary rigging tooling (referenced ChopRig System), applying it to the stylized graphic look of Spies in Disguise birds. Builds on: ChopRig System
- Correctness
- Studio practice, not peer-reviewed; the rig is validated by shipped shots and a specific stylistic target, so the techniques are production-proven for this art direction rather than shown to generalize.
- Clarity
- Accessible and example-driven; one read conveys the rigging problems and solutions, with detail tied to a proprietary toolset.
- How to read it
- Read once focusing on how the sliding-feature and graphic-wing mechanics achieve a 2D look on 3D geometry; treat it as a design-pattern reference rather than a reproducible method.
Rigging / Facial
-
Shows that many small substeps, each with a single constraint-solver iteration, outperform one large implicit time step with many iterations for cloth and soft bodies.
abstract
In this paper we re-examine the idea that implicit integrators with large time steps offer the best stability/performance trade-off for stiff systems. We make the surprising observation that performing a single large time step with n constraint solver iterations is less effective than computing n smaller time steps, each with a single constraint solver iteration. Based on this observation, our approach is to split every visual time step into n substeps of length Δt/n and to perform a single iteration of extended position-based dynamics (XPBD) in each such substep. When compared to a traditional implicit integrator with large time steps we find constraint error and damping are significantly reduced. When compared to an explicit integrator we find that our method is more stable and robust for a wider range of stiffness parameters. This result holds even when compared against more sophisticated implicit solvers based on Krylov methods. Our method is straightforward to implement, and is not sensitive to matrix conditioning nor is it to overconstrained problems.
Related Efficient and Stable Approach to Elasticity and Collisions for Hair Animation · Robust Treatment of Collisions, Contact and Friction for Cloth Animation · Position Based Dynamics · Large Steps in Cloth Simulation
how to read this
- Category
- Method: a time-integration strategy for physics simulation
- Contributions
- The observation that many small substeps with one constraint iteration each beat a single large step with many iterations
- A substepping scheme applying one iteration of XPBD per substep to reduce constraint error and damping
- Reduced constraint error and damping compared with large-step implicit integrators, including Krylov-based solvers, and better stability than explicit integration across a wider range of stiffness parameters
- Context
- Directly extends XPBD (Macklin et al. 2016) and re-examines the conventional wisdom that large-step implicit integrators give the best stiffness/performance trade-off. Builds on: XPBD: Position-Based Simulation of Compliant Constrained Dynamics
- Correctness
- The central claim is shown empirically for stiff cloth and soft bodies and argued via reduced damping and conditioning insensitivity; it is a trade-off result, so the substep count and benefits still depend on the stiffness and contact regime tested.
- Clarity
- Notably clear and easy to implement; a first pass conveys the core insight, a second pass clarifies the substep/XPBD formulation.
- How to read it
- Read the abstract and core argument first to grasp why substepping reduces damping, then do a second pass on the per-substep XPBD update and comparisons if you plan to adopt or benchmark it.
CFX
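A minimal sketch of the substepping idea in the entry above, assuming a toy pinned chain of point masses rather than the cloth and soft bodies the paper tests. `simulate_chain` and its parameters are hypothetical names for illustration; the update uses the standard XPBD distance-constraint form, with each frame split into `substeps` steps of length Δt/n. It is not the authors' reference code.

```python
import math


def simulate_chain(substeps, iterations, frames=120, n_links=10,
                   link_len=0.1, frame_dt=1.0 / 60.0, compliance=0.0):
    """Swing a pinned chain under gravity; return the worst relative link stretch
    observed at the end of any frame."""
    gravity = (0.0, -9.81)
    # The chain starts horizontal; particle 0 is pinned (inverse mass 0).
    pos = [[i * link_len, 0.0] for i in range(n_links + 1)]
    vel = [[0.0, 0.0] for _ in pos]
    inv_mass = [0.0] + [1.0] * n_links
    h = frame_dt / substeps
    alpha_tilde = compliance / (h * h)
    worst = 0.0
    for _ in range(frames):
        for _ in range(substeps):
            prev = [p[:] for p in pos]
            # Predict positions from gravity (symplectic Euler).
            for i, w in enumerate(inv_mass):
                if w == 0.0:
                    continue
                for k in range(2):
                    vel[i][k] += h * gravity[k]
                    pos[i][k] += h * vel[i][k]
            # XPBD projection, one Gauss-Seidel sweep per iteration.
            lam = [0.0] * n_links
            for _ in range(iterations):
                for c in range(n_links):
                    i, j = c, c + 1
                    dx = pos[j][0] - pos[i][0]
                    dy = pos[j][1] - pos[i][1]
                    dist = math.hypot(dx, dy)
                    w_sum = inv_mass[i] + inv_mass[j]
                    if dist == 0.0 or w_sum == 0.0:
                        continue
                    violation = dist - link_len
                    dlam = (-violation - alpha_tilde * lam[c]) / (w_sum + alpha_tilde)
                    lam[c] += dlam
                    nx, ny = dx / dist, dy / dist
                    pos[i][0] -= inv_mass[i] * dlam * nx
                    pos[i][1] -= inv_mass[i] * dlam * ny
                    pos[j][0] += inv_mass[j] * dlam * nx
                    pos[j][1] += inv_mass[j] * dlam * ny
            # Velocities follow from the corrected positions.
            for i in range(len(pos)):
                vel[i][0] = (pos[i][0] - prev[i][0]) / h
                vel[i][1] = (pos[i][1] - prev[i][1]) / h
        for c in range(n_links):
            d = math.hypot(pos[c + 1][0] - pos[c][0], pos[c + 1][1] - pos[c][1])
            worst = max(worst, abs(d - link_len) / link_len)
    return worst
```

Calling `simulate_chain(substeps=20, iterations=1)` and `simulate_chain(substeps=1, iterations=20)` spends the same number of constraint sweeps per frame; the paper's observation predicts lower stretch for the first call.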
- SoftCon: Simulation and Control of Soft-Bodied Animals with Biomimetic Actuators · SIGGRAPH Asia · Academic · 61 cites
Simulates and controls soft-bodied underwater animals using muscle-like actuators with contact-rich locomotion.
abstract
We present a novel and general framework for the design and control of underwater soft-bodied animals. The whole body of an animal consisting of soft tissues is modeled by tetrahedral and triangular FEM meshes. The contraction of muscles embedded in the soft tissues actuates the body and limbs to move. We present a novel muscle excitation model that mimics the anatomy of muscular hydrostats and their muscle excitation patterns. Our deep reinforcement learning algorithm equipped with the muscle excitation model successfully learned the control policy of soft-bodied animals, which can be physically simulated in real-time, controlled interactively, and resilient to external perturbations. We demonstrate the effectiveness of our approach with various simulated animals including octopuses, lampreys, starfishes, stingrays and cuttlefishes. They learn diverse behaviors such as swimming, grasping, and escaping from a bottle. We also implemented a simple user interface system that allows the user to easily create their creatures.
Related Generative GaitNet · Computational Bodybuilding: Anatomically-Based Modeling of Human Bodies · Character Controllers Using Motion VAEs · DReCon: Data-Driven Responsive Control of Physics-Based Characters
how to read this
- Category
- Method: simulation and learned control of soft-bodied characters
- Contributions
- A general framework modeling whole soft-bodied animals with tetrahedral and triangular FEM meshes actuated by embedded muscles
- A muscle excitation model mimicking muscular-hydrostat anatomy and excitation patterns
- A deep reinforcement learning controller yielding real-time, interactive, perturbation-resilient behaviors across several soft-bodied animals
- Context
- Relates to muscle-actuated physical character control (referenced Scalable Muscle-Actuated Human Simulation and Control), extending it from articulated humans to soft-bodied underwater animals. Builds on: Scalable Muscle-Actuated Human Simulation and Control
- Correctness
- Demonstrated on simulated octopuses, lampreys, starfish, stingrays, and cuttlefish learning swimming, grasping, and escape; results are in simulation with a biomimetic actuator abstraction, so fidelity to real animal biomechanics and transfer beyond the shown creatures are not claimed.
- Clarity
- Readable with strong visual results; a first pass conveys the actuator and RL idea, a second pass is needed for the FEM and excitation-model details.
- How to read it
- On the first pass focus on the muscle excitation model and how the RL policy is set up; second pass for the FEM actuation coupling and training specifics if reproducing the control results.
Muscles / Motion Synthesis
-
Captures hair with sub-millimeter, strand-level accuracy from multi-view images using line-based PatchMatch stereo, mean-shift strand reconstruction, and multi-view hair growing.
abstract
This paper presents a method to capture high-fidelity hair geometry with strand-level accuracy from multi-view images. In the first stage, a line-based PatchMatch multi-view stereo reformulates traditional MVS with a slanted strand-line assumption, using a cost function combining photo-consistency and a geometric term that reconstructs each hair pixel as a 3D line and merges the depth maps into a point cloud with per-point line directions. A mean-shift based strand reconstruction algorithm then converts the noisy point data into a set of strands, and a multi-view hair growing step elongates short strands and recovers missing ones. Evaluated on synthetic and real captured data, the method reconstructs hair strands with sub-millimeter accuracy and pixel-accurate projection to novel views.
Related Robust Hair Capture Using Simulated Examples · Structure-Aware Hair Capture · Hair Modeling and Simulation by Style · Fast Cloth Simulation on Moving Humanoids
how to read this
- Category
- Capture method: multi-view hair geometry reconstruction
- Contributions
- A line-based PatchMatch multi-view stereo that reconstructs each hair pixel as a 3D line under a slanted strand-line assumption
- A mean-shift strand reconstruction that converts noisy oriented point data into discrete strands
- A multi-view hair-growing step that elongates short strands and recovers missing ones, reported at sub-millimeter strand accuracy
- Context
- Builds on structure-aware hair capture (referenced Structure-Aware Hair Capture, Luo et al. 2013), reformulating traditional MVS with a strand-line model for finer geometry. Builds on: Structure-Aware Hair Capture
- Correctness
- Evaluated on synthetic and real captured data with sub-millimeter accuracy and pixel-accurate reprojection; relies on a multi-view rig and the slanted-line strand assumption, so results depend on capture setup and may degrade on very occluded or fine wispy regions not stressed in the stated evaluation.
- Clarity
- Clear staged pipeline; a first pass conveys the line-MVS plus growth idea, a second pass is needed for the cost function and mean-shift formulation.
- How to read it
- Read the three-stage pipeline structure first, then a second pass on the photo-consistency-plus-geometric cost and the strand-growing step if you work on capture; check the evaluation conditions before trusting the accuracy claim for your setup.
CFX
-
Neural network trained in a learned subspace to replace expensive physics simulation with fast interactive deformation for characters.
abstract
Data-driven methods for physical simulation are an attractive option for interactive applications due to their ability to trade precomputation and memory footprint in exchange for improved runtime performance. Yet, existing data-driven methods fall short of the extreme memory and performance constraints imposed by modern interactive applications like AAA games and virtual reality. Here, performance budgets for physics simulation range from tens to hundreds of micro-seconds per frame, per object. We present a data-driven physical simulation method that meets these constraints. Our method combines subspace simulation techniques with machine learning which, when coupled, enables a very efficient subspace-only physics simulation that supports interactions with external objects - a longstanding challenge for existing subspace techniques. We also present an interpretation of our method as a special case of subspace Verlet integration, where we apply machine learning to efficiently approximate the physical forces of the system directly in the subspace. We propose several practical solutions required to make effective use of such a model, including a novel training methodology required for prediction stability, and a GPU-friendly subspace decompression algorithm to accelerate rendering.
Related Cloth and Skin Deformation with a Triangle Mesh Based Convolutional Neural Network · Nonlinear Cloth Simulation with Isogeometric Analysis · Stable Spaces for Real-time Clothing · Projective Dynamics: Fusing Constraint Projections for Fast Simulation
how to read this
- Category
- Method: data-driven neural physics for interactive deformation
- Contributions
- A neural network operating in a learned subspace to approximate physical forces and replace costly full simulation
- Support for interaction with external objects, a long-standing limitation of subspace techniques
- An interpretation as subspace Verlet integration, plus a training methodology for prediction stability and a GPU-friendly subspace decompression algorithm to accelerate rendering
- Context
- Relates to data-driven deformation approximation (Bailey et al., Fast and Deep Deformation Approximations) and to classical subspace simulation, coupling the two with machine learning. Builds on: Fast and Deep Deformation Approximations
- Correctness
- Targets the tens-to-hundreds-of-microseconds-per-object budgets of games and VR and is presented as an approximation that trades precomputation and memory for runtime; being subspace and data-driven, accuracy is bounded by the learned basis and training distribution, so out-of-distribution motions and interactions are a natural risk.
- Clarity
- Reasonably accessible with a helpful Verlet-integration framing; a first pass conveys the subspace-plus-learning idea, a second pass clarifies the force approximation and training methodology.
- How to read it
- Read for the subspace-Verlet interpretation and the external-interaction handling first; second pass on the training methodology and the practical solutions section if you intend to hit interactive budgets yourself.
ML Deformation / Skinning
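The subspace-Verlet reading of the entry above can be sketched in a few lines. This is a minimal illustration, not the paper's model: the learned force network is replaced by a hypothetical hand-written `spring_accel`, and the basis, mean, and stiffness values are invented for the example. The update is the standard position-Verlet recurrence applied to reduced coordinates `z`, followed by a linear decompression back to full-space positions.

```python
from typing import Callable, List

Vector = List[float]


def verlet_step(z: Vector, z_prev: Vector,
                accel: Callable[[Vector, Vector], Vector],
                dt: float) -> Vector:
    """One position-Verlet step in subspace coordinates:
    z_next = 2*z - z_prev + dt^2 * a(z, z_prev)."""
    a = accel(z, z_prev)
    return [2.0 * zi - zpi + dt * dt * ai
            for zi, zpi, ai in zip(z, z_prev, a)]


def spring_accel(z: Vector, z_prev: Vector, stiffness: float = 40.0) -> Vector:
    """Hypothetical stand-in for the learned force model: a linear
    spring pulling every subspace coordinate back to zero."""
    return [-stiffness * zi for zi in z]


def decompress(z: Vector, basis: List[Vector], mean: Vector) -> Vector:
    """Linear decompression x = mean + U z, with one basis row per
    full-space coordinate (toy values, assumed for illustration)."""
    return [m + sum(u * zi for u, zi in zip(row, z))
            for row, m in zip(basis, mean)]


if __name__ == "__main__":
    z_prev = [1.0, 0.0]
    z = [1.0, 0.0]
    basis = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    mean = [0.0, 0.0, 0.5]
    for _ in range(3):
        z, z_prev = verlet_step(z, z_prev, spring_accel, dt=0.01), z
    print(decompress(z, basis, mean))
```

In the paper's framing the acceleration term is what the network approximates directly in the subspace; swapping `spring_accel` for a trained model leaves the integrator unchanged.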
2018
47-
Represents the 3D hairstyle manifold via a volumetric VAE trained on orientation fields; synthesizes new hairstyles from a single image in one second.
abstract ▾ abstract ▴
Recent advances in single-view 3D hair digitization have made the creation of high-quality CG characters scalable and accessible to end-users, enabling new forms of personalized VR and gaming experiences. To handle the complexity and variety of hair structures, most cutting-edge techniques rely on the successful retrieval of a particular hair model from a comprehensive hair database. Not only are the aforementioned data-driven methods storage intensive, but they are also prone to failure for highly unconstrained input images, complicated hairstyles, and failed face detection. Instead of using a large collection of 3D hair models directly, we propose to represent the manifold of 3D hairstyles implicitly through a compact latent space of a volumetric variational autoencoder (VAE). This deep neural network is trained with volumetric orientation field representations of 3D hair models and can synthesize new hairstyles from a compressed code. To enable end-to-end 3D hair inference, we train an additional embedding network to predict the code in the VAE latent space from any input image. Strand-level hairstyles can then be generated from the predicted volumetric representation. Our fully automatic framework does not require any ad-hoc face fitting, intermediate classification and segmentation, or hairstyle database retrieval.
Related Motion Guided Deep Dynamic 3D Garments · N-Cloth: Predicting 3D Cloth Deformation with Mesh-Based Networks · HAAR: Text-Conditioned Generative Model of 3D Strand-Based Human Hairstyles · Neural Haircut: Prior-Guided Strand-Based Hair Reconstruction
how to read this ▾ how to read this ▴
- Category
- Method: deep generative model for single-image 3D hair synthesis
- Contributions
-
- Represents the manifold of 3D hairstyles implicitly through the compact latent space of a volumetric VAE trained on orientation-field representations
- Adds an embedding network that predicts the latent code from any input image for end-to-end 3D hair inference
- Generates strand-level hairstyles from the predicted volumetric representation, reportedly from a single image in about one second
- Context
- Departs from retrieval-based single-view hair digitization (e.g. AutoHair: Fully Automatic Hair Modeling from a Single Image) by replacing a large 3D hair database with a learned, compact latent space. Builds on: AutoHair: Fully Automatic Hair Modeling from a Single Image
- Correctness
- Assumes the volumetric orientation-field VAE captures enough of the hairstyle manifold to generalize beyond retrieval; this should reduce storage and the failure modes of database lookup (unconstrained images, failed face detection), but quality is bounded by the training data and the volumetric-to-strand conversion, so very complex or out-of-distribution styles remain a question.
- Clarity
- Requires familiarity with VAEs and volumetric fields; a first pass conveys the pipeline, a second pass is needed for the representation and training.
- How to read it
- First pass for the retrieval-versus-latent-space framing and the orientation-field representation; second pass on the VAE and embedding networks if building generative hair or single-image reconstruction.
CFX / ML Deformation
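For readers new to the VAE machinery the entry above relies on, the two generic ingredients are the reparameterized latent sample and the KL penalty toward a standard normal prior. The sketch below shows only those two pieces in plain Python; it says nothing about the paper's volumetric encoder, decoder, or embedding network, and the function names are assumptions made for this example.

```python
import math
import random
from typing import List


def reparameterize(mu: List[float], log_var: List[float],
                   rng: random.Random) -> List[float]:
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), so the sample
    stays a differentiable function of mu and log_var."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu: List[float], log_var: List[float]) -> float:
    """Closed-form KL(N(mu, sigma^2) || N(0, 1)) summed over dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


if __name__ == "__main__":
    rng = random.Random(0)
    code = reparameterize([0.2, -0.1], [0.0, 0.0], rng)
    print(code, kl_to_standard_normal([0.2, -0.1], [0.0, 0.0]))
```

A compact code of this kind is what replaces the hairstyle database in the entry: the decoder turns it into a volumetric representation, and the embedding network predicts it from an image.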
-
Framework for abstracting core rigging concepts to build extensible, future-proof character rigging systems for production.
abstract ▾ abstract ▴
DNEG's Loom framework abstracts rigging concepts into a DCC-agnostic system that separates pure rigging logic from implementation details, surviving the discontinuation of Fabric Engine. The system uses operator graphs and procedural rigs, achieving better performance than Maya's GPU pipeline through optimized CPU and parallel evaluation implementations.
Related A.C.M.E. Multilimb System · Premo: Powerful Character Rigging, Fast Animation · Geodesic Voxel Binding for Production Character Meshes · FIRA: Portable Realtime Rig Deformation
how to read this ▾ how to read this ▴
- Category
- Production talk: a rigging-framework architecture
- Contributions
-
- Presents DNEG's Loom framework, which abstracts rigging concepts into a DCC-agnostic system separating pure rigging logic from implementation details
- Builds rigs from operator graphs and procedural rigs; the abstraction let the system survive the discontinuation of the underlying Fabric Engine
- Reports better performance than Maya's GPU pipeline via optimized CPU and parallel evaluation
- Context
- Relates to multithreaded dependency-graph evaluation for character animation (e.g. LibEE: A Multithreaded Dependency Graph for Character Animation), applying graph-based, abstracted evaluation to a future-proof rigging framework. Builds on: LibEE: A Multithreaded Dependency Graph for Character Animation
- Correctness
- Studio practice, not peer-reviewed; the performance and portability claims are production-validated within DNEG's pipeline, so the comparison against Maya's GPU path reflects their setup rather than a controlled benchmark.
- Clarity
- Accessible to pipeline and rigging engineers; one read conveys the abstraction strategy and its motivation.
- How to read it
- Read once for the separation-of-concerns design and the operator-graph approach; revisit only if you are designing a DCC-agnostic rigging framework yourself.
Rigging
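The idea of keeping rigging logic separate from any host application can be illustrated with a tiny operator graph. This is a generic sketch, not Loom's design: `OperatorGraph`, its method names, and the example operators are all hypothetical. Operators are plain functions with named inputs, and evaluation walks them in dependency order using the standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, Mapping, Sequence


class OperatorGraph:
    """Pure rigging logic: named operators wired by input names,
    with no reference to a particular DCC."""

    def __init__(self) -> None:
        self._ops: Dict[str, Callable[..., float]] = {}
        self._inputs: Dict[str, Sequence[str]] = {}

    def add(self, name: str, fn: Callable[..., float],
            inputs: Sequence[str]) -> None:
        self._ops[name] = fn
        self._inputs[name] = tuple(inputs)

    def evaluate(self, external: Mapping[str, float]) -> Dict[str, float]:
        """Evaluate every operator after its inputs; names that are not
        operators must be supplied in `external`."""
        values = dict(external)
        for name in TopologicalSorter(self._inputs).static_order():
            if name in self._ops:
                args = [values[i] for i in self._inputs[name]]
                values[name] = self._ops[name](*args)
        return values


if __name__ == "__main__":
    rig = OperatorGraph()
    rig.add("blend", lambda a, b: 0.5 * (a + b), ["fk", "ik"])
    rig.add("twist", lambda blend: 2.0 * blend, ["blend"])
    print(rig.evaluate({"fk": 10.0, "ik": 30.0}))
```

A host-specific layer would only need to feed `external` values in and read results out, which is the kind of boundary that lets the logic outlive any one engine.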
-
System for continuously monitoring rig performance health metrics at Blue Sky Studios, enabling real-time rig evaluation during production using ChopRig and Conduit.
abstract ▾ abstract ▴
Our characters have a lot of moving parts. This complexity makes achieving and maintaining real-time performance a challenge. Our journey of bringing our rigs to 24 fps consisted of many different milestones. We aggressively adopted cutting edge technology during active production and developed a system to continuously monitor asset "health" performance metrics. New applications were created for production to monitor asset health using Blue Sky's next generation pipeline, Conduit.
Related LibEE: A Multithreaded Dependency Graph for Character Animation · A Pipeline Retrospective on USD and Conduit · Premo: Powerful Character Rigging, Fast Animation · LibEE 2: Enabling Fast Edits and Evaluation
how to read this ▾ how to read this ▴
- Category
- Production talk: a rig-performance monitoring system
- Contributions
-
- Describes the effort to bring Blue Sky Studios characters to real-time (24 fps) evaluation during active production
- Presents a system to continuously monitor asset health performance metrics
- Introduces new production applications for monitoring asset health, built on Blue Sky's next-generation pipeline, Conduit
- Context
- Relates to real-time rig evaluation and performance optimization in production pipelines, treating sustained rig speed as a measurable, monitored asset-health property rather than a one-time tuning task.
- Correctness
- Studio practice, not peer-reviewed; the 24 fps target and health-monitoring approach are production-proven at Blue Sky but specific to its characters and pipeline, so numbers reflect their assets rather than a general benchmark.
- Clarity
- Accessible to pipeline practitioners; one read conveys the monitoring philosophy and the tooling.
- How to read it
- Read once for the asset-health monitoring idea and how performance was kept in check during production; little need for a second pass unless adopting a similar metrics system.
Rigging
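Treating rig speed as a monitored health metric comes down to timing evaluation per frame and comparing it with the frame budget, which at 24 fps is 1000 / 24, roughly 41.7 ms. The sketch below is a minimal illustration of that check; the function names and report fields are assumptions for this example and do not describe Blue Sky's tools.

```python
import time
from typing import Callable, Dict, Sequence, Union

FRAME_BUDGET_MS = 1000.0 / 24.0  # about 41.67 ms per frame at 24 fps


def measure_ms_per_frame(evaluate: Callable[[int], object],
                         frames: Sequence[int],
                         clock: Callable[[], float] = time.perf_counter) -> float:
    """Average wall-clock milliseconds spent evaluating one frame."""
    start = clock()
    for frame in frames:
        evaluate(frame)
    elapsed = clock() - start
    return 1000.0 * elapsed / len(frames)


def health_report(asset: str, ms_per_frame: float,
                  budget_ms: float = FRAME_BUDGET_MS) -> Dict[str, Union[str, float, bool]]:
    """Summarize one asset's timing against the frame budget."""
    return {
        "asset": asset,
        "ms_per_frame": ms_per_frame,
        "fps": 1000.0 / ms_per_frame,
        "within_budget": ms_per_frame <= budget_ms,
    }


if __name__ == "__main__":
    # Hypothetical rig: any callable that evaluates a frame.
    ms = measure_ms_per_frame(lambda f: sum(range(10000)), range(24))
    print(health_report("hero_character", ms))
```

Running such a check on every asset publish, and storing the reports over time, is one way to turn a one-off optimization into continuous monitoring.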
-
Empirically derived rig for realistic jaw animation, capturing biomechanical jaw motion from data for use in facial character rigs.
abstract ▾ abstract ▴
In computer graphics the motion of the jaw is commonly modelled by up-down and left-right rotation around a fixed pivot plus a forward-backward translation, yielding a three dimensional rig that is highly suited for intuitive artistic control. The anatomical motion of the jaw is, however, much more complex since the joints that connect the jaw to the skull exhibit both rotational and translational components. In reality the jaw does not move in a three dimensional subspace but on a constrained manifold in six dimensions. We analyze this manifold in the context of computer animation and show how the manifold can be parameterized with three degrees of freedom, providing a novel jaw rig that preserves the intuitive control while providing more accurate jaw positioning. The chosen parameterization furthermore places anatomically correct limits on the motion, preventing the rig from entering physiologically infeasible poses. Our new jaw rig is empirically designed from accurate capture data, and we provide a simple method to retarget the rig to new characters, both human and fantasy.
Related Direct Manipulation Blendshapes · DreamWorks Animation Facial Motion and Deformation System · Animatomy: An Animator-Centric, Anatomically Inspired System for 3D Facial Modeling, Animation and Transfer · Performance-Driven Facial Animation
how to read this ▾ how to read this ▴
- Category
- Method: an empirically derived jaw rig for facial animation
- Contributions
-
- Analyzes the jaw's motion as a constrained six-dimensional manifold rather than the usual three-DOF fixed-pivot model
- Parameterizes that manifold with three degrees of freedom, keeping intuitive artistic control while improving positional accuracy
- Places anatomically correct limits to prevent infeasible poses, with a simple method to retarget the rig to new human and fantasy characters
- Context
- Builds on production facial performance capture (e.g. Medusa: A Production-Ready Photoreal Facial Performance Capture System) by using captured data to design a more anatomically faithful jaw rig. Builds on: Medusa: A Production-Ready Photoreal Facial Performance Capture System
- Correctness
- Empirically designed from accurate capture data and constrained to physiologically feasible motion; accuracy is therefore tied to the captured subjects, and the retargeting to fantasy characters extrapolates beyond the human anatomy the manifold was derived from, which a reader should keep in mind.
- Clarity
- Clear and well-motivated; a first pass conveys the manifold insight, a second pass clarifies the parameterization.
- How to read it
- First pass for the 6D-manifold insight and the three-DOF parameterization; second pass on the manifold fitting and retargeting if building or improving a facial jaw rig.
Facial / Rigging
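The shape of a three-parameter jaw rig that drives a six-DOF pose with built-in limits can be sketched as follows. Everything numeric here is invented for illustration: the limits, the coupling between opening and forward or downward slide, and the field names are assumptions, not the manifold fitted from capture data in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JawPose:
    """Six degrees of freedom: three rotations and three translations."""
    pitch_deg: float
    yaw_deg: float
    roll_deg: float
    tx: float
    ty: float
    tz: float


# Hypothetical control limits that keep the rig out of infeasible poses.
LIMITS = {"open": (0.0, 1.0), "lateral": (-1.0, 1.0), "protrude": (0.0, 1.0)}


def clamp(value: float, low: float, high: float) -> float:
    return max(low, min(high, value))


def jaw_pose(open_amount: float, lateral: float, protrude: float) -> JawPose:
    """Map three artist controls to a six-DOF pose. The coupling terms
    (opening also slides the jaw forward and down) are made-up
    coefficients standing in for a data-derived manifold."""
    o = clamp(open_amount, *LIMITS["open"])
    s = clamp(lateral, *LIMITS["lateral"])
    p = clamp(protrude, *LIMITS["protrude"])
    return JawPose(
        pitch_deg=30.0 * o,
        yaw_deg=8.0 * s,
        roll_deg=0.0,
        tx=2.0 * s,
        ty=-4.0 * o * o,
        tz=6.0 * p + 5.0 * o,
    )


if __name__ == "__main__":
    print(jaw_pose(0.5, 0.2, 0.0))
```

The point of the structure is that the artist still sees three sliders, while rotation and translation stay coupled and bounded underneath.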
-
Implicit frictional contact solver integrated with adaptive remeshing for robust, stable cloth simulation under complex contact scenarios.
abstract ▾ abstract ▴
Cloth dynamics plays an important role in the visual appearance of moving characters. Properly accounting for contact and friction is of utmost importance to avoid cloth-body and cloth-cloth penetration and to capture typical folding and stick-slip behavior due to dry friction. We present here the first method able to account for cloth contact with exact Coulomb friction, treating both cloth self-contacts and contacts occurring between the cloth and an underlying character. Our key contribution is to observe that for a nodal system like cloth, the frictional contact problem may be formulated based on velocities as primary variables, without having to compute the costly Delassus operator. Then, by reversing the roles classically played by the velocities and the contact impulses, conical complementarity solvers of the literature can be adapted to solve for compatible velocities at nodes. To handle the full complexity of cloth dynamics scenarios, we have extended this base algorithm in two ways: first, towards the accurate treatment of frictional contact at any location of the cloth, through an adaptive node refinement strategy; second, towards the handling of multiple constraints at each node, through the duplication of constrained nodes and the adding of pin constraints between duplicata.
Related Anisotropic Elastoplasticity for Cloth, Knit and Hair Frictional Contact · Projective Dynamics with Dry Frictional Contact · Frictional Contact on Smooth Elastic Solids · Directing Cloth Draping through Blended UVs
how to read this ▾ how to read this ▴
- Category
- Method: an implicit frictional contact solver for cloth simulation
- Contributions
-
- First method to treat cloth contact with exact Coulomb friction across both self-contacts and cloth-body contacts
- Reformulates the frictional contact problem in velocity variables, avoiding the costly Delassus operator
- Adapts conical complementarity solvers and extends them with adaptive node refinement to handle contact anywhere on the cloth
- Context
- Builds on robust simultaneous-collision treatment (Harmon et al., Robust Treatment of Simultaneous Collisions) and the broader line of constraint-based frictional contact for nodal/cloth systems. Builds on: Robust Treatment of Simultaneous Collisions
- Correctness
- Demonstrated on complex cloth dynamics scenarios with self-contact and an underlying character; the velocity-as-primary-variable reformulation is the central assumption, and readers should note the cost and convergence behavior of complementarity solvers under dense contact.
- Clarity
- Moderately technical; a first pass conveys the velocity-reformulation idea, but the complementarity formulation needs a careful second pass.
- How to read it
- Focus first on why working in velocities avoids the Delassus operator; do a second pass on the conical complementarity adaptation and the node-refinement strategy if you intend to implement.
CFX
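Exact Coulomb friction, as referenced in the entry above, is a three-way condition on the relative velocity `u` and the contact reaction `r` at each contact: take-off, stick, or slip. The sketch below only checks which case a given pair satisfies; it does not solve for velocities and is not the paper's conical complementarity solver. The function names and tolerance are assumptions for this example.

```python
import math
from typing import Sequence, Tuple, List


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def norm(v: Sequence[float]) -> float:
    return math.sqrt(dot(v, v))


def split(v: Sequence[float], n: Sequence[float]) -> Tuple[float, List[float]]:
    """Split v into its normal component and tangential vector
    with respect to the unit contact normal n."""
    vn = dot(v, n)
    return vn, [vi - vn * ni for vi, ni in zip(v, n)]


def coulomb_state(u: Sequence[float], r: Sequence[float],
                  n: Sequence[float], mu: float, tol: float = 1e-9) -> str:
    """Classify a contact under Coulomb's law with friction coefficient mu."""
    un, ut = split(u, n)
    rn, rt = split(r, n)
    # Take-off: no reaction, bodies separating (or resting without force).
    if norm(r) <= tol and un >= -tol:
        return "take-off"
    # The reaction must lie inside the friction cone |r_T| <= mu * r_N.
    if rn < -tol or norm(rt) > mu * rn + tol:
        return "violated"
    # Stick: zero relative velocity, reaction anywhere in the cone.
    if norm(u) <= tol:
        return "stick"
    # Slip: tangential sliding, reaction on the cone boundary and
    # directly opposing the sliding direction.
    on_boundary = abs(norm(rt) - mu * rn) <= tol
    opposing = abs(dot(rt, ut) + norm(rt) * norm(ut)) <= tol
    if abs(un) <= tol and on_boundary and opposing:
        return "slip"
    return "violated"


if __name__ == "__main__":
    up = (0.0, 0.0, 1.0)
    print(coulomb_state((1.0, 0.0, 0.0), (-0.5, 0.0, 1.0), up, mu=0.5))
```

A solver of the kind described in the abstract searches for velocities and reactions for which every contact lands in one of the three valid states simultaneously.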
-
Guerrilla Games covered the full machine character pipeline for Horizon Zero Dawn, from pre-production prototyping to final polish including animation-tech, AI behavior, and motion style.
Rigging / Motion Synthesis
- Art-Directed Costumes at Pixar: Design, Tailoring, and Simulation in Production · SIGGRAPH Asia Courses · Pixar · 3 cites
Course covering Pixar's full costume pipeline from design through tailoring and simulation, illustrated with examples from Coco and Incredibles 2.
abstract ▾ abstract ▴
Costumes are an important part of character design, acting, storytelling, and visual appeal in animation. However, it is challenging to achieve art-directed natural-looking motion and detail in CG animated clothing, due to technology, workflow, and budget constraints. This course will cover Pixar's latest approach to CG costumes, from design to tailoring to simulation, and how we try to address these challenges. Our goal is to continue working towards a balance between the detail and physicality of real costumes, and the stylized artistry and movement of 2D animated clothing. Using examples from "Incredibles 2", "Coco", and other Pixar films, we will show how our artists approach the initial costume design direction, strategically plan designs to fit within time and technology constraints, and translate drawings into 3D clothing on stylized characters. Next, we will show how we create garment models using 3D and flat-panel tailoring methods, applications for common simulation parameters and settings, and robust out-of-box simulation techniques using cloth rigging and dynamic alterations. Finally, we will cover the tools used to simulate garments in shots, create appealing shapes and movement, and help Animation let the characters act with their clothing. Although Pixar uses proprietary tools, the principles can be applied to other pipelines.
Related Directing Cloth Draping through Blended UVs · Mixing Yarns and Triangles in Cloth Simulation · Simulating Wind Effects on Cloth and Hair in Disney's Frozen · Untangling Cloth
how to read this ▾ how to read this ▴
- Category
- Production course: Pixar costume pipeline (design, tailoring, simulation)
- Contributions
-
- Walks through Pixar's end-to-end CG costume workflow from initial design direction through tailoring to in-shot simulation
- Covers 3D and flat-panel garment tailoring, common simulation parameters, cloth rigging, and dynamic alterations for robust out-of-box sims
- Illustrates balancing physical realism with stylized 2D-animation appeal using examples from Incredibles 2 and Coco
- Context
- Relates to Pixar's deformation and surface tooling (de Goes et al., Patch-based Surface Relaxation) and the wider tradition of art-directed garment simulation in feature animation. Builds on: Patch-based Surface Relaxation
- Correctness
- Studio practice rather than peer-reviewed research; techniques are production-proven on specific films and reflect workflow and budget constraints, so generalization to other pipelines is not guaranteed.
- Clarity
- Accessible course material aimed at practitioners; a single read conveys the pipeline, with specific sections worth revisiting per task.
- How to read it
- Read selectively by the stage you care about (design vs tailoring vs sim); treat the parameter and rigging tips as a reference, no formal second pass needed.
CFX
- Aura Mesh: Motion Retargeting to Preserve the Spatial Relationships between Skinned Characters · CGF · Academic · 37 cites
Volumetric aura mesh surrounding each character encodes proximity relationships, preserving interaction semantics during motion retargeting.
abstract ▾ abstract ▴
Applying motion‐capture data to multi‐person interaction between virtual characters is challenging because one needs to preserve the interaction semantics while also satisfying the general requirements of motion retargeting, such as preventing penetration and preserving naturalness. An efficient means of representing interaction semantics is by defining the spatial relationships between the body parts of characters. However, existing methods consider only the character skeleton and thus are not suitable for capturing skin‐level spatial relationships. This paper proposes a novel method for retargeting interaction motions with respect to character skins. Specifically, we introduce the aura mesh, which is a volumetric mesh that surrounds a character's skin. The spatial relationships between two characters are computed from the overlap of the skin mesh of one character and the aura mesh of the other, and then the interaction motion retargeting is achieved by preserving the spatial relationships as much as possible while satisfying other constraints. We show the effectiveness of our method through a number of experiments.
Related Motion Retargeting for Crowd Simulation · Geometry-Aware Retargeting for Two-Skinned Characters Interaction · Normalized Euclidean Distance Matrices for Human Motion Retargeting · Robust Marker Trajectory Repair for MOCAP Using Kinematic Reference
how to read this ▾ how to read this ▴
- Category
- Method: interaction-preserving motion retargeting
- Contributions
-
- Introduces the aura mesh, a volumetric mesh surrounding a character's skin that encodes skin-level spatial relationships
- Computes interaction semantics from the overlap between one character's skin mesh and another's aura mesh
- Retargets multi-person interaction motions by preserving these spatial relationships while preventing penetration and keeping motion natural
- Context
- Extends classic motion retargeting (Gleicher, Retargeting Motion to New Characters) from skeleton-only constraints to skin-level proximity between interacting characters. Builds on: Retargeting Motion to New Characters
- Correctness
- Validated through a number of interaction-retargeting experiments; the approach assumes a meaningful aura volume around the skin, and readers should consider how aura sizing and tight-contact cases affect the preserved semantics.
- Clarity
- Accessible; a first pass conveys the aura-mesh idea, and a second pass is needed for the relationship-preservation formulation and constraints.
- How to read it
- Focus on how the aura mesh represents spatial relationships and how that objective is balanced against penetration and naturalness; a second pass pays off mainly if you are building a retargeting system.
Retargeting
-
Direct Drive retargeting system used on Thanos in Avengers: Infinity War, bypassing traditional blendshape solvers by learning a direct mapping from actor performance to character.
abstract ▾ abstract ▴
In Marvel's Avengers: Infinity War, Thanos (played by actor Josh Brolin) is entirely CG and is one of the main characters in this live action movie. The plot depends on the emotional performances of this digital creature and it was imperative that Thanos's facial performance convey the actor's performances faithfully. Digital Domain's performance capture process, Direct Drive, is a major departure from traditional blendshape solver techniques and was used to create Thanos's performances. We will present an overview of our updated multistage facial retargeting process. We have removed the reliance on high-resolution, per-shot facial capture and refined the process of training the system. This system is faster to set up, needs far less artist input, and preserves elements of the performance that were previously lost using traditional facial capture techniques.
Related Realtime Facial Animation with On-the-fly Correctives · Online Modeling for Realtime Facial Animation · Facial Retargeting with Automatic Range of Motion Alignment · High Fidelity Facial Animation Capture and Retargeting with Contours
how to read this ▾ how to read this ▴
- Category
- Production talk: facial performance capture and retargeting
- Contributions
-
- Presents Digital Domain's Direct Drive retargeting used to drive the fully CG Thanos from the actor's performance
- Departs from traditional blendshape solver techniques by learning a more direct actor-to-character mapping
- Removes reliance on high-resolution per-shot facial capture, with faster setup, less artist input, and better preserved performance detail
- Context
- Builds on Digital Domain's head-mounted-camera capture work (Moser et al., Masquerade) and the broader move from blendshape-solver retargeting toward learned facial transfer. Builds on: Masquerade: Fine-Scale Details for Head-Mounted Camera Motion Capture Data
- Correctness
- Studio practice, not peer-reviewed; results are production-proven on a single hero character (Thanos), so reported speed and fidelity gains are tied to that specific pipeline and training setup.
- Clarity
- Accessible overview pitched at practitioners; a single read conveys the workflow without heavy math.
- How to read it
- Read for the high-level pipeline and how Direct Drive differs from blendshape solving; no second pass needed unless comparing facial-retargeting strategies.
Facial
-
Improves Pixar's Fizt cloth solver with enhanced collision handling and performance gains, addressing the high cloth demand of Coco's skeleton cast.
abstract ▾ abstract ▴
Among the many technical challenges of Pixar's Coco was the need to handle cloth simulation for a densely populated city of skeleton characters. Skeletons posed new challenges to the collision algorithms of our in-house cloth system, Fizt. Continuous collision detection and response is an obvious solution to handling fast motion of thin geometry, but it presents us with a serious problem. In our production pipeline, geometry often starts in intersection. Animation also frequently causes kinematic surfaces to pinch the cloth between them and drive the cloth through itself. We present a solution for robustly allowing intersection recovery while employing standard continuous detection techniques. Coco also demanded more cloth than any previous Pixar film. To keep up with demand, Fizt needed to run much faster. We share our techniques for gaining performance in linear system assembly and solution, which should be applicable to most implicit solvers.
Related A Safe and Fast Repulsion Method for GPU-based Cloth Self Collisions · Robust Treatment of Collisions, Contact and Friction for a Skinned Cloth Simulation · Efficient and Stable Approach to Elasticity and Collisions for Hair Animation · Untangling Cloth
how to read this ▾ how to read this ▴
- Category
- Production talk: cloth collision and solver performance
- Contributions
-
- Presents a method to robustly recover from initial intersections while still using standard continuous collision detection in Pixar's Fizt cloth system
- Handles cases where animation pinches cloth between kinematic surfaces or drives cloth through itself
- Shares performance gains in implicit-solver linear system assembly and solution to meet Coco's high cloth demand
- Context
- Improves on robust skinned-cloth collision and friction handling (Bridson et al., Robust Treatment of Collisions, Contact and Friction for a Skinned Cloth Simulation) within Pixar's in-house Fizt solver. Builds on: Robust Treatment of Collisions, Contact and Friction for a Skinned Cloth Simulation
- Correctness
- Studio practice, not peer-reviewed; results are production-proven on Coco's dense skeleton cast, and the intersection-recovery and speedup techniques are framed as broadly applicable but reported in a production context.
- Clarity
- Accessible talk; a single read conveys the problems and solutions, with the solver-assembly speedups stated at a practitioner level.
- How to read it
- Focus on the intersection-recovery strategy under continuous detection and the linear-solve speedups; useful as a reference if you maintain an implicit cloth solver, otherwise one pass suffices.
CFX
-
Ubisoft La Forge demonstrated how neural networks can build scalable animation systems for games, covering Phase-Functioned and Mode-Adaptive Neural Networks for character locomotion.
Motion Synthesis / ML Deformation
-
Blue Sky Studios technique that fragments character geometry into independently evaluating pieces processed in parallel, then recombined via SubDeform to hit 24 fps targets.
abstract ▾ abstract ▴
Describes the ChopRig system for optimizing character rig evaluation in parallel by breaking up geometry and deformer stacks across multiple processing cores. The system pre-separates character geometry and distributes deformers to chopped pieces, then combines them using a lightweight SubDeform plugin, achieving interactive performance above 24 fps.
Related LibEE: A Multithreaded Dependency Graph for Character Animation · Group Based Rigging of Realistically Feathered Wings · Interactive Sculpting of Digital Faces Using an Anatomical Modeling Paradigm · It's a UVN Face Rig, Charlie Brown: Facial Techniques for Peanuts
how to read this ▾ how to read this ▴
- Category
- Production technique: parallel rig evaluation
- Contributions
-
- Describes the ChopRig system that fragments character geometry and deformer stacks so pieces evaluate independently across cores
- Pre-separates geometry and distributes deformers to chopped pieces, then recombines them with a lightweight SubDeform plugin
- Targets interactive rig playback above 24 fps
- Context
- Relates to Blue Sky Studios' real-time rig efforts (Hallac, Achieving and Maintaining Real-Time Rigs) and the general goal of parallelizing deformer evaluation for interactive performance. Builds on: Achieving and Maintaining Real-Time Rigs
- Correctness
- Studio practice, not peer-reviewed; presented as a production-proven approach, so the 24 fps target reflects a specific pipeline and the chopping/recombination strategy must preserve correct deformation at seams.
- Clarity
- Accessible write-up (Medium article) describing the system at a conceptual level rather than a formal specification.
- How to read it
- Read for the chop-and-recombine concept and the role of SubDeform; one pass is enough unless you are designing a parallel rig-eval system, in which case probe how seam continuity is maintained.
Rigging
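The chop-and-recombine idea in the entry above can be sketched with the standard library: split the point list into pieces, deform each piece independently, then write the results back into one array. This is a toy illustration under stated assumptions, not ChopRig or SubDeform: the round-robin split, the `lift` deformer, and all function names are hypothetical, and pure-Python threads will not show a real speedup because of the interpreter lock.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Optional, Tuple

Point = Tuple[float, float, float]


def chop(point_count: int, piece_count: int) -> List[List[int]]:
    """Assign point indices to pieces round-robin. A production system
    would chop along meaningful regions; this split is for illustration."""
    return [list(range(i, point_count, piece_count))
            for i in range(piece_count)]


def deform_piece(points: List[Point], indices: List[int],
                 deformer: Callable[[Point], Point]) -> List[Tuple[int, Point]]:
    """Evaluate one piece independently of the others."""
    return [(i, deformer(points[i])) for i in indices]


def evaluate_chopped(points: List[Point], deformer: Callable[[Point], Point],
                     piece_count: int = 4) -> List[Point]:
    """Deform pieces concurrently, then recombine by original index."""
    pieces = chop(len(points), piece_count)
    result: List[Optional[Point]] = [None] * len(points)
    with ThreadPoolExecutor(max_workers=piece_count) as pool:
        for piece in pool.map(lambda idx: deform_piece(points, idx, deformer),
                              pieces):
            for i, p in piece:
                result[i] = p
    return result  # every slot is filled because the pieces cover all indices


def lift(p: Point) -> Point:
    """Hypothetical per-point deformer: raise each point by one unit."""
    return (p[0], p[1] + 1.0, p[2])


if __name__ == "__main__":
    mesh = [(float(i), 0.0, 0.0) for i in range(10)]
    print(evaluate_chopped(mesh, lift, piece_count=3))
```

This only works unchanged when the deformer is per-point; deformers that look at neighbors are where seam continuity between pieces becomes the hard part.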
- Clean Cloth Inputs: Removing Character Self-Intersections with Volume Simulation · SIGGRAPH · Pixar · 12 cites
Applies volume simulation to character meshes before cloth simulation to eliminate self-intersections, used in production on Pixar's short Bao.
abstract ▾ abstract ▴
Simulation artists frequently work with characters that self-intersect. When these characters are sent as inputs to a cloth simulator, the results can often contain terrible artifacts that must be addressed by tediously sculpting either the input characters or the output cloth. In this talk, we apply volume simulation to character meshes and remove self-intersections before they are sent to the cloth simulator. The technique has successfully dealt with very challenging animation scenarios in a production setting, and was applied to all the characters on the short film, Bao.
Related Dynamic Deformables: Implementation and Production Practicalities · Nonlinear Cloth Simulation with Isogeometric Analysis · Untangling Cloth · Fast Cloth Simulation on Moving Humanoids
how to read this ▾ how to read this ▴
- Category
- Production talk: input cleanup for cloth simulation
- Contributions
-
- Applies volume simulation to character meshes to remove self-intersections before cloth simulation
- Reduces the tedious manual sculpting of input characters or output cloth caused by self-intersecting inputs
- Reports successful use on challenging animation on Pixar's short film Bao
- Context
- Builds on cloth-untangling ideas (Baraff et al., Untangling Cloth) but moves the intervention upstream, resolving intersections on the character input rather than on the simulated cloth. Builds on: Untangling Cloth
- Correctness
- Studio practice, not peer-reviewed; results are production-proven on a single short (Bao), so robustness claims are tied to the cases encountered there and depend on the volume sim resolving intersections without distorting the intended pose.
- Clarity
- Accessible short talk; a single read conveys the idea and where it sits in the pipeline.
- How to read it
- Read for the insight of cleaning the character input before the cloth solver; one pass suffices, with attention to how volume sim trades off intersection removal against shape fidelity.
CFX
-
EA presented a predictive-contact approach to cloth self-collision that supports large simulation timesteps with only a single collision sweep, costing a few milliseconds per CPU thread.
CFX
-
Cubic Motion CEO traces the evolution of their markerless facial tracking and mocap-solving AI across three real-time digital human projects, showing advances in performance fidelity.
Retargeting / Facial
-
Autoregressive RNN conditioned on target keyframes automatically completes motions matching sparse user-specified poses; won MIG 2018 Best Paper.
abstract ▾ abstract ▴
We explore the potential of learned autocompletion methods for synthesizing animated motions from input keyframes. Our model uses an autoregressive two-layer recurrent neural network that is conditioned on target keyframes. The model is trained on the motion characteristics of example motions and sampled keyframes from those motions. Given a set of desired key frames, the trained model is then capable of generating motion sequences that interpolate the keyframes while following the style of the examples observed in the training corpus. We demonstrate our method on a hopping lamp, using a diverse set of hops from a physics-based model as training data. The model can then synthesize new hops based on a diverse range of keyframes. We discuss the strengths and weaknesses of this type of approach in some detail.
Related Character Motion Synthesis by Topology Coordinates · Motion Grammars for Character Animation · Dog Code: Human to Quadruped Embodiment Using Shared Codebooks · Physically Based Motion Transformation
How to read this
- Category
- Method: data-driven keyframe autocompletion (learned animation synthesis)
- Contributions
- Uses an autoregressive two-layer recurrent network conditioned on target keyframes to autocomplete motion
- Trains on example motions plus keyframes sampled from them so generated motion interpolates keys in the examples' style
- Demonstrates synthesis of new hops for a hopping-lamp character from a diverse set of keyframes (MIG 2018 Best Paper)
- Context
- Sits between learned motion manifolds (Holden et al., Learning Motion Manifolds with Convolutional Autoencoders) and classic keyframe animation (Burtnyk and Wein, Interactive Skeleton Techniques), framing autocompletion as conditioned sequence generation. Builds on: Learning Motion Manifolds with Convolutional Autoencoders · Interactive Skeleton Techniques for Enhancing Motion Dynamics in Key Frame Animation
- Correctness
- Demonstrated on a single hopping-lamp character with physics-based training data; the authors themselves discuss strengths and weaknesses, so generalization beyond this constrained, exploratory setting should not be assumed.
- Clarity
- Accessible and candid about limitations; a first pass conveys the idea, a second pass clarifies the conditioning and training setup.
- How to read it
- Focus on how keyframe conditioning shapes the RNN output and on the stated weaknesses; a second pass is worthwhile mainly for the training/sampling details rather than for transferable results.
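The conditioning described above can be sketched as an autoregressive rollout in which every step sees the previous output, the next target keyframe, and the number of frames left until that keyframe. This is a minimal illustration, not the authors' model: it uses a single recurrent layer rather than two, a one-dimensional pose, and untrained random weights, and every name here (`make_cell`, `step`, `autocomplete`, the input layout) is hypothetical.

```python
import math
import random


def make_cell(input_size, hidden_size, seed=0):
    """Build a tiny recurrent cell with random (untrained) weights."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        "W_in": mat(hidden_size, input_size),
        "W_h": mat(hidden_size, hidden_size),
        "W_out": mat(1, hidden_size),
    }


def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def step(cell, hidden, inputs):
    """One recurrent step: update the hidden state, emit a pose delta."""
    pre = [a + b for a, b in zip(matvec(cell["W_in"], inputs),
                                 matvec(cell["W_h"], hidden))]
    hidden = [math.tanh(x) for x in pre]
    delta = matvec(cell["W_out"], hidden)[0]
    return hidden, delta


def autocomplete(cell, start_pose, keyframes, hidden_size):
    """Roll out a motion; keyframes is a sorted list of (frame, pose)."""
    hidden = [0.0] * hidden_size
    pose, frame = start_pose, 0
    motion = [pose]
    for key_frame, key_pose in keyframes:
        while frame < key_frame:
            frames_left = float(key_frame - frame)
            # Conditioning: previous output plus the upcoming target keyframe.
            inputs = [pose, key_pose, key_pose - pose, frames_left]
            hidden, delta = step(cell, hidden, inputs)
            pose += delta  # autoregressive: the output feeds the next step
            frame += 1
            motion.append(pose)
    return motion


cell = make_cell(input_size=4, hidden_size=8)
motion = autocomplete(cell, 0.0, [(5, 1.0), (12, 0.0)], hidden_size=8)
```

With untrained weights the rollout will not actually hit the keyframes; training is what teaches the network to use the target and the remaining-frames signal.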
Motion Synthesis
-
Deep generative model for face appearance and geometry enabling photorealistic rendering and synthesis of facial expressions.
Abstract
We introduce a deep appearance model for rendering the human face. Inspired by Active Appearance Models, we develop a data-driven rendering pipeline that learns a joint representation of facial geometry and appearance from a multiview capture setup. Vertex positions and view-specific textures are modeled using a deep variational autoencoder that captures complex nonlinear effects while producing a smooth and compact latent representation. View-specific texture enables the modeling of view-dependent effects such as specularity. In addition, it can also correct for imperfect geometry stemming from biased or low resolution estimates. This is a significant departure from the traditional graphics pipeline, which requires highly accurate geometry as well as all elements of the shading model to achieve realism through physically-inspired light transport. Acquiring such a high level of accuracy is difficult in practice, especially for complex and intricate parts of the face, such as eyelashes and the oral cavity. These are handled naturally by our approach, which does not rely on precise estimates of geometry. Instead, the shading model accommodates deficiencies in geometry through the flexibility afforded by the neural network employed. At inference time, we condition the decoding network on the viewpoint of the camera in order to generate the appropriate texture for rendering.
Related Facial Animation with Disentangled Identity and Motion using Transformers · Learning a Generalized Physical Face Model From Data · CANRIG: Cross-Attention Neural Face Rigging with Variable Local Control · Acquiring the Reflectance Field of a Human Face
How to read this
- Category
- Method: deep generative model for face rendering
- Contributions
- Introduces a deep appearance model that jointly learns facial geometry and appearance from a multiview capture setup
- Models vertex positions and view-specific textures with a deep variational autoencoder, capturing view-dependent effects like specularity in a compact latent space
- Tolerates imperfect or low-resolution geometry by letting the shading model compensate, avoiding the traditional pipeline's need for highly accurate geometry
- Context
- Inspired by Active Appearance Models and related to data-driven face capture and reenactment (Thies et al., Face2Face), departing from physically-based light-transport rendering. Builds on: Face2Face: Real-Time Face Capture and Reenactment of RGB Videos
- Correctness
- Trained from a specific multiview capture rig; the key assumption is that view-specific texture can absorb geometric error, so results depend on capture coverage and the approach is demonstrated for face rendering rather than as a general light-transport replacement.
- Clarity
- Reasonably accessible given the AAM framing; a first pass conveys the idea, a second pass is needed for the VAE architecture and view conditioning.
- How to read it
- Focus on the joint geometry-plus-view-dependent-texture VAE and why precise geometry becomes unnecessary; do a second pass on the model details if rendering or capture is your focus.
Facial / ML Deformation
- DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills SIGGRAPH Academic 579 cites project ↗
Physics-based character controller using deep RL to imitate reference motion clips, achieving diverse athletic skills with natural-looking dynamics.
Abstract
A longstanding goal in character animation is to combine data-driven specification of behavior with a system that can execute a similar behavior in a physical simulation, enabling realistic responses to perturbations and environmental variation. We show that well-known reinforcement learning methods can be adapted to learn robust control policies capable of imitating a broad range of example motion clips, while also learning complex recoveries, adapting to changes in morphology, and accomplishing user-specified goals. The method handles keyframed motions, highly dynamic actions such as motion-captured flips and spins, and retargeted motions. By combining a motion-imitation objective with a task objective, characters can be trained to react intelligently in interactive settings.
Related SFV: Reinforcement Learning of Physical Skills from Video · AMP: Adversarial Motion Priors for Stylized Physics-Based Character Control · C·ASE: Learning Conditional Adversarial Skill Embeddings for Physics-based Characters · SuperTrack: Motion Tracking for Physically Simulated Characters Using Supervisory Signals
How to read this
- Category
- Method: physics-based character control via deep reinforcement learning
- Contributions
- Adapts deep RL to learn control policies that imitate a broad range of reference motion clips in physics simulation
- Combines a motion-imitation objective with a task objective so characters pursue user goals while staying natural
- Handles keyframed, highly dynamic (flips, spins), and retargeted motions, and learns complex recoveries plus adaptation to morphology changes
- Context
- Builds on the authors' earlier deep-RL locomotion work (Peng et al., Terrain-Adaptive Locomotion Skills Using Deep Reinforcement Learning), advancing example-guided imitation of reference motion. Builds on: Terrain-Adaptive Locomotion Skills Using Deep Reinforcement Learning
- Correctness
- Demonstrated across diverse athletic skills with natural-looking dynamics and perturbation recovery; results depend on quality reference clips and per-skill training, and the imitation-plus-task objective balance is central to the behavior obtained.
- Clarity
- Accessible in motivation; a first pass conveys the imitation-plus-task idea, a second pass is needed for the reward design and training setup.
- How to read it
- Focus on the reward formulation (imitation plus task) and how reference clips are used; a second pass pays off for the RL training details if you plan to reproduce or extend it.
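The imitation-plus-task reward can be sketched as a weighted sum of two terms, each mapped into (0, 1] by an exponential of a squared error. This is an illustrative shape only: the function names, the error measures, the scale factors, and the 0.7 / 0.3 weights are assumptions made for the example and should be checked against the paper before reuse.

```python
import math


def imitation_reward(pose, ref_pose, scale=2.0):
    """Close to 1 when the simulated pose matches the reference clip."""
    err = sum((p - r) ** 2 for p, r in zip(pose, ref_pose))
    return math.exp(-scale * err)


def task_reward(position, target, scale=1.0):
    """Close to 1 when the character satisfies the user-specified goal."""
    dist_sq = sum((p - t) ** 2 for p, t in zip(position, target))
    return math.exp(-scale * dist_sq)


def combined_reward(pose, ref_pose, position, target,
                    w_imitation=0.7, w_task=0.3):
    """Weighted sum: stay natural (imitation) while pursuing the goal (task)."""
    return (w_imitation * imitation_reward(pose, ref_pose)
            + w_task * task_reward(position, target))


# Perfect tracking at the target scores about 1; drifting lowers the reward.
on_track = combined_reward([0.1, 0.2], [0.1, 0.2], [1.0, 0.0], [1.0, 0.0])
drifting = combined_reward([0.5, 0.2], [0.1, 0.2], [0.0, 0.0], [1.0, 0.0])
```

Shifting weight toward the task term makes the character pursue the goal more aggressively at the cost of fidelity to the reference clip, which is the balance the reading note above points at.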
Motion Synthesis
- Dynamic Kelvinlets: Secondary Motions Based on Fundamental Solutions of Elastodynamics SIGGRAPH Pixar 7 cites
Real-time physically based secondary animation using analytic elastodynamic solutions, enabling compressive and shear wave effects on characters.
Abstract
We introduce Dynamic Kelvinlets, a new analytical technique for real-time physically based animation of virtual elastic materials. Our formulation is based on the dynamic response to time-varying force distributions applied to an infinite elastic medium. The resulting displacements provide the plausibility of volumetric elasticity, the dynamics of compressive and shear waves, and the interactivity of closed-form expressions. Our approach builds upon the work of de Goes and James [2017] by presenting an extension of the regularized Kelvinlet solutions from elastostatics to the elastodynamic regime. To finely control our elastic deformations, we also describe the construction of compound solutions that resolve pointwise and keyframe constraints. We demonstrate the versatility and efficiency of our method with a series of examples in a production grade implementation.
Related Interactive Skeleton-Driven Dynamic Deformations · Regularized Kelvinlets: Sculpting Brushes Based on Fundamental Solutions of Elasticity · Complementary Dynamics · Physically Based Rigging for Deformable Characters
How to read this
- Category
- Method: an analytic elastodynamic deformation technique for real-time secondary motion
- Contributions
- Extends the regularized Kelvinlet solutions from elastostatics to the elastodynamic regime, giving closed-form displacements with compressive and shear wave dynamics
- Constructs compound solutions that resolve pointwise and keyframe constraints for fine control
- Demonstrates the approach in a production-grade implementation with interactive, physically plausible volumetric elasticity
- Context
- Directly extends de Goes and James 2017 (Regularized Kelvinlets) from static sculpting to time-varying elastodynamics, sitting in the lineage of fundamental-solution (Green's function) approaches to elasticity. Builds on: Regularized Kelvinlets: Sculpting Brushes Based on Fundamental Solutions of Elasticity
- Correctness
- Built on the idealization of an infinite elastic medium with analytic responses, so it targets plausible secondary motion rather than a full boundary-respecting FEM solve; readers should remember it trades exact domain and boundary handling for closed-form speed and control.
- Clarity
- Reasonably accessible at the concept level; a first pass conveys the idea (analytic waves driven by force distributions), and a second pass is needed to follow the elastodynamic formulation and compound-constraint construction.
- How to read it
- Skim the figures and the static-to-dynamic extension first; if you intend to implement it, do a second pass on the elastodynamic kernels and the pointwise/keyframe constraint solve.
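For orientation before tackling the dynamic kernels, the sketch below evaluates the elastostatic regularized Kelvinlet that the dynamic formulation extends: a closed-form displacement at offset r from a regularized point force f in an infinite elastic medium. The form used is u(r) = [(a - b)/r_e + a e^2 / (2 r_e^3)] f + (b / r_e^3)(r . f) r, with r_e = sqrt(|r|^2 + e^2), a = 1/(4 pi mu) and b = a / (4(1 - nu)), as it is commonly stated for de Goes and James 2017; treat the coefficients as something to verify against that paper. It is the static building block only, not the dynamic solution, and the function and parameter names are hypothetical.

```python
import math


def regularized_kelvinlet(r, force, epsilon=1.0,
                          shear_modulus=1.0, poisson_ratio=0.45):
    """Displacement at offset r caused by a regularized point force."""
    a = 1.0 / (4.0 * math.pi * shear_modulus)
    b = a / (4.0 * (1.0 - poisson_ratio))
    # Regularized distance: never zero, so the kernel stays finite at r = 0.
    r_eps = math.sqrt(sum(c * c for c in r) + epsilon * epsilon)
    r_dot_f = sum(rc * fc for rc, fc in zip(r, force))
    isotropic = (a - b) / r_eps + 0.5 * a * epsilon ** 2 / r_eps ** 3
    directional = b / r_eps ** 3
    return [isotropic * fc + directional * r_dot_f * rc
            for rc, fc in zip(r, force)]


# Displacement is largest at the brush centre and falls off with distance.
at_centre = regularized_kelvinlet([0.0, 0.0, 0.0], [1.0, 0.0, 0.0], epsilon=0.5)
far_away = regularized_kelvinlet([3.0, 0.0, 0.0], [1.0, 0.0, 0.0], epsilon=0.5)
```

Because the expression is closed-form, evaluation cost is per query point and independent of any mesh or solver state, which is where the interactivity comes from.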
Rigging / ML Deformation
-
Neural networks learn a film rig's nonlinear deformations so that the approximated character runs interactively; widely seen as the paper that opened the ML-deformer era.
Abstract
Character rigs are procedural systems that compute the shape of an animated character for a given pose. They can be highly complex and must account for bulges, wrinkles, and other aspects of a character's appearance. When comparing film-quality character rigs with those designed for real-time applications, there is typically a substantial and readily apparent difference in the quality of the mesh deformations. Real-time rigs are limited by a computational budget and often trade realism for performance. Rigs for film do not have this same limitation, and character riggers can make the rig as complicated as necessary to achieve realistic deformations. However, increasing the rig complexity slows rig evaluation, and the animators working with it can become less efficient and may experience frustration. In this paper, we present a method to reduce the time required to compute mesh deformations for film-quality rigs, allowing better interactivity during animation authoring and use in real-time games and applications. Our approach learns the deformations from an existing rig by splitting the mesh deformation into linear and nonlinear portions. The linear deformations are computed directly from the transformations of the rig's underlying skeleton. We use deep learning methods to approximate the remaining nonlinear portion.
Related Fast and Deep Facial Deformations · Delta Mush: Smoothing Deformations While Preserving Detail · Skinning: Real-time Shape Deformation · NeuroSkinning: Automatic Skin Binding for Production Characters with Deep Graph Networks
How to read this
- Category
- Method: a learned (neural) approximation of film-rig mesh deformation
- Contributions
- Learns a film-quality character rig's deformations from existing rig output, enabling much faster mesh evaluation for interactive authoring and real-time use
- Splits the deformation into a linear part computed directly and a nonlinear part predicted by neural networks
- Narrows the visible quality gap between film and real-time rigs by making film-quality deformations fast enough to evaluate interactively
- Context
- Relates to skeleton-driven and example-based deformation, building on Pose Space Deformation (Lewis et al. 2000) and skinning decomposition such as Smooth Skinning Decomposition with Rigid Bones (Le and Deng 2012); widely seen as opening the ML-deformer era. Builds on: Pose Space Deformation: A Unified Approach to Shape Interpolation and Skeleton-Driven Deformation · Smooth Skinning Decomposition with Rigid Bones
- Correctness
- The approximation is learned from one specific source rig, so quality is bounded by training pose coverage and the rig it imitates; it approximates rather than reproduces the original rig, and extrapolation to unseen poses is the natural limitation to keep in mind.
- Clarity
- Accessible and well-motivated; a first pass conveys the linear-plus-nonlinear split clearly, and a second pass pays off for the network design and training details.
- How to read it
- Focus on the linear/nonlinear decomposition and how the nonlinear residual is learned; a second pass is worth it if you plan to train your own deformer, otherwise the abstract and figures carry the idea.
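The linear-plus-nonlinear split can be sketched as linear blend skinning computed directly from bone transforms, plus a per-vertex residual supplied by a learned model. The sketch works in 2D for brevity. The residual model here is a hand-written stand-in for a trained network, and adding the residual after skinning is an assumption made for the example; the paper may apply it in a different space. All names are hypothetical.

```python
import math


def bone_transform(angle, tx, ty):
    """2D rigid transform stored as two rows of a 2x3 matrix."""
    c, s = math.cos(angle), math.sin(angle)
    return ((c, -s, tx), (s, c, ty))


def apply_transform(transform, point):
    (a, b, tx), (c, d, ty) = transform
    return (a * point[0] + b * point[1] + tx,
            c * point[0] + d * point[1] + ty)


def linear_skin(rest_point, weights, transforms):
    """Linear part: weighted blend of the bone-transformed rest position."""
    x = y = 0.0
    for w, t in zip(weights, transforms):
        px, py = apply_transform(t, rest_point)
        x += w * px
        y += w * py
    return (x, y)


def deform(rest_point, weights, transforms, residual_model, pose_features):
    """Full deformation = linear skinning + learned nonlinear residual."""
    lx, ly = linear_skin(rest_point, weights, transforms)
    dx, dy = residual_model(pose_features)
    return (lx + dx, ly + dy)


def toy_residual(pose_features):
    """Stand-in for a trained network: a bulge that grows with the bend."""
    bend = pose_features[0]
    return (0.0, 0.05 * math.sin(bend))


bones = [bone_transform(0.0, 0.0, 0.0), bone_transform(0.6, 0.0, 0.0)]
vertex = deform((1.0, 0.2), [0.5, 0.5], bones, toy_residual, [0.6])
```

Keeping the linear part analytic means the network only has to learn the comparatively small correction, which is the reason the split is worth studying on a first read.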
ML Deformation / Skinning
-
Operator-splitting solver for corotated FEM that separates the corotated linear deformation energy into a stretching term and a volume-preservation term, supporting large time steps.
Abstract
This paper introduces an operator-splitting approach for corotated FEM simulation by separating the corotated linear deformation energy into a stretching term and a volume-preservation term. Formulating backward Euler as an optimization shows the stretching term is rotation invariant, so it can be solved accurately with a precomputed Cholesky factorization while the volume term is handled with compliant constraints and Gauss-Seidel iterations. The result is a fast, stable solver for elastic solids and shells that supports large time steps.
Related Multi-Resolution Isotropic Strain Limiting · Flesh, Flab, and Fascia Simulation on Zootopia · Robust Treatment of Degenerate Elements in Interactive Corotational FEM Simulations · Loki: A Unified Multiphysics Simulation Framework for Production
How to read this
- Category
- Method: an operator-splitting solver for corotated FEM
- Contributions
- Splits the corotated linear deformation energy into a stretching term and a volume-preservation term
- Shows the stretching term is rotation invariant so it is solved accurately with a precomputed Cholesky factorization, while volume is handled via compliant constraints and Gauss-Seidel
- Yields a fast, stable solver for elastic solids and shells that supports large time steps
- Context
- Sits in the corotated linear FEM and constraint-based simulation lineage, relating to efficient character-skinning elasticity such as McAdams et al. 2011 (Efficient Elasticity for Character Skinning with Contact and Collisions). Builds on: Efficient Elasticity for Character Skinning with Contact and Collisions
- Correctness
- Relies on the corotated linear energy model and a precomputed factorization, so accuracy depends on that linearization holding and on the Gauss-Seidel volume iterations; demonstrated on elastic solids and shells, but it is an approximation, not a fully nonlinear hyperelastic solve.
- Clarity
- Moderately technical; a first pass conveys the splitting intuition, and a second pass is needed to follow the backward-Euler-as-optimization derivation and the rotation-invariance argument.
- How to read it
- Read for the energy-splitting idea and why the stretching term reuses one Cholesky factorization; do a second pass on the optimization formulation if you implement implicit integration.
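Why a constant system matrix matters can be seen in a small sketch: factor once, then reuse the factor for every new right-hand side with two cheap triangular solves. This illustrates only the factorization-reuse pattern on a tiny dense matrix; it is not the paper's solver, and the matrix, right-hand sides and function names are made up for the example.

```python
import math


def cholesky(A):
    """Return lower-triangular L with A = L L^T (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def solve_with_factor(L, b):
    """Solve A x = b using the precomputed factor L."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):  # forward substitution: L y = b
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution: L^T x = y
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


# The system matrix does not change between steps, so factor it once...
A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
L = cholesky(A)
# ...and reuse the factor for each step's right-hand side.
solutions = [solve_with_factor(L, rhs)
             for rhs in ([1.0, 2.0, 3.0], [0.0, -1.0, 0.5])]
```

A production solver would use a sparse factorization, but the cost structure is the same: the expensive step happens once, and each subsequent solve is cheap.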
CFX
-
Animal Logic integrates USD into Maya via AL_USDMaya for Peter Rabbit, animating 5 hero characters across 1100 shots with a production-proven USD-based animation system.
Abstract
This talk presents practical, production-proven solutions for integrating Pixar's Universal Scene Description (USD) into Autodesk Maya through the open-sourced AL_USDMaya plugin, used to build a high-performance animation platform for the Peter Rabbit movie. A proprietary Forge software stack generates USD files and exposes both simple and advanced views, making USD the owner of the scene data streamed directly into the Maya viewport by Pixar's Hydra renderer. Custom translators, a Python command system, and Forge Options expose rig levels of detail as USD variants and bundle prim states so animators can swap geometry caches for rigs with a single click.
Related Using USD in Pixar's Digital Backlot · Zero to USD in 80 Days · USD at Scale · Universal Scene Description: Open Source Release
How to read this
- Category
- Production talk: a USD-based animation pipeline integration
- Contributions
- Demonstrates integrating Pixar's USD into Maya via the open-sourced AL_USDMaya plugin to build a high-performance animation platform for Peter Rabbit
- Shows a proprietary Forge stack generating USD and streaming scene data into the Maya viewport through Pixar's Hydra renderer
- Exposes rig levels of detail as USD variants with custom translators and a Python command system, and bundles prim states so animators can swap geometry caches for rigs with a single click
- Context
- Builds on the open-source release of Universal Scene Description (Pixar 2016), applying it as the backbone of a studio animation pipeline inside Autodesk Maya. Builds on: Universal Scene Description: Open Source Release
- Correctness
- Studio practice, not peer-reviewed; the solutions are production-proven on a specific film and toolchain (Maya, AL_USDMaya, Hydra, the proprietary Forge stack), so generality to other pipelines is not the claim.
- Clarity
- Accessible to pipeline and TD readers; a single pass conveys the architecture, with details mostly in the workflow rather than in math.
- How to read it
- Read once for the pipeline architecture (Forge to USD to Hydra in Maya) and the LOD-as-variant pattern; useful as a reference if you are wiring USD into a DCC, no formal second pass needed.
Rigging
-
Reconstructs simulatable strands from artist-created hair meshes and applies style-specific simulation per cluster for efficiency.
Abstract
As the deformation behaviors of hair strands vary greatly depending on the hairstyle, the computational cost and accuracy of hair movement simulations can be significantly improved by applying simulation methods specific to a certain style. This paper makes two contributions with regard to the simulation of various hair styles. First, we propose a novel method to reconstruct simulatable hair strands from hair meshes created by artists. Manually created hair meshes consist of numerous mesh patches, and the strand reconstruction process is challenged by the absence of connectivity information among the patches for the same strand and the omission of hidden parts of strands due to the manual creation process. To this end, we develop a two‐stage spectral clustering method for estimating the degree of connectivity among patches and a strand‐growing method that preserves hairstyles. Next, we develop a hairstyle classification method for style‐specific simulations. In particular, we propose a set of features for efficient classifications and show that classifiers trained with the proposed features have higher accuracy than those trained with naive features. Our method applies efficient simulation methods according to the hairstyle without specific user input, and thus is favorable for real‐time simulation.
Related Artistic Simulation of Curly Hair · The Art and Technology of Hair Simulation in Disney's Moana · Strand-Accurate Multi-View Hair Capture · Hair Meshes
How to read this
- Category
- Method: strand reconstruction from artist hair meshes plus style-specific simulation
- Contributions
- Reconstructs simulatable hair strands from artist-created hair meshes using two-stage spectral clustering plus a hairstyle-preserving strand-growing method
- Proposes a hairstyle classification method with features that classify more accurately than naive features
- Applies an efficient, style-specific simulation method per cluster to reduce computational cost and improve accuracy
- Context
- No explicit prior works are listed; it relates generally to hair-meshing/strand-reconstruction and physically based hair simulation, with the novel twist of selecting a simulation method based on classified hairstyle.
- Correctness
- Assumes mesh patches can be reliably grouped into strands despite missing connectivity and hidden geometry, and that style classification maps cleanly to an appropriate simulation method; reconstruction quality on messy or unusual artist meshes is the limitation to watch.
- Clarity
- Fairly accessible; a first pass conveys the two-contribution structure (reconstruction, then style-based simulation), and a second pass clarifies the clustering and feature definitions.
- How to read it
- Focus first on the strand-reconstruction pipeline, then on the classification features; second pass worthwhile mainly if you handle artist-authored hair meshes.
CFX
-
HairControl adds artistic control to physics-based hair simulation by constraining a high-resolution detail-hair simulation to follow an input animation of coarse guide hairs.
Abstract
HairControl adds artistic control to physics-based hair simulation by constraining a high-resolution detail-hair simulation to follow an input animation of coarse guide hairs. The core of the method is a set of tracking constraints that require the center of mass of a subset of detail hairs to maintain its position relative to a reference point on the corresponding guide hair. This lets artists direct overall hair motion while preserving secondary dynamics and natural collisions in the detailed simulation.
Related Physics-Inspired Upsampling for Cloth Simulation in Games · Animating Puss in Boots' Feather in Shrek 2 · Hummingbird: DreamWorks Feather System · Towards Realtime: A Hybrid Physics-based Method for Hair Animation on GPU
How to read this
- Category
- Method: directable hair simulation via tracking constraints
- Contributions
- Adds artistic control to physics-based hair by constraining a high-resolution detail-hair simulation to follow an animation of coarse guide hairs
- Introduces tracking constraints that hold the center of mass of a subset of detail hairs relative to a reference point on the corresponding guide hair
- Lets artists direct overall hair motion while preserving secondary dynamics and natural collisions
- Context
- Continues the directable/art-directed hair lineage, building on Directing Hair Motion on Tangled (Simmons et al. 2011) by tracking guide-hair animation rather than fully re-simulating. Builds on: Directing Hair Motion on Tangled
- Correctness
- Assumes a meaningful guide-to-detail correspondence and that center-of-mass tracking is the right control handle; the input guide animation drives results, so quality depends on guide quality and how aggressively tracking competes with natural dynamics.
- Clarity
- Accessible; the core constraint idea is intuitive and a first pass conveys it, with a second pass useful for the constraint formulation and solver coupling.
- How to read it
- Read for the tracking-constraint definition and how it balances direction against secondary motion; a second pass pays off if you implement guide-driven hair control.
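The tracking constraint can be sketched as follows: record where the centre of mass of a group of detail-hair points sits relative to a reference point on its guide hair, then, as the guide animates, nudge the group so its centre of mass returns to that relative position. This is a simplified positional correction rather than the paper's constraint formulation; equal point masses, a world-space offset, the `stiffness` blend, and all function names are assumptions made for the example.

```python
def center_of_mass(points):
    """Average position, assuming every point has equal mass."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def rest_offset(detail_points, guide_ref):
    """Offset of the group's centre of mass from the guide reference point."""
    com = center_of_mass(detail_points)
    return tuple(c - g for c, g in zip(com, guide_ref))


def track(detail_points, guide_ref, offset, stiffness=1.0):
    """Shift the group so its centre of mass follows the animated guide."""
    com = center_of_mass(detail_points)
    target = tuple(g + o for g, o in zip(guide_ref, offset))
    correction = tuple(stiffness * (t - c) for t, c in zip(target, com))
    # A uniform shift moves the centre of mass but keeps the relative
    # positions of the points, i.e. the simulated secondary detail.
    return [tuple(p[i] + correction[i] for i in range(3)) for p in detail_points]


detail = [(0.0, 0.0, 0.0), (0.2, 0.1, 0.0), (0.1, 0.3, 0.0)]
offset = rest_offset(detail, guide_ref=(0.0, 0.0, 0.0))
# The guide reference point has moved; pull the detail group along with it.
tracked = track(detail, guide_ref=(1.0, 0.5, 0.0), offset=offset, stiffness=1.0)
```

Lowering `stiffness` lets the simulated motion deviate further from the guide, which is the direction-versus-dynamics trade-off the reading note above asks you to watch for.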
CFX
-
CNN that maps a 2D orientation field to scalp-parameterized strand features; generates a 30K-strand hairstyle in real time and introduces a collision loss.
Abstract
We introduce a deep learning-based method to generate full 3D hair geometry from an unconstrained image. Our method can recover local strand details and has real-time performance. State-of-the-art hair modeling techniques rely on large hairstyle collections for nearest neighbor retrieval and then perform ad-hoc refinement. Our deep learning approach, in contrast, is highly efficient in storage and can run 1000 times faster while generating hair with 30K strands. The convolutional neural network takes the 2D orientation field of a hair image as input and generates strand features that are evenly distributed on the parameterized 2D scalp. We introduce a collision loss to synthesize more plausible hairstyles, and the visibility of each strand is also used as a weight term to improve the reconstruction accuracy. The encoder-decoder architecture of our network naturally provides a compact and continuous representation for hairstyles, which allows us to interpolate naturally between hairstyles. We use a large set of rendered synthetic hair models to train our network. Our method scales to real images because an intermediate 2D orientation field, automatically calculated from the real image, factors out the difference between synthetic and real hairs.
Related Perm: A Parametric Representation for Multi-Style 3D Hair Modeling · Cloth and Skin Deformation with a Triangle Mesh Based Convolutional Neural Network · Neural Cloth Simulation · A Pixel-Based Framework for Data-Driven Clothing
How to read this
- Category
- Method: single-view 3D hair reconstruction with a CNN
- Contributions
- Generates full 3D hair geometry (30K strands) from a single unconstrained image in real time, far faster than retrieval-based pipelines
- Uses an encoder-decoder CNN that maps a 2D orientation field to strand features evenly distributed on a parameterized 2D scalp
- Introduces a collision loss for more plausible hairstyles and uses per-strand visibility as a reconstruction weight, with a compact latent space allowing interpolation between hairstyles
- Context
- Departs from database nearest-neighbor retrieval methods such as AutoHair (Chai et al. 2016) by learning a direct image-to-strand mapping, training on rendered synthetic hair and bridging to real images via an intermediate 2D orientation field. Builds on: AutoHair: Fully Automatic Hair Modeling from a Single Image
- Correctness
- Trained on synthetic rendered hair and relying on an automatically computed orientation field, so real-image accuracy hinges on that field and on synthetic-to-real generalization; it recovers plausible strands rather than guaranteed ground-truth geometry, especially for occluded interior hair.
- Clarity
- Accessible for readers with CNN background; a first pass conveys the orientation-field-to-strands idea, and a second pass clarifies the scalp parameterization and the collision/visibility losses.
- How to read it
- Read for the input representation (2D orientation field) and the strand decoder plus collision loss; second pass worthwhile if you work on learned hair or single-view reconstruction.
CFX / ML Deformation
-
Introduces hierarchical sculpting controls into Disney's hair pipeline for layered art-direction of hair motion across varying simulation scales.
Abstract
Creating appealing shapes and silhouettes of a character's hair while maintaining the organic motion produced by physical simulation is a challenge in Disney's very stylized animated worlds. This talk describes the introduction of hierarchical sculpting controls into our hair pipeline and presents a set of tools for creating and manipulating this consistent structure to achieve art-directed hair motion. From grooming through animation, simulation and technical animation, hierarchy is leveraged both for efficiency and for preservation of the hairstyle's structure. To date this hierarchical workflow has been used on two feature productions, allowing for the efficient art-direction of a wide variety of hair types and styles.
Related Scriptable Character FX Solution · Art-Directing Asha's Braids in Disney's Wish · Choreography of Hair and Cloth in Disney's Moana 2 · The Art and Technology of Hair Simulation in Disney's Moana
How to read this
- Category
- Production talk: hierarchical art-direction controls for hair
- Contributions
- Introduces hierarchical sculpting controls into Disney's hair pipeline for layered art-direction of hair shape and motion
- Presents tools to create and manipulate a consistent hierarchy spanning grooming, animation, simulation, and technical animation
- Leverages hierarchy for both efficiency and preservation of hairstyle structure across simulation scales
- Context
- Extends Disney's stylized-hair workflow, building on The Art and Technology of Hair Simulation in Disney's Moana (Thyng et al. 2017) toward more art-directable hierarchical control. Builds on: The Art and Technology of Hair Simulation in Disney's Moana
- Correctness
- Studio practice, not peer-reviewed; results are production-proven, reported as used on two feature productions, so the value is in the workflow rather than in validated quantitative claims.
- Clarity
- Accessible; a single pass conveys the hierarchical-control concept and where it slots into the pipeline.
- How to read it
- Read once for how hierarchy is defined and reused from groom through sim; a useful reference for art-directable hair pipelines, no formal second pass needed.
CFX
-
Model-reduction scheme for projective dynamics that combines subspace reduction with cubature-based hyper-reduction to accelerate deformable-object simulation.
Abstract
This work presents a model-reduction scheme for projective dynamics that accelerates the simulation of deformable objects. It combines subspace reduction of the configuration space with a hyper-reduction of the nonlinear force and constraint terms via a fitted cubature approximation. The resulting solver achieves fast, real-time elastic simulation with controllable accuracy and large speedups over full projective dynamics.
Related Projective Dynamics with Dry Frictional Contact · Projective Dynamics: Fusing Constraint Projections for Fast Simulation · Clean Cloth Inputs: Removing Character Self-Intersections with Volume Simulation · A Unified Approach for Subspace Simulation of Deformable Bodies in Multiple Domains
How to read this
- Category
- Method: a model-reduction scheme for projective dynamics
- Contributions
- Presents a hyper-reduced projective dynamics scheme that accelerates deformable-object simulation
- Combines subspace reduction of the configuration space with hyper-reduction of nonlinear force and constraint terms via a fitted cubature approximation
- Achieves fast, real-time elastic simulation with controllable accuracy and large speedups over full projective dynamics
- Context
- Builds directly on Projective Dynamics (Bouaziz et al. 2014), bringing subspace and cubature-style hyper-reduction (familiar from reduced-order elasticity) to that solver. Builds on: Projective Dynamics: Fusing Constraint Projections for Fast Simulation
- Correctness
- Accuracy is bounded by the chosen subspace and the fitted cubature sample set, so it is a controllable approximation rather than the full solve; readers should weigh the speed gains against subspace expressiveness for their deformation range.
- Clarity
- Technical; a first pass conveys the two-stage reduction idea, and a second (and likely third) pass is needed to follow the subspace construction and cubature fitting.
- How to read it
- Read for how subspace reduction and cubature hyper-reduction are combined; plan a careful second pass on the cubature fitting if you intend to reproduce the speedups.
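As a rough illustration of the two-stage reduction summarized above, the toy sketch below restricts a 1D spring chain to a single "uniform stretch" mode (subspace reduction) and then fits one cubature weight so that a single sampled spring stands in for the whole energy sum (hyper-reduction). The chain, the stiffness value, the sample index, and every function name are assumptions made for this example; it is not the paper's solver and it omits the projective-dynamics solve entirely.

```python
# Toy sketch only: subspace reduction plus a one-sample cubature fit.
N = 100   # number of springs in the chain (hypothetical)
K = 10.0  # spring stiffness (hypothetical)


def node_position(i, q):
    """Subspace map x_i(q): rest position i moved along one 'uniform stretch' mode."""
    return i * (1.0 + q)


def spring_energy(i, q):
    """Elastic energy of spring i (rest length 1) at reduced coordinate q."""
    stretch = (node_position(i + 1, q) - node_position(i, q)) - 1.0
    return 0.5 * K * stretch * stretch


def full_energy(q):
    """Reference evaluation that visits every spring."""
    return sum(spring_energy(i, q) for i in range(N))


def fit_cubature_weight(sample, training_qs):
    """Least-squares weight w >= 0 with w * e_sample(q) ~ full_energy(q) on training poses."""
    num = sum(spring_energy(sample, q) * full_energy(q) for q in training_qs)
    den = sum(spring_energy(sample, q) ** 2 for q in training_qs)
    return max(0.0, num / den)


def reduced_energy(q, sample, weight):
    """Hyper-reduced evaluation: one spring instead of all N."""
    return weight * spring_energy(sample, q)


SAMPLE = 42  # index of the single cubature sample (hypothetical choice)
WEIGHT = fit_cubature_weight(SAMPLE, [-0.2, -0.1, 0.1, 0.2])
```

In this symmetric toy every spring stretches identically, so the fitted weight lands close to N and the reduced energy tracks the full one. A practical cubature fit would select several samples and solve a non-negative least-squares problem over many training poses, which is where the accuracy caveat in the Correctness note applies.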
CFX
-
Dual-representation graph engine that separates authoring nodes from evaluation tasks, achieving a 100x authoring speedup over LibEE v1 in the Premo system.
Abstract
The Premo animation platform [Gong et al. 2014] developed by DreamWorks utilized LibEE v1 [Watt et al. 2012] for high performance graph evaluation. The animator experience required fast evaluation, but did not require fast editing of the graph. LibEE v1, therefore, was never designed to support efficient edits. This talk presents an overview of how we developed LibEE v2 to enable fast editing of character rigs while still maintaining or improving upon the speed of evaluation. Overall, LibEE v2 achieves a 100x speedup of authoring operations compared with LibEE v1.
Related LibEE: A Multithreaded Dependency Graph for Character Animation · Stable and Efficient Differential IK · Using Deep Learning to Approximate Joint Placement in 3D Bipedal Characters · A.C.M.E. Multilimb System
How to read this
- Category
- Production talk: a rebuilt dependency-graph engine for character rigs
- Contributions
-
- Presents LibEE v2, a dual-representation graph engine that separates authoring nodes from evaluation tasks
- Enables fast editing of character rigs while maintaining or improving evaluation speed
- Reports a 100x authoring-operation speedup over LibEE v1 within the Premo platform
- Context
- Direct successor to LibEE: A Multithreaded Dependency Graph for Character Animation (Watt et al. 2012), used by DreamWorks' Premo platform, where v1 prioritized evaluation speed over edit speed. Builds on: LibEE: A Multithreaded Dependency Graph for Character Animation
- Correctness
- Studio practice, not peer-reviewed; the 100x figure is a production-proven authoring speedup specific to Premo and the LibEE codebase, so it should be read as an internal engineering result rather than a generalized benchmark.
- Clarity
- Accessible to engine and pipeline readers; a single pass conveys the authoring-versus-evaluation split that motivates the redesign.
- How to read it
- Read once for the dual-representation design and why decoupling authoring from evaluation unlocks fast edits; valuable if you build or maintain a rig-evaluation graph, no formal second pass needed.
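The authoring-versus-evaluation split can be pictured with a small sketch: an editable node dictionary on one side and a flat, topologically ordered task list on the other, rebuilt lazily after an edit. This is one plausible reading of a dual-representation design, written for illustration; the class name `AuthoringGraph` and its methods are hypothetical and do not reflect LibEE's actual architecture or API.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class AuthoringGraph:
    """Editable node graph that compiles to a flat task list for evaluation."""

    def __init__(self):
        self.nodes = {}     # name -> (function, list of input node names)
        self._tasks = None  # cached evaluation schedule, rebuilt lazily

    def set_node(self, name, func, inputs=()):
        """Authoring-side edit: cheap, and it only invalidates the schedule."""
        self.nodes[name] = (func, list(inputs))
        self._tasks = None

    def compile(self):
        """Build the evaluation-side representation: tasks in dependency order."""
        deps = {name: set(inputs) for name, (_, inputs) in self.nodes.items()}
        order = TopologicalSorter(deps).static_order()
        self._tasks = [(name, *self.nodes[name]) for name in order]

    def evaluate(self):
        """Run the compiled tasks; recompile only if an edit happened."""
        if self._tasks is None:
            self.compile()
        values = {}
        for name, func, inputs in self._tasks:
            values[name] = func(*(values[i] for i in inputs))
        return values


graph = AuthoringGraph()
graph.set_node("angle", lambda: 30.0)
graph.set_node("double", lambda a: 2.0 * a, ["angle"])
graph.set_node("offset", lambda d: d + 5.0, ["double"])
first = graph.evaluate()["offset"]

graph.set_node("double", lambda a: 3.0 * a, ["angle"])  # edit the rig
second = graph.evaluate()["offset"]
```

The point of the split is that edits touch only the authoring dictionary, while repeated evaluation reuses the compiled schedule until the next edit.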
Rigging
-
Presented the pipeline behind the MEETMIKE real-time avatar: performance capture, solver, and photorealistic rendering in a live interactive context.
Facial / Retargeting
- Mobilizing Mocap, Motion Blending, and Mayhem: Rig Interoperability for Crowd Simulation on Incredibles 2 SIGGRAPH Pixar 3 cites
Pixar approximates complex Presto crowd rigs with skinned skeletons to enable mocap, motion blending, and ragdoll physics interchange for the Incredibles 2 crowd pipeline.
Abstract
The stylized world of Incredibles 2 features large urban crowds both in everyday situations and in scenes of panicked mayhem. While Pixar's now academy award winning animation software, Presto, has allowed us to create expressive and nuanced rigs for our crowd characters, our proprietary approach has made it difficult to utilize animation from external sources, such as crowd simulations or from motion capture. In this talk, we discuss how we can automatically approximate our complex rigs with skinned skeletons, as well as how this has opened up our crowd pipeline to procedural look-ats, motion blending, ragdoll physics, and motion capture. In particular, the use of motion capture is novel for Pixar, and finding a way to integrate this workflow into our animator-centric pipeline and culture has been an ongoing effort. The system we designed allows us to capture motion data for multiple characters in the context of complex shots in Presto, and it facilitates choreography of nuanced and specifically timed crowd motions. Together with traditional hand animated motion cycles, our crowd choreography tools in Presto [Arumugam et al., 2013], and skeletal agent based simulation in SideFX's Houdini [SideFX, [n. d.]] via our MURE tools [Gustafson et al., 2016], the crowds team on Incredibles 2 produced rich scenes of busy streets and urban panic.
Related Motion Retargeting for Crowd Simulation · Automatic Rigging and Animation of 3D Characters · Real-Time Weighted Pose-Space Deformation on the GPU · MoRig: Motion-Aware Rigging of Character Meshes from Point Clouds
How to read this
- Category
- Production talk / pipeline breakdown (rig interoperability for crowds)
- Contributions
-
- Automatically approximates complex proprietary Presto crowd rigs with skinned skeletons
- Opens the crowd pipeline to procedural look-ats, motion blending, ragdoll physics, and motion capture
- Integrates multi-character mocap capture and crowd choreography into Pixar's animator-centric workflow
- Context
- Relates to Pixar's crowd and scene-description tooling (the talk references the Universal Scene Description open-source release), bridging the proprietary Presto rig to interchange formats used by external simulation and mocap sources. Builds on: Universal Scene Description: Open Source Release
- Correctness
- Studio practice, not peer-reviewed; results are production-proven on Incredibles 2, so a reader should treat the skinned-skeleton approximation as a pragmatic interchange device rather than a general accuracy guarantee.
- Clarity
- Accessible and narrative; a single first pass conveys the pipeline goals and the interchange idea.
- How to read it
- Read once for the workflow motivation and the approximate-rig-as-skeleton trick; no formulation to chase, so focus on which interchange capabilities (look-ats, blending, ragdoll, mocap) each step unlocks.
Rigging / Skinning
-
Mode-adaptive neural network with learned gating for quadruped locomotion control handling diverse gaits and transitions.
Abstract
Quadruped motion includes a wide variation of gaits such as walk, pace, trot and canter, and actions such as jumping, sitting, turning and idling. Applying existing data-driven character control frameworks to such data requires a significant amount of data preprocessing such as motion labeling and alignment. In this paper, we propose a novel neural network architecture called Mode-Adaptive Neural Networks for controlling quadruped characters. The system is composed of the motion prediction network and the gating network. At each frame, the motion prediction network computes the character state in the current frame given the state in the previous frame and the user-provided control signals. The gating network dynamically updates the weights of the motion prediction network by selecting and blending what we call the expert weights, each of which specializes in a particular movement. Due to the increased flexibility, the system can learn consistent expert weights across a wide range of non-periodic/periodic actions, from unstructured motion capture data, in an end-to-end fashion. In addition, the users are released from performing complex labeling of phases in different gaits. We show that this architecture is suitable for encoding the multi-modality of quadruped locomotion and synthesizing responsive motion in real-time.
Related Neural State Machine for Character-Scene Interactions · Near-Optimal Character Animation with Continuous Control · Local Motion Phases for Learning Multi-Contact Character Movements · Character Motion Synthesis by Topology Coordinates
How to read this
- Category
- Method: a data-driven neural controller for character locomotion
- Contributions
-
- Introduces Mode-Adaptive Neural Networks for quadruped motion control
- Pairs a motion-prediction network with a gating network that blends specialized expert weights per frame
- Learns from unstructured mocap end-to-end, removing manual gait labeling and phase alignment
- Context
- Builds on phase-based neural character control (Phase-Functioned Neural Networks), generalizing the periodic phase mechanism into a learned gating over experts for non-periodic and periodic quadruped actions. Builds on: Phase-Functioned Neural Networks for Character Control
- Correctness
- Demonstrated on quadruped locomotion across varied gaits and actions learned from mocap; a reader should note that the gating/expert capacity and training-data coverage bound which motions and transitions can be reproduced.
- Clarity
- Moderately technical; a first pass conveys the gating-of-experts idea, and a second pass is needed for the network formulation and training setup.
- How to read it
- First pass for the gating-network-blends-experts concept versus a fixed phase function; do a second pass on the architecture and loss if you intend to implement or adapt it to other morphologies.
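The gating-blends-experts mechanism described in the abstract can be sketched in a few lines: a gating step produces blend coefficients, the expert weight sets are mixed with those coefficients, and the mixed weights drive the motion prediction. In this minimal sketch a single linear layer stands in for each network, and the function names, shapes, and example values are assumptions for illustration, not the paper's architecture.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(w, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def blend_experts(alphas, experts):
    """Blend expert weight matrices: W = sum_k alpha_k * W_k."""
    rows, cols = len(experts[0]), len(experts[0][0])
    return [
        [sum(a * w[r][c] for a, w in zip(alphas, experts)) for c in range(cols)]
        for r in range(rows)
    ]


def predict(state, gating_features, gating_weights, experts):
    """One frame: the gating output mixes the experts, the mix maps the state."""
    alphas = softmax(matvec(gating_weights, gating_features))
    return matvec(blend_experts(alphas, experts), state), alphas


# Two hypothetical experts: identity and a doubling map.
EXPERTS = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[2.0, 0.0], [0.0, 2.0]],
]
GATING_WEIGHTS = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical gating layer

next_state, alphas = predict([2.0, 4.0], [0.0, 0.0], GATING_WEIGHTS, EXPERTS)
```

With equal gating scores the two experts are averaged; changing the gating features shifts the blend toward one expert, which is the role a fixed phase function plays in the phase-based predecessor.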
Motion Synthesis
-
Follow-up GDC bootcamp session expanding motion matching with advanced trajectory control and production implementation strategies.
Motion Synthesis
-
Optimal keyframe selection algorithm for motion capture data that minimizes reconstruction error while enabling interactive artist-driven control.
Retargeting
-
Surveyed ILM's multi-decade evolution in photoreal character and digital human creation, from early tests through modern deep-learning-assisted pipelines.
Facial / Skinning
-
Transfers a desired edge-loop patch layout from a reference mesh to deformed meshes via decal-map-weighted relaxation, used on Bao and Incredibles 2.
Abstract
From rigging to post-simulation cleanups, surface relaxation is a widely used procedure in feature animation. Over the years, Pixar has experimented with several techniques for this task, mostly based on variants of Laplacian smoothing. Notably, none of the existing approaches are suited to reproduce the patch layout of a baseline mesh. This is of particular interest for modeling the span of edge flows, or for restoring the rest configuration of a mesh under large deformations. To achieve this goal, we developed a new patch-aware relaxation method for general polygonal meshes. Our approach encompasses three main contributions. We first introduce a weighting scheme that uses local decal maps to encode the structure of edge flows formed by the desirable patch layout. We then propose an update rule that transfers a reference patch arrangement to a deformed mesh. To control volume preservation, we also present a surface-constrained regime that exploits decal maps to slide points within the surface. We demonstrate the effectiveness and versatility of our tool with a series of examples from Pixar's short Bao and feature film Incredibles 2.
Related Implicit Skinning: Real-Time Skin Deformation with Contact Modeling · Delta Mush: Smoothing Deformations While Preserving Detail · Harmonic Coordinates for Character Articulation · Automatic Rigging and Animation of 3D Characters
How to read this
- Category
- Method: a patch-aware mesh surface relaxation technique
- Contributions
-
- A weighting scheme using local decal maps to encode the desired patch layout and edge flows
- An update rule that transfers a reference patch arrangement onto a deformed mesh
- A surface-constrained regime exploiting decal maps to slide points for volume preservation
- Context
- Extends classic Laplacian-style smoothing used in feature animation, adding patch-layout awareness that prior relaxation variants did not reproduce.
- Correctness
- Demonstrated on examples from Pixar's Bao and Incredibles 2; as a production-motivated tool, its effectiveness depends on having a meaningful reference patch layout and well-authored decal maps, which a reader should keep in mind.
- Clarity
- Reasonably accessible if you know Laplacian smoothing; a first pass conveys the patch-transfer goal, a second pass clarifies the weighting and update rules.
- How to read it
- First pass for the motivation (reproducing edge-loop layout under deformation) and the three contributions; do a second pass on the decal-map weighting and update rule if you need to reimplement the relaxation.
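For orientation, the Laplacian-style baseline that this entry says the method extends can be sketched as a weighted relaxation step: each free point moves toward the weighted average of its neighbors. The per-edge `weights` here are a plain placeholder for whatever a decal-map weighting would supply; the paper's actual weighting scheme, update rule, and surface constraint are not reproduced, and all names and values are assumptions.

```python
def relax(points, neighbors, weights, pinned, step=0.5, iterations=1):
    """Weighted Laplacian relaxation of free points toward their neighbor average."""
    pts = [tuple(p) for p in points]
    for _ in range(iterations):
        new_pts = list(pts)
        for i, nbrs in neighbors.items():
            if i in pinned or not nbrs:
                continue  # pinned or isolated points stay where they are
            dim = len(pts[i])
            total = sum(weights[(i, j)] for j in nbrs)
            avg = [
                sum(weights[(i, j)] * pts[j][k] for j in nbrs) / total
                for k in range(dim)
            ]
            new_pts[i] = tuple(
                (1.0 - step) * pts[i][k] + step * avg[k] for k in range(dim)
            )
        pts = new_pts
    return pts


# A three-point polyline with both ends pinned (hypothetical data).
POINTS = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
NEIGHBORS = {1: [0, 2]}
PINNED = {0, 2}

uniform = relax(POINTS, NEIGHBORS, {(1, 0): 1.0, (1, 2): 1.0}, PINNED, step=1.0)
biased = relax(POINTS, NEIGHBORS, {(1, 0): 3.0, (1, 2): 1.0}, PINNED, step=1.0)
```

Uniform weights pull the middle point to the midpoint of its neighbors, while biased weights slide it toward the more heavily weighted neighbor. That sensitivity to weights is the lever a layout-aware weighting would use to steer edge flow.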
Rigging / Skinning
-
A physically-based model of the sticky-lip effect for facial animation, using total Lagrangian explicit dynamics FEM with a new breaking element for the saliva, so the lips stick and separate realistically as the mouth opens.
Abstract
A novel solution for the sticky lip problem in computer facial animation, recreating the way the lips stick together when drawn apart in speech or in the formation of facial expressions. Where traditional approaches rely on an artist estimating the behaviour, this presents a physically-based model: the mouth is modelled with the total Lagrangian explicit dynamics finite element method, with a new breaking element modelling the saliva between the lips. Subtle yet complex behaviours are recreated implicitly, reproducing varying degrees of stickiness between the lips as well as asymmetric effects.
Related Realtime Performance-Driven Physical Simulation for Facial Animation · Direct Manipulation Blendshapes · Realtime Performance-Based Facial Animation · High Fidelity Facial Animation Capture and Retargeting with Contours
How to read this
- Category
- Method: a physically-based facial animation model (sticky lips)
- Contributions
-
- A physically-based model of the sticky-lip effect, replacing artist-estimated behaviour
- Models the mouth with total Lagrangian explicit dynamics FEM
- Introduces a new breaking element representing saliva so lips stick and separate realistically
- Context
- Sits within physically-based facial animation and FEM soft-tissue simulation, contrasting with traditional hand-keyed or blendshape estimates of lip stickiness.
- Correctness
- Validated qualitatively by recreating subtle behaviours such as varying stickiness and asymmetric separation; a reader should note the effect rests on the breaking-element model of saliva and on the explicit-dynamics FEM parameterization rather than measured material data.
- Clarity
- Accessible in motivation but technical in the FEM detail; a first pass conveys the idea, a second pass is needed for the breaking-element formulation.
- How to read it
- First pass for the problem and the breaking-element idea; do a second pass on the total Lagrangian explicit dynamics setup and the breaking criterion if you want to reproduce the stickiness behaviour.
Facial
-
EA Frostbite explains driven-ragdoll technology blending physics reactivity with animation adherence, covering performance optimisation, multiplayer synchronisation, and emergent procedural motion generation.
Rigging / Muscles
-
Physics-based character controller that imitates motion capture reference with deep reinforcement learning, producing physically plausible results.
Abstract
We introduce a deep reinforcement learning method that learns to control articulated humanoid bodies to imitate given target motions closely when simulated in a physics simulator. The target motion, which may not have been seen by the agent and can be noisy, is supplied at runtime. Our method can recover balance from moderate external disturbances and keep imitating the target motion. When subjected to large disturbances that cause the humanoid to fall down, our method can control the character to get up and recover to track the motion. Our method is trained to imitate the mocap clips from the CMU motion capture database and a number of other publicly available databases. We use a state-of-the-art deep reinforcement learning algorithm to learn to dynamically control the gain of PD controllers, whose target angles are derived from the mocap clip and to apply corrective torques with the goal of imitating the provided motion clip as closely as possible. Both the simulation and the learning algorithms are parallelized and run on the GPU. We demonstrate that the proposed method can control the character to imitate a wide variety of motions such as running, walking, dancing, jumping, kicking, punching, standing up, and so on.
Related SFV: Reinforcement Learning of Physical Skills from Video · DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills · C·ASE: Learning Conditional Adversarial Skill Embeddings for Physics-based Characters · DReCon: Data-Driven Responsive Control of Physics-Based Characters
How to read this
- Category
- Method: a physics-based motion-imitation controller via deep RL
- Contributions
-
- A deep RL method that controls articulated humanoids to closely imitate target motions in a physics simulator
- Handles runtime target motions that may be unseen or noisy, and recovers balance after disturbances including getting up from falls
- Learns to dynamically tune PD-controller gains and apply corrective torques, with GPU-parallelized simulation and training
- Context
- Builds on example-guided physics-based RL controllers (DeepMimic), trained to imitate clips from the CMU and other public mocap databases. Builds on: DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills
- Correctness
- Demonstrated across a wide variety of motions (running, walking, dancing, jumping, etc.) and disturbance recovery; a reader should keep in mind that imitation fidelity and recovery depend on the simulator, reward design, and the coverage of the training clips.
- Clarity
- Moderately technical; a first pass conveys the imitate-with-RL idea and the runtime-target feature, a second pass covers the state, reward, and gain-control details.
- How to read it
- First pass to grasp the runtime-supplied target and the PD-gain control; do a second pass on the reward and training pipeline, especially to contrast its design choices with DeepMimic.
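The PD-gain control mentioned above follows the standard proportional-derivative form, with the policy choosing the gains and an extra corrective torque each step. The sketch below shows that control law on a single joint with unit inertia; `fixed_gain_policy` is a hypothetical stand-in for the learned policy, and the gain values, time step, and integrator are assumptions chosen for the example rather than details from the paper.

```python
def pd_torque(q, qd, target, kp, kd, corrective=0.0):
    """PD control law plus an additive corrective torque."""
    return kp * (target - q) - kd * qd + corrective


def fixed_gain_policy(q, qd, target):
    """Hypothetical stand-in for the learned policy: (kp, kd, corrective torque)."""
    return 100.0, 20.0, 0.0


def simulate(target, policy, steps=500, dt=0.01):
    """Track a target angle on one joint with unit inertia (semi-implicit Euler)."""
    q, qd = 0.0, 0.0
    for _ in range(steps):
        kp, kd, corrective = policy(q, qd, target)
        torque = pd_torque(q, qd, target, kp, kd, corrective)
        qd += dt * torque  # acceleration equals torque for unit inertia
        q += dt * qd
    return q


final_angle = simulate(1.0, fixed_gain_policy)
```

In the method described by the entry, the target angle comes from the mocap clip and the policy varies the gains and corrective torque per step; here both are held fixed so the joint simply settles near the target.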
Motion Synthesis / Retargeting
-
Projective Skinning is a physics-based character skinning approach that overcomes well-known artifacts of geometric skinning while maintaining real-time performance, enabling dynamic effects and resolving local self-collisions.
Abstract
Projective Skinning is a physics-based character skinning approach that overcomes well-known artifacts of geometric skinning while maintaining real-time performance, enabling dynamic effects and resolving local self-collisions. It needs neither skinning weights nor a complex volumetric tessellation, requiring just a triangle mesh and a skeleton as input. The method builds a two-layer model of rigid bones and an elastic soft tissue layer that is computed efficiently from the input surface mesh and is solved with projective dynamics.
Related Efficient and Robust Skin Slide Simulation · Efficient Elasticity for Character Skinning with Contact and Collisions · Segmentation-Based Skinning · Efficient Dynamic Skinning with Low-Rank Helper Bone Controllers
How to read this
- Category
- Method: a physics-based real-time character skinning algorithm
- Contributions
-
- A physics-based skinning approach that avoids common geometric-skinning artifacts at real-time rates
- Builds a two-layer model of rigid bones plus an elastic soft-tissue layer directly from a triangle mesh and skeleton
- Requires no skinning weights or volumetric tessellation, and resolves local self-collisions while enabling dynamic effects
- Context
- Builds on Projective Dynamics, applying its fast constraint-projection solver to the skinning problem as an alternative to linear-blend / geometric skinning. Builds on: Projective Dynamics: Fusing Constraint Projections for Fast Simulation
- Correctness
- Demonstrated as real-time with dynamic effects and local self-collision resolution; a reader should note that only local self-collisions are resolved and that the soft-tissue layer is derived from the surface mesh, so global collisions and material realism are not the focus.
- Clarity
- Accessible idea with a concise model description; a first pass conveys the two-layer concept, a second pass is needed for the projective-dynamics solve.
- How to read it
- First pass for the no-weights, two-layer (bone + soft tissue) framing and why it beats geometric skinning; do a second pass on the projective-dynamics constraints if you plan to implement it.
Skinning / CFX
-
MPC Asset Supervisor details creation of the photoreal digital Rachael character, presenting techniques for crossing the uncanny valley on a human CG face.
Facial
-
Addresses Incredibles 2 skin simulation using 2D ray-tracing over mesh surfaces for fast, robust secondary motion on superhero characters.
Abstract
Robustly simulating the dynamics of skin sliding over a character's body is an ongoing challenge. Skin can become non-physically "snagged" in curved or creased regions, such as armpits, and create unusable results. These problems usually arise when it becomes ambiguous which kinematic surface the skin should be sliding along. We have found that many of these problems can be addressed by performing 2D ray-tracing over the surface of the mesh. The approach is fast and robust, and has been used successfully in Incredibles 2.
Related Simulating Cloth Using Bilinear Elements · Smeat: ADMM Based Tools for Character Deformation · Discrete Shells · Creating a Photorealistic Hyena
How to read this
- Category
- Production talk / technique (robust skin simulation)
- Contributions
-
- Identifies non-physical snagging of skin in curved or creased regions (e.g. armpits) as the core failure
- Addresses it by performing 2D ray-tracing over the surface of the mesh to disambiguate the sliding surface
- Delivers a fast, robust skin-sliding approach used in production on Incredibles 2
- Context
- Relates to character skinning with contact and collisions (the talk references the efficient-elasticity skinning work), targeting the secondary skin-sliding motion on stylized superhero characters. Builds on: Efficient Elasticity for Character Skinning with Contact and Collisions
- Correctness
- Studio practice, not peer-reviewed; results are production-proven on Incredibles 2, and the 2D surface ray-tracing is presented as a robust heuristic for the snagging ambiguity rather than a fully general dynamics solution.
- Clarity
- Accessible and problem-focused; a single first pass conveys the snagging issue and the ray-tracing remedy.
- How to read it
- Read once for the failure case (ambiguous sliding surface) and the 2D-ray-tracing fix; focus on when the ambiguity arises rather than expecting a formal derivation.
CFX / Muscles
-
Physics-based character learns to imitate athletic motions reconstructed from monocular video using deep RL.
Abstract
Data-driven character animation based on motion capture can produce highly naturalistic behaviors and, when combined with physics simulation, can provide for natural procedural responses to physical perturbations, environmental changes, and morphological discrepancies. Motion capture remains the most popular source of motion data, but collecting mocap data typically requires heavily instrumented environments and actors. In this paper, we propose a method that enables physically simulated characters to learn skills from videos (SFV). Our approach, based on deep pose estimation and deep reinforcement learning, allows data-driven animation to leverage the abundance of publicly available video clips from the web, such as those from YouTube. This has the potential to enable fast and easy design of character controllers simply by querying for video recordings of the desired behavior. The resulting controllers are robust to perturbations, can be adapted to new settings, can perform basic object interactions, and can be retargeted to new morphologies via reinforcement learning. We further demonstrate that our method can predict potential human motions from still images, by forward simulation of learned controllers initialized from the observed pose. Our framework is able to learn a broad range of dynamic skills, including locomotion, acrobatics, and martial arts.
Related DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills · Physics-based Motion Capture Imitation with Deep Reinforcement Learning · C·ASE: Learning Conditional Adversarial Skill Embeddings for Physics-based Characters · ControlVAE: Model-Based Learning of Generative Controllers for Physics-Based Characters
How to read this
- Category
- Method: learning physics-based character skills from video (SFV)
- Contributions
-
- Enables physically simulated characters to learn skills from ordinary monocular video clips
- Combines deep pose estimation with deep reinforcement learning to imitate the reconstructed motion
- Produces controllers robust to perturbations, adaptable to new settings and object interactions, and retargetable to new morphologies; can also predict motion from still images
- Context
- Builds on example-guided physics-based RL (DeepMimic) and deep human pose estimation, replacing mocap reference with motion reconstructed from web video. Builds on: DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills
- Correctness
- Demonstrated on athletic skills reconstructed from video and on perturbation robustness; a reader should keep in mind that quality is bounded by the monocular pose-estimation accuracy and by the RL reward and simulation, so noisy or ambiguous video limits the learned skill.
- Clarity
- Accessible high-level pipeline with technical depth in the RL stage; a first pass conveys the video-to-controller idea, a second pass covers pose-estimation and reward details.
- How to read it
- First pass for the video -> pose -> RL-imitation pipeline and what it unlocks; do a second pass on how pose-estimation noise is handled and how the reward relates to DeepMimic if you want to build on it.
Motion Synthesis
-
ADMM-based creature tools for muscle, skin, and fatty tissue simulation achieving anatomically accurate dynamic deformation, demonstrated on Avengers: Infinity War characters.
Abstract
Recent work on physical simulation in computer graphics has focused on energy minimization formulations of dynamics based on fast optimization methods. Because these methods can efficiently tackle a wide range of problems in the realms of both dynamic simulation and static relaxation, we adopted one such method, ADMM, and used it to implement a set of creature tools we call Smeat. We will describe this tool set and give a case study of how it was used to create character effects on "Avengers: Infinity War".
Related Muscle and Fascia Simulation with Extended Position Based Dynamics · Clean Cloth Inputs: Removing Character Self-Intersections with Volume Simulation · Finding Hank · Efficient and Robust Skin Slide Simulation
How to read this
- Category
- Production talk / tools (ADMM-based creature deformation)
- Contributions
-
- Adopts ADMM optimization to build a creature tool set (Smeat) spanning dynamic simulation and static relaxation
- Simulates muscle, skin, and fatty tissue for anatomically driven dynamic deformation
- Presents a case study of creating character effects on Avengers: Infinity War
- Context
- Relates to energy-minimization / fast-optimization approaches to physical simulation (in the lineage of Projective Dynamics), applying the ADMM solver to muscle, skin, and fat tissue. Builds on: Projective Dynamics: Fusing Constraint Projections for Fast Simulation
- Correctness
- Studio practice, not peer-reviewed; results are production-proven via the Infinity War case study, so the anatomical accuracy claim should be read as art-directed plausibility rather than measured biomechanics.
- Clarity
- Accessible as a tools-and-case-study briefing; a first pass conveys what the tool set does, with the ADMM formulation only sketched.
- How to read it
- Read once for the tool scope (muscle/skin/fat) and why ADMM suits both dynamics and static relaxation; consult the underlying ADMM/Projective Dynamics literature separately if you need the math.
Muscles / CFX
-
Retargeting method preserving spatial relationships between body surface vertices rather than only skeletal joints.
Abstract
Retargeting motion from one character to another is a key process in computer animation. It enables animations designed for one character to be reused on another, or performance-driven animation to remain faithful to what the user performed. Previous work mainly focused on retargeting skeleton animations, whereas the contextual meaning of the motion is mainly linked to the relationship between body surfaces, such as the contact of the palm with the belly. In this paper we propose a new context-aware motion retargeting framework, based on deforming a target character to mimic a source character's poses using harmonic mapping. We also introduce the idea of Context Graph: modeling local interactions between surfaces of the source character, to be preserved in the target character, in order to ensure fidelity of the pose. In this approach, no rigging is required as we directly manipulate the surfaces, which makes the process totally automatic. Our results demonstrate the relevance of this automatic rigging-less approach on motions with complex contacts and interactions between the character's surfaces.
Related Articulated Mesh Animation from Multi-view Silhouettes · Laplacian Surface Editing · Animation Setup Transfer for 3D Characters · Deformation Transfer for Triangle Meshes
How to read this
- Category
- Method: a context-aware, surface-based motion retargeting framework
- Contributions
-
- Retargets motion by preserving spatial relationships between body surfaces rather than only skeletal joints
- Deforms the target character to mimic source poses via harmonic mapping, with no rigging required
- Introduces a Context Graph that models local surface interactions to preserve contacts in the target
- Context
- Builds on classic skeleton-based motion retargeting (Retargeting Motion to New Characters), shifting the objective from joint matching to surface-relationship and contact preservation. Builds on: Retargeting Motion to New Characters
- Correctness
- Demonstrated on motions with complex contacts and self-interactions; a reader should note the rigging-free, surface-driven approach depends on harmonic mapping quality and on the Context Graph capturing the relevant interactions, which may bound robustness on dissimilar body shapes.
- Clarity
- Accessible motivation (palm-to-belly contact example) with technical mapping detail; a first pass conveys the surface-over-skeleton idea, a second pass covers harmonic mapping and the Context Graph.
- How to read it
- First pass for why surface relationships beat joint-only retargeting and what the Context Graph encodes; do a second pass on the harmonic mapping and graph construction if you need to reproduce the contact preservation.
Retargeting / Skinning
-
Compared classical FACS/blend-shape facial pipelines against emerging neural-network approaches for character face deformation at production scale.
ML Deformation / Facial
-
SIGGRAPH Asia 2018 course demystifying USD composition, scene interoperability, and how Pixar's pipeline uses USD for asset interchange across feature film production.
Abstract
We want to start from scratch and explain what USD is and isn't.
Related Universal Scene Description: Open Source Release · A Deep Dive into Universal Scene Description and Hydra · Zero to USD in 80 Days · USD in Production
how to read this
- Category
- Course / tutorial on a scene-description standard
- Contributions
-
- Explains from first principles what USD is and is not
- Demystifies USD composition and scene interoperability for cross-studio asset interchange
- Grounds the material in how Pixar's feature-film pipeline uses USD
- Context
- An introductory SIGGRAPH course building directly on Pixar's open-source release of Universal Scene Description, framing it as the shared substrate for asset interchange in film production. Builds on: Universal Scene Description: Open Source Release
- Correctness
- Course material rather than a research result, so claims are pedagogical and reflect Pixar's own pipeline practice; a reader should treat it as authoritative on USD concepts but not as an independent evaluation across studios.
- Clarity
- Designed to be accessible from scratch; a single first pass conveys the conceptual model, with later passes useful for composition mechanics.
- How to read it
- Read as your USD primer: focus on the composition concepts and the what-it-is-and-isn't framing, then revisit specific composition operators when you need the mechanics.
Rigging
-
Describes how DreamWorks moved How to Train Your Dragon 3 to USD as its primary asset and shot representation within 80 working days, covering adoption strategy and implementation challenges.
abstract
This talk describes how DreamWorks adopted Pixar's Universal Scene Description (USD) as the primary asset and shot representation across its production pipeline, starting with How To Train Your Dragon 3. It recounts the motivation for the switch, driven by the new Moonray renderer whose objects had no representation in the legacy ASG format, and the strategy for integrating USD within a constrained 80 working day timeline. The authors review their methodology for planning the integration, present implementation details such as intermediate ASG-compatible pipelines and Alembic geometry caches, and discuss successes and challenges including mastering USD's LIVRPS composition scheme and adapting a partly procedural asset model to USD's declarative data model.
Related USD and Scene Interoperability: Demystifying the State of the Art · Forging a New Animation Pipeline with USD · USD in Production · Universal Scene Description: Open Source Release
how to read this
- Category
- Production talk / pipeline adoption case study
- Contributions
-
- Recounts DreamWorks adopting USD as primary asset and shot representation, starting on How To Train Your Dragon 3, within an 80 working day timeline
- Documents integration strategy including intermediate ASG-compatible pipelines and Alembic geometry caches
- Discusses successes and challenges, notably mastering USD's LIVRPS composition and fitting a partly procedural asset model into USD's declarative data model
- Context
- A studio adoption story for Pixar's Universal Scene Description, motivated by the new Moonray renderer whose objects had no place in the legacy ASG format. Builds on: Universal Scene Description: Open Source Release
- Correctness
- Studio practice, not peer-reviewed; the experience is specific to DreamWorks' tooling, legacy format, and one film, so transfer to other pipelines is suggestive rather than guaranteed.
- Clarity
- Narrative and accessible; one pass conveys the adoption strategy, with a second pass useful for the LIVRPS and procedural-to-declarative details.
- How to read it
- Read for the migration playbook: focus on the timeline strategy, the intermediate-pipeline bridging, and the LIVRPS and procedural-model pain points if you are planning a USD transition.
Rigging
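The LIVRPS composition scheme named in the entry above is an ordering of composition arcs by strength: Local, Inherits, VariantSets, References, Payloads, Specializes. The following is a toy sketch of that ordering only, assuming a single attribute with at most one opinion per arc. It does not use the USD API, and `resolve` and the `opinions` dictionary are hypothetical names introduced for illustration; real composition also recurses through layer stacks and nested arcs.

```python
# Strength order spelled out by the LIVRPS acronym, strongest first.
LIVRPS = ("local", "inherits", "variantSets", "references", "payloads", "specializes")


def resolve(opinions):
    """Return (arc, value) for the strongest arc that authors an opinion.

    `opinions` is a hypothetical dict mapping an arc name to the value
    that arc contributes for one attribute. This is not a USD data type.
    """
    for arc in LIVRPS:
        if arc in opinions:
            return arc, opinions[arc]
    return None, None  # no arc has an opinion


# A value authored in a variant wins over one brought in by a reference.
radius_opinions = {"references": 1.0, "variantSets": 2.5, "specializes": 0.5}
print(resolve(radius_opinions))
```

The design point the sketch makes is that the winner depends on which arc carries the opinion, not on authoring order, which is one reason a partly procedural asset model can be awkward to express declaratively.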
2017
33
-
Takes four-view hair photos and estimates a 3D direction field to grow dense strands; allows mixing views from different hairstyles for flexible modeling.
abstract
We introduce a novel four-view image-based hair modeling method. Given four hair images taken from the front, back, left and right views as input, we first estimate the rough 3D shape of the hair observed in the input using a predefined database of 3D hair models, then synthesize a hair texture on the surface of the shape, from which the hair growing direction information is calculated and used to construct a 3D direction field in the hair volume. Finally, we grow hair strands from the scalp, following the direction field, to produce the 3D hair model, which closely resembles the hair in all input images. Our method does not require that all input images are from the same hair, enabling an effective way to create compelling hair models from images of considerably different hairstyles at different views. We demonstrate the efficacy of our method using a wide range of examples.
Related Single-View Hair Modeling Using a Hairstyle Database · SMPLicit: Topology-aware Generative Model for Clothed People · SwinGar: Spectrum-Inspired Neural Dynamic Deformation for Free-Swinging Garments · SCANimate: Weakly Supervised Learning of Skinned Clothed Avatar Networks
how to read this
- Category
- Method: image-based 3D hair modeling from multiple views
- Contributions
-
- A four-view (front, back, left, right) image-based hair modeling pipeline
- Estimates rough 3D hair shape from a predefined database, synthesizes a surface hair texture, and builds a 3D direction field used to grow dense strands from the scalp
- Allows mixing input views from different hairstyles, enabling hair models assembled from considerably different references
- Context
- Builds on single-image and database-driven hair modeling (Chai et al., AutoHair), extending capture from one image to a four-view setup with a direction-field strand-growth stage. Builds on: AutoHair: Fully Automatic Hair Modeling from a Single Image
- Correctness
- Demonstrated qualitatively across a range of examples; quality depends on the coverage of the 3D hair database and on consistent direction-field estimation from the photos, and the four-view input requirement limits fully casual capture.
- Clarity
- Pipeline is accessible; a first pass conveys the shape-then-texture-then-direction-field-then-grow flow, with stage details in later passes.
- How to read it
- First pass for the four-view pipeline and the cross-hairstyle mixing capability; second pass on direction-field construction and strand growth if you work on hair capture.
CFX
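The final stage of the four-view hair entry above, growing strands from the scalp along a 3D direction field, can be sketched as a simple forward integration. This is a minimal sketch under stated assumptions: the field is given as a Python callable rather than the paper's volumetric field, the step is a fixed-length Euler step, and `grow_strand`, `direction_at`, and `toy_field` are hypothetical names.

```python
import numpy as np


def grow_strand(root, direction_at, step=0.1, num_segments=20):
    """Trace one strand from a scalp root by following a direction field.

    `direction_at(p)` is an assumed callable returning the (unnormalized)
    growth direction at position p.
    """
    points = [np.asarray(root, dtype=float)]
    for _ in range(num_segments):
        d = np.asarray(direction_at(points[-1]), dtype=float)
        length = np.linalg.norm(d)
        if length < 1e-8:
            break  # no usable direction here, so stop growing
        points.append(points[-1] + step * d / length)
    return np.array(points)


def toy_field(p):
    # Hypothetical field: hair flows along +x and droops more as x grows.
    return np.array([1.0, -p[0], 0.0])


strand = grow_strand(root=[0.0, 0.0, 0.0], direction_at=toy_field)
print(strand.shape)
```

Normalizing the direction before stepping keeps segment lengths uniform, so strand length is controlled by `step * num_segments` rather than by the field's magnitude.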
-
This work extends microfacet theory to model iridescent reflections produced by thin films of varying thickness layered on top of an arbitrarily rough base surface.
abstract
This work extends microfacet theory to model iridescent reflections produced by thin films of varying thickness layered on top of an arbitrarily rough base surface. The material is the first to produce a consistent appearance between tristimulus (RGB) and spectral rendering engines by analytically pre-integrating its spectral response. The extension applies to any microfacet-based model, covering reflection over dielectrics or conductors as well as transmission through dielectrics, making it suitable for surfaces such as oil films, soap bubbles, and structurally colored feathers.
Related Appearance Modeling of Iridescent Feathers with Diverse Nanostructures · Rendering Iridescent Rock Dove Neck Feathers · Microstructure-based Appearance Rendering for Feathers · A Surface-based Appearance Model for Pennaceous Feathers
how to read this
- Category
- Method: an appearance model extending microfacet theory for iridescence
- Contributions
-
- Extends microfacet theory to model iridescent reflections from thin films of varying thickness over an arbitrarily rough base surface
- Analytically pre-integrates the spectral response to give consistent appearance between tristimulus (RGB) and spectral rendering engines
- Applies to any microfacet-based model, covering reflection over dielectrics or conductors and transmission through dielectrics (oil films, soap bubbles, structurally colored feathers)
- Context
- Sits in the microfacet BRDF / thin-film interference lineage, adding a varying-thickness iridescence layer compatible with existing microfacet models.
- Correctness
- Presented as a practical extension validated on representative iridescent surfaces; its central claim is RGB-spectral consistency via pre-integration, an approximation whose fidelity at extreme roughness or thickness ranges a careful reader should still check.
- Clarity
- Practically framed but physically dense; a first pass conveys scope and the RGB/spectral consistency goal, the integration math needs a second pass.
- How to read it
- First pass for what it adds to microfacet models and the pre-integration motivation; do a focused second pass on the spectral derivation if you implement it in a renderer.
CFX
- talk Animation Bootcamp: Uncharted 4: Naughty Dog's Animation Workflow and Animation Prototyping GDC Industrial
Naughty Dog lead animators cover Uncharted 4's full animation pipeline from previs through motion capture and final polish, plus rapid animation prototyping to accelerate design decisions.
Retargeting
-
This paper introduces an anisotropic elastoplastic constitutive model that resolves frictional contact for codimensional materials such as cloth, knit, and hair within a continuum simulation.
abstract
This paper introduces an anisotropic elastoplastic constitutive model that resolves frictional contact for codimensional materials such as cloth, knit, and hair within a continuum simulation. The model is discretized with the Material Point Method using a novel return mapping and update of the deformation gradient that enforces Coulomb friction across the codimensional manifold. The approach handles dense self-contact and inter-object contact, simulating scenarios with up to roughly one million degrees of freedom.
Related An Implicit Frictional Contact Solver for Adaptive Cloth Simulation · Gravity Preloading for Maintaining Hair Shape Using the Simulator as a Closed-Box Function · Scriptable Character FX Solution · Codimensional Incremental Potential Contact
how to read this
- Category
- Method: an anisotropic elastoplastic model for codimensional frictional contact
- Contributions
-
- An anisotropic elastoplastic constitutive model resolving frictional contact for codimensional materials (cloth, knit, hair) within a continuum simulation
- A Material Point Method discretization with a novel return mapping and deformation-gradient update that enforces Coulomb friction across the codimensional manifold
- Handles dense self-contact and inter-object contact, simulating scenarios up to roughly one million degrees of freedom
- Context
- Extends continuum (MPM) simulation to thin and yarn-level structures, relating to yarn-level cloth modeling (Cirio et al., Yarn-Level Simulation of Woven Cloth) by treating contact via plasticity rather than explicit collision handling. Builds on: Yarn-Level Simulation of Woven Cloth
- Correctness
- Demonstrated on dense-contact cloth/knit/hair scenes at large degree-of-freedom counts; friction is modeled through an elastoplastic return mapping (an approximation of Coulomb contact), so physical accuracy and compute cost at scale are points a reader should weigh.
- Clarity
- Mathematically heavy; a first pass conveys the codimensional-friction-via-plasticity idea, with the constitutive model and return mapping needing later passes.
- How to read it
- First pass for the high-level idea of encoding frictional contact as anisotropic plasticity in MPM; reserve second and third passes for the return mapping and deformation-gradient update if you work in continuum simulation.
CFX
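The Coulomb friction law that the entry above enforces can be illustrated at the level of a single contact force. The sketch below is a swapped-in simplification, not the paper's method: the paper applies a return mapping to the deformation gradient inside an MPM solver, whereas this projects one contact force onto the friction cone. The function name and the convention that a positive normal force means compression are assumptions.

```python
import numpy as np


def project_to_friction_cone(normal_force, tangent_force, mu):
    """Clamp a contact force so it satisfies Coulomb's law |f_t| <= mu * f_n.

    Assumes `normal_force` > 0 means the surfaces press together.
    """
    tangent_force = np.asarray(tangent_force, dtype=float)
    if normal_force <= 0.0:
        # Surfaces are separating: no contact force at all.
        return 0.0, np.zeros_like(tangent_force)
    limit = mu * normal_force
    magnitude = np.linalg.norm(tangent_force)
    if magnitude <= limit:
        return normal_force, tangent_force  # sticking: inside the cone
    # Sliding: keep the direction, shrink the magnitude onto the cone.
    return normal_force, tangent_force * (limit / magnitude)


fn, ft = project_to_friction_cone(10.0, [3.0, 4.0, 0.0], mu=0.2)
print(fn, ft)
```

The three branches correspond to separation, stick, and slip. A plasticity-style return mapping plays the analogous role of pulling an inadmissible trial state back onto the admissible set.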
- Audio-Driven Facial Animation by Joint End-to-End Learning of Pose and Emotion SIGGRAPH Industrial 481 cites
End-to-end deep learning system for speech-driven facial animation that maps audio to face-mesh vertex positions while jointly learning a latent code for emotional state.
abstract
We present a machine learning technique for driving 3D facial animation by audio input in real time and with low latency. Our deep neural network learns a mapping from input waveforms to the 3D vertex coordinates of a face model, and simultaneously discovers a compact, latent code that disambiguates the variations in facial expression that cannot be explained by the audio alone. During inference, the latent code can be used as an intuitive control for the emotional state of the face puppet. We train our network with 3 to 5 minutes of high-quality animation data obtained using traditional, vision-based performance capture methods. Even though our primary goal is to model the speaking style of a single actor, our model yields reasonable results even when driven with audio from other speakers with different gender, accent, or language, as we demonstrate with a user study. The results are applicable to in-game dialogue, low-cost localization, virtual reality avatars, and telepresence.
Related Production-Level Facial Performance Capture Using Deep Convolutional Neural Networks · Audiovisual Inputs for Learning Robust, Real-Time Facial Animation with Lip Sync · AI in Maya: Autodesk CEO and Animation Product Manager Demo MotionMaker, FaceAnimator and More · Capture, Learning, and Synthesis of 3D Speaking Styles
how to read this
- Category
- Method: a deep-learning audio-to-face animation system
- Contributions
-
- A neural network mapping raw audio waveforms to 3D face-mesh vertex positions in real time at low latency
- A jointly learned latent emotion code that disambiguates expression variation not explained by audio and serves as an intuitive control at inference
- Demonstrated generalization to other speakers (gender, accent, language) from only 3 to 5 minutes of training animation
- Context
- Builds on vision-based performance capture used to source training data, in the lineage of Laine et al.'s Production-Level Facial Performance Capture Using Deep CNNs, applying learning to the audio-to-geometry direction. Builds on: Production-Level Facial Performance Capture Using Deep Convolutional Neural Networks
- Correctness
- Validated for a single actor's speaking style on a small (3 to 5 minute) high-quality dataset with a user study on cross-speaker driving; readers should keep in mind it targets one actor's style and that cross-speaker results are described as reasonable rather than perfect.
- Clarity
- Accessible; a first pass conveys the joint pose/emotion idea, do a second pass for the network and latent-code formulation.
- How to read it
- Focus on how the latent emotion code is learned and used as a control; a second pass pays off for the loss design and real-time inference path if you plan to reimplement.
Facial / Motion Synthesis
-
Official Autodesk feature demo of the Delta Mush deformer, which acts as a low-pass filter to remove skinning artifacts and restore rest-shape volume on deforming characters.
abstract
This Autodesk feature demo introduces the Delta Mush deformer in Maya 2016, which reduces the need to paint skin weights by calculating offsets (deltas) between the reference mesh and a smoothed version and reusing them to remove deformation artifacts. The presenter deliberately smooth binds a character with bad skin weights set to a max influence of 1, scrubs the animation to expose ugly deformation across the body, then layers on the Delta Mush deformer and raises its envelope from 0 to 1 to show the artifacts cleaning up. The talk then adjusts the iterations and displacement attributes to control how strongly the smoothed deltas are applied, demonstrating how an otherwise unusable skin bind can be smoothed into a clean deforming character.
Related Maya 2017 Update 3: Tension Deformer and Bake Deformer Tool · Segmentation-Based Skinning · Speed Up Animation Workflows With Maya's ML Deformer, Powered by Autodesk AI · Delta Mush: Smoothing Deformations While Preserving Detail
how to read this
- Category
- Production talk: a feature demo of the Delta Mush deformer
- Contributions
-
- Shows the Delta Mush deformer cleaning up artifacts from a deliberately bad smooth bind (max influence of 1) by raising its envelope from 0 to 1
- Demonstrates using stored offsets (deltas) between a reference mesh and a smoothed version to reduce skin-weight painting
- Walks through the iterations and displacement attributes to control how strongly the smoothed deltas are applied
- Context
- A vendor demo of the in-Maya implementation of Mancewicz et al.'s Delta Mush (Smoothing Deformations While Preserving Detail), presented as a practical low-pass filter for skinning artifacts. Builds on: Delta Mush: Smoothing Deformations While Preserving Detail
- Correctness
- Vendor feature demo, not peer-reviewed; the cleanup is shown on a single intentionally bad bind, so it illustrates typical behavior rather than validated limits, and very poor input or extreme poses may still need weight work.
- Clarity
- Very accessible; a single watch conveys the idea, no second pass needed beyond replicating the attribute settings.
- How to read it
- Watch for what the iterations and displacement attributes do and when Delta Mush substitutes for weight painting versus only masking issues; consult the source paper for the actual algorithm.
Skinning / ML Deformation
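The offset-and-reapply idea in the Delta Mush entry above can be sketched in a few lines. This is a minimal sketch, not Maya's implementation. It assumes uniform neighbor-average smoothing and keeps the deltas in world space, whereas the source paper expresses them in per-vertex local frames so that they rotate with the surface. The function names are hypothetical, and the `iterations` and `envelope` parameters are only stand-ins for the deformer attributes of the same name.

```python
import numpy as np


def smooth(points, neighbors, iterations):
    """Uniform Laplacian smoothing: move each vertex to its neighbor average."""
    pts = np.asarray(points, dtype=float).copy()
    for _ in range(iterations):
        pts = np.array([pts[n].mean(axis=0) for n in neighbors])
    return pts


def delta_mush(rest, deformed, neighbors, iterations=10, envelope=1.0):
    """Simplified Delta Mush with world-space deltas (an assumption).

    1. Store the detail lost when the rest mesh is smoothed (the deltas).
    2. Smooth the deformed mesh to remove skinning artifacts.
    3. Add the stored deltas back to restore rest-shape detail.
    """
    rest = np.asarray(rest, dtype=float)
    deformed = np.asarray(deformed, dtype=float)
    deltas = rest - smooth(rest, neighbors, iterations)
    mushed = smooth(deformed, neighbors, iterations) + deltas
    # envelope blends between the raw deformation (0) and the full effect (1).
    return deformed + envelope * (mushed - deformed)


# Four vertices in a ring; each lists its two neighbors by index.
ring = [[1, 3], [0, 2], [1, 3], [0, 2]]
rest = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0.3], [0, 1, 0]], dtype=float)
posed = rest + np.array([2.0, -1.0, 0.5])  # a pure translation
print(np.allclose(delta_mush(rest, posed, ring), posed))
```

Because the deltas are measured against the smoothed rest mesh, an undeformed or rigidly translated mesh passes through unchanged. Handling rotation correctly is what the local-frame encoding in the source paper is for.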
-
Dexter Studios presents ZENN (Zelos Node Network), a procedural grooming pipeline for fur, feathers, and scales used in commercial creature productions.
abstract
ZENN (Zelos Node Network) is a procedural grooming solution from Dexter Studios for the quick, easy, and art-directable creation of body coverings for digital creatures such as fur, feathers, and scales, and by extension forests, rocks, and landscapes for digital environments. Implemented as a Maya plug-in of modular nodes that pass strand data, it converts guide curves into internal strands bound to a mesh, then generates many more strands through sampling and interpolation schemes and applies modifiers for randomness, clumping, and frizz. The talk also describes binding and fake-dynamics options, feather and mesh instancing onto strands, and the caching and rendering pipeline used in production.
Related Creatures in Houdini | Ahmed Gharraph | FMX 2019 · Simulating the Perfect Groom for a Bovine Biker | Untold Studios | FMX HIVE 2023 · Hair, Feathers and Fur | Axis Studios | Character FX & Crowds Production Talks · Automation of Creature FX in a Small Studio Pipeline
how to read this
- Category
- Production talk / pipeline: a procedural grooming system
- Contributions
-
- ZENN, a modular Maya node-network that passes strand data to author fur, feathers, and scales (and by extension forests, rocks, landscapes)
- A guide-to-strand workflow that converts guide curves into internal strands bound to a mesh, then generates many more via sampling and interpolation, with randomness/clumping/frizz modifiers
- Production binding and fake-dynamics options plus feather and mesh instancing onto strands, with a described caching and rendering pipeline
- Context
- A studio (Dexter Studios) grooming solution in the lineage of arbitrary primitive generators such as Thompson et al.'s XGen, framed for art-directable creature coverings. Builds on: XGen: Arbitrary Primitive Generator
- Correctness
- Studio practice presented as used in commercial creature productions, not peer-reviewed; results are production-proven but the talk emphasizes workflow and art-directability over quantitative comparison.
- Clarity
- Accessible as a pipeline overview; a first pass conveys the node-network concept, a second pass helps if you want the specific interpolation and instancing schemes.
- How to read it
- Read for the node-graph architecture and the guide-curve to dense-strand interpolation; skim the dynamics and caching sections unless you are building a comparable pipeline.
CFX
- talk CG Hair Fur workflow in Houdini for VFX and Games | Saber Jlassi's GDC 2017 Presentation GDC Industrial
A Blizzard Cinematics Senior TD presents, at the SideFX GDC booth, a custom procedural grooming system for VFX that also auto-generates real-time hair cards and exports to NVIDIA Hairworks.
abstract
A Blizzard Animation Senior TD presents a from-scratch procedural grooming system in Houdini and then converts the result for real-time rendering. He builds hair interpolation by triangulating curve roots into a connectivity mesh and using barycentric coordinates from UVs to blend neighboring guide curves, expanding from roughly 100 guides to 86,000 curves; clumping is achieved by packing each curve's point positions into a string attribute via Python, transferring it to high-res curves and blend-shaping toward nearest 1K/3K cluster guides with a root-to-tip ramp, then mixing large and small clumps with a noise map. Variation comes from length offsets along the tangent and a quaternion-based twist deformer built around each clump guide. For games he auto-generates hair cards by clustering tips, extruding ribbons oriented by a ray-projected scalp surface normal crossed with the curve tangent, then bakes color, normal, depth, occlusion and tangent passes by shooting reflection rays in Mantra, packs all 250 card textures into an 8K atlas for Marmoset, and separately exports a 10K curve set with a spherical-UV connectivity mesh through Maya into NVIDIA Hairworks for real-time expansion and wind.
Related Hair, Feathers and Fur | Axis Studios | Character FX & Crowds Production Talks · Creatures in Houdini | Ahmed Gharraph | FMX 2019 · Simulating the Perfect Groom for a Bovine Biker | Untold Studios | FMX HIVE 2023 · Creating a Photorealistic Hyena
how to read this
- Category
- Production talk: a Houdini procedural grooming-to-realtime workflow
- Contributions
-
- A from-scratch Houdini grooming system using root triangulation and barycentric UV blending to expand roughly 100 guides to about 86,000 curves, with clumping, twist, and noise-based variation
- Auto-generation of real-time hair cards by clustering tips and extruding scalp-oriented ribbons, baking color/normal/depth/occlusion/tangent passes in Mantra into an 8K atlas for Marmoset
- Export of a 10K curve set with a spherical-UV connectivity mesh through Maya into NVIDIA Hairworks for real-time expansion
- Context
- Relates to procedural hair/fur grooming and the VFX-to-games asset bridge, presented at the SideFX GDC booth by a Blizzard Cinematics senior TD.
- Correctness
- Studio practice, not peer-reviewed; results are production-proven on the presenter's cinematics work, and the specific counts and bake settings are workflow choices rather than validated optima.
- Clarity
- Dense and implementation-heavy; a first watch conveys the pipeline shape, a second pass is needed to follow the attribute-packing and card-generation details.
- How to read it
- Focus on the guide-to-dense interpolation (barycentric blending, clump string attributes) and the hair-card baking/atlas path; rewatch the relevant section if you are exporting to Hairworks or building cards.
CFX
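The guide interpolation described in the entry above, blending the guides at the corners of a root triangle by barycentric coordinates computed in UV space, can be sketched as follows. This is an illustration in NumPy rather than the presenter's Houdini setup. It assumes every guide has been resampled to the same number of points, and `barycentric_weights` and `interpolate_strand` are hypothetical names.

```python
import numpy as np


def barycentric_weights(p, a, b, c):
    """Barycentric coordinates of 2D point p in triangle (a, b, c)."""
    p, a, b, c = (np.asarray(x, dtype=float) for x in (p, a, b, c))
    v0, v1, v2 = b - a, c - a, p - a
    d00, d01, d11 = v0 @ v0, v0 @ v1, v1 @ v1
    d20, d21 = v2 @ v0, v2 @ v1
    denom = d00 * d11 - d01 * d01  # zero only for a degenerate triangle
    v = (d11 * d20 - d01 * d21) / denom
    w = (d00 * d21 - d01 * d20) / denom
    return np.array([1.0 - v - w, v, w])


def interpolate_strand(root_uv, corner_uvs, guides):
    """Blend three guide curves into a new strand rooted at root_uv.

    `guides` has shape (3, num_points, 3): one curve per triangle corner,
    all with the same point count (an assumption of this sketch).
    """
    weights = barycentric_weights(root_uv, *corner_uvs)
    return np.tensordot(weights, np.asarray(guides, dtype=float), axes=1)


corner_uvs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
guides = np.array([
    [[0, 0, 0], [0, 1, 0]],  # guide at corner a
    [[1, 0, 0], [1, 1, 1]],  # guide at corner b
    [[0, 0, 1], [1, 1, 1]],  # guide at corner c
], dtype=float)
print(interpolate_strand((0.25, 0.25), corner_uvs, guides))
```

Scattering many root points inside each triangle and calling `interpolate_strand` for each is how a small guide set expands into a dense groom. Clumping and twist would then be applied on top of the interpolated curves.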
-
Method that segments and tracks individual garments in 4D scans of clothed people in motion, estimates the body underneath, and retargets the captured clothing to new body shapes.
abstract
Designing and simulating realistic clothing is challenging. Previous methods addressing the capture of clothing from 3D scans have been limited to single garments and simple motions, lack detail, or require specialized texture patterns. Here we address the problem of capturing regular clothing on fully dressed people in motion. People typically wear multiple pieces of clothing at a time. To estimate the shape of such clothing, track it over time, and render it believably, each garment must be segmented from the others and the body. Our ClothCap approach uses a new multi-part 3D model of clothed bodies, automatically segments each piece of clothing, estimates the minimally clothed body shape and pose under the clothing, and tracks the 3D deformations of the clothing over time. We estimate the garments and their motion from 4D scans; that is, high-resolution 3D scans of the subject in motion at 60 fps. ClothCap is able to capture a clothed person in motion, extract their clothing, and retarget the clothing to new body shapes; this provides a step towards virtual try-on.
Related SMPL: A Skinned Multi-Person Linear Model · Learning-Based Animation of Clothing for Virtual Try-On · SMPLicit: Topology-aware Generative Model for Clothed People · SNUG: Self-Supervised Neural Dynamic Garments
how to read this
- Category
- Capture system: 4D clothing capture and retargeting
- Contributions
-
- A multi-part 3D model of clothed bodies that automatically segments each garment from the others and the body
- Estimation of the minimally clothed body shape and pose under the clothing and tracking of garment 3D deformations over time from 4D scans (60 fps)
- Retargeting of captured clothing to new body shapes as a step toward virtual try-on
- Context
- Builds on Loper et al.'s SMPL skinned body model, extending it to a multi-part clothed-body representation for capturing regular clothing on people in motion. Builds on: SMPL: A Skinned Multi-Person Linear Model
- Correctness
- Demonstrated on high-resolution 4D scans of fully dressed subjects in motion; it assumes garments can be segmented and an underlying body inferred, so results depend on scan quality and the model's part decomposition rather than on physical simulation.
- Clarity
- Accessible in motivation; a first pass conveys the capture-segment-retarget pipeline, a second pass clarifies the multi-part model and tracking formulation.
- How to read it
- Focus on the multi-part model and how the under-clothing body is estimated; a second pass pays off for the segmentation and tracking math if you work in capture or try-on.
CFX / Retargeting
-
Layered volumetric body model that pairs a data-driven statistical inner layer with an FEM soft-tissue outer layer, learning the layer segmentation and tissue elasticity from 4D registrations.
abstract
Data driven models of human poses and soft-tissue deformations can produce very realistic results, but they only model the visible surface of the human body and cannot create skin deformation due to interactions with the environment. Physical simulations can generalize to external forces, but their parameters are difficult to control. In this paper, we present a layered volumetric human body model learned from data. Our model is composed of a data-driven inner layer and a physics-based external layer. The inner layer is driven with a volumetric statistical body model (VSMPL). The soft tissue layer consists of a tetrahedral mesh that is driven using the finite element method (FEM). Model parameters, namely the segmentation of the body into layers and the soft tissue elasticity, are learned directly from 4D registrations of humans exhibiting soft tissue deformations. The learned two layer model is a realistic full-body avatar that generalizes to novel motions and external forces. Experiments show that the resulting avatars produce realistic results on held out sequences and react to external forces. Moreover, the model supports the retargeting of physical properties from one avatar when they share the same topology.
Related Steklov-Poincare Skinning · NIMBLE: A Non-rigid Hand Model with Bones and Muscles · Hand Modeling and Simulation Using Stabilized Magnetic Resonance Imaging · A Neural Network Model for Efficient Musculoskeletal-Driven Skin Deformation
how to read this
- Category
- Method: a data-driven plus physics layered soft-tissue body model
- Contributions
-
- A layered volumetric body model with a data-driven inner layer (volumetric statistical body model, VSMPL) and a physics-based outer FEM tetrahedral soft-tissue layer
- Learning of the body-to-layer segmentation and soft-tissue elasticity directly from 4D registrations of humans showing soft-tissue deformation
- A full-body avatar that generalizes to novel motions and reacts to external forces, with retargeting of physical properties between avatars
- Context
- Combines statistical body modeling with finite-element flesh simulation in the lineage of Teran et al.'s Robust Quasistatic Finite Elements and Flesh Simulation, bridging data-driven surfaces and physics that respond to contact. Builds on: Robust Quasistatic Finite Elements and Flesh Simulation
- Correctness
- Validated on held-out motion sequences and shown reacting to external forces; the realism rests on the learned segmentation and elasticity from 4D data, so generalization is bounded by that capture and the two-layer assumption.
- Clarity
- Moderately technical; a first pass conveys the inner/outer layering idea, a second and possibly third pass are needed for the FEM coupling and the learning of material parameters.
- How to read it
- Focus on how the inner statistical layer drives the FEM outer layer and how elasticity is learned from registrations; do a deeper pass on the simulation coupling if you need response to external forces.
Muscles / Skinning / ML Deformation
-
Efficient simulation of skin sliding tangentially over the underlying anatomy, driven by surface stretching and the motion of the tissue beneath.
abstract
Automated skin-slide simulation for character deformation, in which the outer surface moves along tangent directions in response to stretching and underlying tissue motion. Uses ADMM-based implicit Euler integration with specialized collision handling that exploits local deformation neighborhoods for efficiency and stability. Integrated with linear blend skinning and pose space deformation for production use.
Related Smeat: ADMM Based Tools for Character Deformation · How to Build a Human: Practical Physics-Based Character Animation · Projective Skinning · A Neural Network Model for Efficient Musculoskeletal-Driven Skin Deformation
how to read this
- Category
- Method: an efficient skin-slide simulation for characters
- Contributions
-
- Automated simulation of skin sliding tangentially over underlying anatomy from stretching and tissue motion
- An ADMM-based implicit Euler solver with collision handling that exploits local deformation neighborhoods for efficiency and stability
- Integration with linear blend skinning and pose-space deformation for production use
- Context
- Relates to contact-aware skin deformation in the lineage of Vaillant et al.'s Implicit Skinning, adding the lateral sliding behavior of skin over muscle on top of standard skinning rigs. Builds on: Implicit Skinning: Real-Time Skin Deformation with Contact Modeling
- Correctness
- Presented as a production-oriented method emphasizing efficiency and robustness; the realism targets the offset sliding behavior rather than full volumetric anatomy, and benefits depend on the local-neighborhood collision approximation, so it complements rather than replaces full physics.
- Clarity
- Fairly technical and terse; a first pass conveys the goal, a second pass is needed for the ADMM integration and collision scheme.
- How to read it
- Focus on the ADMM/implicit-Euler formulation and the local-neighborhood collision handling; a second pass pays off if you want to layer skin slide onto an existing LBS plus PSD pipeline.
Skinning / Muscles
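The constraint at the heart of the skin-slide entry above, that the outer surface moves along tangent directions, can be illustrated by removing the normal component of a displacement. This shows only that geometric constraint, under the assumption of a known surface normal at the vertex. It is not the paper's ADMM solver, and `slide_on_surface` is a hypothetical name.

```python
import numpy as np


def slide_on_surface(displacement, normal):
    """Keep only the part of a displacement that lies in the tangent plane."""
    n = np.asarray(normal, dtype=float)
    n = n / np.linalg.norm(n)  # tolerate unnormalized normals
    d = np.asarray(displacement, dtype=float)
    return d - np.dot(d, n) * n


# A displacement with a component along the normal (z) loses that component.
print(slide_on_surface([1.0, 2.0, 3.0], normal=[0.0, 0.0, 2.0]))
```

In a full solver this tangential motion would be one ingredient alongside elasticity and collision handling, which is where the ADMM formulation named in the entry comes in.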
- Emotion Challenge: Building a New Photoreal Facial Performance Pipeline for Games DigiPro Activision 4 cites
Unified photoreal facial performance pipeline for games, spanning actor likeness acquisition, character art, performance capture, and character animation.
abstract
This work presents Emotion Challenge, a new photoreal facial performance pipeline built for games. The goal is a unified, robust, and scalable pipeline that spans actor likeness acquisition, character art, performance capture, and character animation. The authors discuss the specific challenges of building such a pipeline for games, including working within model and rig limitations, efficiently handling a large number of characters and a large volume of performances, achieving consistency with pre-rendered cinematics, and supporting a wide range of animation pipelines and game engines.
Related Interactive Editing of Performance-based Facial Animation · Premo: Powerful Character Rigging, Fast Animation · Creating an Actor-Specific Facial Rig from Performance Capture · High Fidelity Facial Animation Capture and Retargeting with Contours
how to read this
- Category
- Production talk / pipeline: a photoreal facial performance pipeline for games
- Contributions
- A unified, robust, scalable pipeline spanning actor likeness acquisition, character art, performance capture, and character animation
- Discussion of game-specific constraints: working within rig/model limits, handling many characters and large performance volumes
- Approaches for consistency with pre-rendered cinematics and support across animation pipelines and game engines
- Context
- Builds on photoreal digital-actor work in the lineage of Alexander et al.'s Digital Emily Project, adapting it to the real-time and scale constraints of games. Builds on: The Digital Emily Project: Achieving a Photorealistic Digital Actor
- Correctness
- Studio practice, not peer-reviewed; results are production-proven but the talk emphasizes pipeline integration and trade-offs over quantitative evaluation, so claims are about workability at scale rather than measured accuracy.
- Clarity
- Accessible as an experience report; a first pass conveys the pipeline stages and challenges, with little need for a deep formulation pass.
- How to read it
- Read for the game-specific constraints and how the stages connect end to end; skim for the trade-offs relevant to your engine rather than expecting reproducible algorithms.
Facial
-
Augments blendshape rigs with finite-element tissue simulation to add physically plausible secondary dynamics during facial animation.
abstract
This work presents a method for adding physical effects such as secondary dynamics to traditional facial blendshape animation by enriching blendshape rigs with a simple volumetric tissue structure. Rather than relying on detailed anatomical models, the approach uses the artist-created blendshape animation as per-frame rest shapes for a finite element simulation, ensuring that in the absence of head motion or external forces the result exactly matches the input animation. The method introduces blendmaterials to account for spatio-temporally varying material properties caused by muscle activation, and a dynamic rebalancing scheme that updates the deformed configuration to preserve inertial forces and avoid spurious oscillations when changing rest shape or material. Results are demonstrated on both realistic human faces and fantasy creatures undergoing rapid head motion, running, and speech.
Related Lessons from the Evolution of an Anatomical Facial Muscle Model · Fully Automatic Generation of Anatomical Face Simulation Models · Art-Directed Muscle Simulation for High-End Facial Animation · Realtime Performance-Driven Physical Simulation for Facial Animation
how to read this
- Category
- Method: physics enrichment of facial blendshape rigs
- Contributions
- A method that adds secondary dynamics to blendshape animation by enriching the rig with a simple volumetric tissue structure and using artist animation as per-frame FEM rest shapes
- Blendmaterials, modeling spatio-temporally varying material properties caused by muscle activation
- A dynamic rebalancing scheme that updates the deformed configuration to preserve inertial forces and avoid spurious oscillations when rest shape or material changes
- Context
- Extends artist-directed facial rigging in the lineage of Bickel et al.'s Physically Based Rigging for Artist-Directed Facial Animation, layering FEM dynamics onto standard blendshapes without a detailed anatomical model. Builds on: Physically Based Rigging for Artist-Directed Facial Animation
- Correctness
- Demonstrated on realistic human faces and fantasy creatures under rapid head motion, running, and speech; by construction it matches the input animation when there is no head motion or external force, so the contribution is the added dynamics, and fidelity depends on the simple tissue model rather than true anatomy.
- Clarity
- Technical but well-motivated; a first pass conveys the rest-shape-from-animation idea, a second pass is needed for blendmaterials and the rebalancing scheme.
- How to read it
- Focus on how artist blendshapes become per-frame rest shapes and why rebalancing is needed; a second pass pays off for the blendmaterial formulation if you implement secondary facial dynamics.
Facial / Muscles
-
Automatically aligns source and target blendshape ranges of motion for retargeting, reducing artist intervention while preserving intended expression style.
abstract
While facial capturing focuses on accurate reconstruction of an actor's performance, facial animation retargeting has the goal to transfer the animation to another character, such that the semantic meaning of the animation remains. Because of the popularity of blendshape animation, this effectively means to compute suitable blendshape weights for the given target character. Current methods either require manually created examples of matching expressions of actor and target character, or are limited to characters with similar facial proportions (i.e., realistic models). In contrast, our approach can automatically retarget facial animations from a real actor to stylized characters. We formulate the problem of transferring the blendshapes of a facial rig to an actor as a special case of manifold alignment, by exploring the similarities of the motion spaces defined by the blendshapes and by an expressive training sequence of the actor. In addition, we incorporate a simple, yet elegant facial prior based on discrete differential properties to guarantee smooth mesh deformation. Our method requires only sparse correspondences between characters and is thus suitable for retargeting marker-less and marker-based motion capture as well as animation transfer between virtual characters.
Related Transferring Facial Expressions to Different Face Models · Transferring the Rig and Animations from a Character to Different Face Models · Neural Face Rigging for Animating and Retargeting Facial Meshes in the Wild · Realtime Facial Animation with On-the-fly Correctives
how to read this
- Category
- Method: a facial animation retargeting algorithm
- Contributions
- Automatic alignment of source actor and target character ranges of motion, framing retargeting as a special case of manifold alignment between blendshape motion spaces
- A facial prior based on discrete differential properties to keep mesh deformation smooth
- Retargeting to stylized (non-realistic) characters from sparse correspondences and an expressive actor training sequence
- Context
- Sits in the blendshape facial retargeting lineage, extending the goal of artist-friendly transfer raised by Seol et al.'s Artist Friendly Facial Animation Retargeting toward automatic, proportion-agnostic alignment. Builds on: Artist Friendly Facial Animation Retargeting
- Correctness
- Demonstrated for transferring expressions to stylized targets using only sparse correspondences and a training sequence; readers should note that results hinge on how well the actor's expressive sequence spans the motion space, and that the success criterion is semantic fidelity to artistic intent, not metric accuracy.
- Clarity
- Accessible at the conceptual level; a first pass conveys the manifold-alignment idea, a second pass is needed for the alignment formulation and the differential prior.
- How to read it
- Focus first on the manifold-alignment framing and what 'range of motion' means here; do a second pass on the alignment math and smoothness prior if you intend to implement or compare against it.
Facial / Retargeting
-
Official SideFX masterclass introducing Houdini 16.5 guide-curve groom tools including Curve Advect, white-hair generation, and partition-based local groom workflows.
CFX
-
FLAME model: articulated jaw, neck, and eyeballs with pose-dependent and expression blendshapes trained on 33,000 3D scans.
abstract
The field of 3D face modeling has a large gap between high-end and low-end methods. At the high end, the best facial animation is indistinguishable from real humans, but this comes at the cost of extensive manual labor. At the low end, face capture from consumer depth sensors relies on 3D face models that are not expressive enough to capture the variability in natural facial shape and expression. We seek a middle ground by learning a facial model from thousands of accurately aligned 3D scans. Our FLAME model (Faces Learned with an Articulated Model and Expressions) is designed to work with existing graphics software and be easy to fit to data. FLAME uses a linear shape space trained from 3800 scans of human heads. FLAME combines this linear shape space with an articulated jaw, neck, and eyeballs, pose-dependent corrective blendshapes, and additional global expression blendshapes. The pose and expression dependent articulations are learned from 4D face sequences in the D3DFACS dataset along with additional 4D sequences. We accurately register a template mesh to the scan sequences and make the D3DFACS registrations available for research purposes. In total the model is trained from over 33,000 scans. FLAME is low-dimensional but more expressive than the FaceWarehouse model and the Basel Face Model.
Related I M Avatar: Implicit Morphable Head Avatars from Videos · SPARK: Self-supervised Personalized Real-time Monocular Face Capture · Expressive Body Capture: 3D Hands, Face, and Body from a Single Image · STAR: Sparse Trained Articulated Human Body Regressor
how to read this
- Category
- Method / model: a learned parametric face model
- Contributions
- FLAME, a face model combining a linear shape space with an articulated jaw, neck, and eyeballs plus pose-dependent corrective and global expression blendshapes
- Trained from thousands of aligned 3D scans (shape) and 4D sequences (pose and expression), designed to fit data and work in existing graphics software
- Release of registered D3DFACS sequences for research
- Context
- Builds on the morphable-model tradition of Blanz and Vetter's A Morphable Model for the Synthesis of 3D Faces, adding articulation and learned correctives to bridge high-end and low-end face capture. Builds on: A Morphable Model for the Synthesis of 3D Faces
- Correctness
- Learned from a large scan and 4D corpus and built to be easy to fit; readers should remember the shape and expression spaces are linear (with pose-dependent correctives), so extreme or highly stylized faces outside the training distribution may not be captured.
- Clarity
- Clearly written and widely adopted; a first pass conveys the model structure, a second pass clarifies the registration and training pipeline.
- How to read it
- Focus on how shape, pose, and expression are factored and trained; a second pass on registration and fitting pays off because FLAME is a common building block you will likely reuse or compare against.
Facial
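The linear part of a FLAME-style model, as described in the entry above, can be sketched as a template mesh plus weighted shape and expression offsets. This is a minimal illustration under stated assumptions: the function name, the flattened-list mesh layout, and the toy bases are hypothetical, and the articulated jaw, neck, and eyeballs, the pose-dependent correctives, and the skinning step are all omitted.

```python
def evaluate_linear_face(template, shape_basis, expr_basis, shape_coeffs, expr_coeffs):
    """Add weighted shape and expression offsets to a flattened template mesh.

    template: flat list [x0, y0, z0, x1, ...] of rest vertex coordinates.
    shape_basis / expr_basis: lists of offset vectors, each as long as template.
    shape_coeffs / expr_coeffs: one weight per basis vector.
    """
    verts = list(template)
    for basis, coeffs in ((shape_basis, shape_coeffs), (expr_basis, expr_coeffs)):
        for direction, weight in zip(basis, coeffs):
            for i, delta in enumerate(direction):
                verts[i] += weight * delta
    return verts


if __name__ == "__main__":
    # Toy two-vertex "mesh" with one shape and one expression direction.
    template = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
    shape_basis = [[0.1, 0.0, 0.0, 0.1, 0.0, 0.0]]
    expr_basis = [[0.0, 0.5, 0.0, 0.0, 0.0, 0.0]]
    print(evaluate_linear_face(template, shape_basis, expr_basis, [2.0], [1.0]))
```

Because identity and expression enter as separate additive terms, they can be fitted or edited independently, which is the factoring the reading notes point to.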
-
Systematic study of action space design choices for deep reinforcement learning locomotion controllers for simulated characters.
abstract
This paper studies how the choice of action parameterization affects learning difficulty and performance when using deep reinforcement learning to control articulated figure locomotion. It compares four actuation models (torques, muscle activations for musculotendon units, target joint angles for PD controllers, and target joint velocities) on a gait-cycle imitation task for planar bipeds, a dog, and a raptor. The policies are trained with an actor-critic method using positive temporal difference updates and experience replay, and MTU actuator parameters are tuned by alternating policy learning with covariance matrix adaptation. The results show that action spaces incorporating local feedback, such as PD target angles, improve learning speed, robustness, and motion quality, with the advantage growing as character complexity increases.
Related Terrain-Adaptive Locomotion Skills Using Deep Reinforcement Learning · DReCon: Data-Driven Responsive Control of Physics-Based Characters · Physical Based Motion Reconstruction From Videos Using Musculoskeletal Model · DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills
how to read this
- Category
- Empirical study: action-space design for DeepRL locomotion
- Contributions
- Systematic comparison of four actuation models (torques, muscle activations, PD target angles, target joint velocities) on a gait-imitation task
- Finding that action spaces with local feedback, such as PD target angles, improve learning speed, robustness, and motion quality
- Evidence that this advantage grows with character complexity, across planar bipeds, a dog, and a raptor
- Context
- Continues the authors' deep-RL locomotion line following Terrain-Adaptive Locomotion Skills Using Deep Reinforcement Learning, isolating the action-parameterization variable. Builds on: Terrain-Adaptive Locomotion Skills Using Deep Reinforcement Learning
- Correctness
- Conclusions are drawn from gait-cycle imitation on planar (2D) characters with an actor-critic method and specific tuning; readers should treat the PD-advantage finding as supported within that experimental setup and not assume it transfers unchanged to 3D or non-imitation tasks.
- Clarity
- Very readable as a controlled study; a first pass gives the takeaway, a second pass covers the training and MTU-tuning details.
- How to read it
- Read it for the practical conclusion on action representation; skim the RL machinery and focus on the comparison setup if you are choosing an action space for your own controller.
Motion Synthesis
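To make the "PD target angles" action space from the study above concrete, the sketch below turns a target joint angle (what the policy would output) into a torque with a standard proportional-derivative rule, then integrates a single joint. The gains, the unit-inertia single-joint model, and the function names are assumptions chosen for illustration, not values or code from the paper.

```python
def pd_torque(target_angle, angle, velocity, kp=300.0, kd=30.0):
    """Torque from a PD controller tracking a target joint angle."""
    return kp * (target_angle - angle) - kd * velocity


def simulate_joint(target_angle, steps=2000, dt=0.001, inertia=1.0):
    """Integrate one joint with semi-implicit Euler and return the final angle."""
    angle, velocity = 0.0, 0.0
    for _ in range(steps):
        torque = pd_torque(target_angle, angle, velocity)
        velocity += (torque / inertia) * dt  # update velocity first
        angle += velocity * dt               # then position (semi-implicit)
    return angle


if __name__ == "__main__":
    # The joint settles near the requested target without the caller
    # ever specifying a torque directly.
    print(simulate_joint(0.5))
```

The local feedback loop is the point: the policy only picks a target, and the PD rule corrects tracking error every step, which is the property the study associates with faster and more robust learning.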
-
Retrospective on developing and refining a physics-based facial muscle model for production, covering lessons in anatomical modeling.
abstract
Precise anatomical modeling proved essential for production-quality facial muscle simulation augmenting blendshape workflows. The team discovered that approximate muscle and skeleton geometry limited simulation quality and iteratively refined the model using simulation-aided sculpting and feedback from anatomical reference materials. Key improvements included proper nasal cartilage representation for rigidity, accurate orbicularis oris fiber layering for lip definition, and correct cheek muscle concavity for smooth nasolabial fold deformation.
Related Animating Facial Expressions · Fully Automatic Generation of Anatomical Face Simulation Models · Art-Directed Muscle Simulation for High-End Facial Animation · High-Quality Face Capture Using Anatomical Muscles
how to read this
- Category
- Production talk / experience report: physics-based facial muscle modeling
- Contributions
- A retrospective on iteratively refining an anatomical facial muscle model for production-quality simulation
- Concrete lessons: nasal cartilage rigidity, layered orbicularis oris fibers for lip definition, and correct cheek concavity for nasolabial folds
- A simulation-aided sculpting workflow guided by anatomical reference to overcome limits of approximate geometry
- Context
- Extends the team's art-directed muscle simulation work (Cong et al., Art-Directed Muscle Simulation for High-End Facial Animation) by reporting what anatomical accuracy actually required in practice. Builds on: Art-Directed Muscle Simulation for High-End Facial Animation
- Correctness
- This is studio practice and lessons learned rather than a controlled study; the takeaways are production-proven on their pipeline but the specific geometry fixes are tied to their model and may not generalize as stated.
- Clarity
- Accessible and concrete; a single read conveys the lessons, with anatomy reference helpful for the muscle-specific points.
- How to read it
- Read once for the practical lessons on where anatomical fidelity matters most; revisit specific sections when you hit the same muscle or cartilage regions in your own facial rig.
Muscles / Facial
- Masquerade: Fine-Scale Details for Head-Mounted Camera Motion Capture Data SIGGRAPH Industrial 12 cites
Modular ML pipeline that adds fine-scale wrinkle and expression detail to sparse head-mounted camera marker data, used in Digital Domain's production capture workflow.
abstract
Masquerade is a modular tool for adding fine-scale details to facial motion capture data acquired from head-mounted cameras, which produce lower-resolution reconstructions than fixed seated capture rigs. The authors study prior data-driven approaches that separate large-scale and fine-scale deformations, then combine their strengths: deformation gradients represent the face pose so training data can be reused across marker sets, local vertex offsets encode the fine-scale details, and one radial basis function with a biharmonic kernel is fit per marker region instead of per vertex. This region-based scheme reduces memory and computation while improving reconstruction quality. The solution is in production use for enhancing marker data with fine-scale facial detail.
Related Facial Performance Enhancement Using Dynamic Shape Space Analysis · Interactive Editing of Performance-based Facial Animation · Semi-Supervised Video-Driven Facial Animation Transfer for Production · Avengers: Capturing Thanos's Complex Face
how to read this
- Category
- Method / production tool: fine-scale enhancement of HMC facial capture
- Contributions
- Masquerade, a modular pipeline that adds fine-scale wrinkle and expression detail to sparse head-mounted-camera marker data
- A representation that separates large-scale pose (deformation gradients, so training data reuses across marker sets) from fine-scale detail (local vertex offsets)
- A region-based scheme fitting one biharmonic-kernel RBF per marker region rather than per vertex, reducing memory and compute while improving quality
- Context
- Addresses the gap between high-resolution seated capture systems such as Beeler et al.'s Medusa and the lower-resolution data from head-mounted cameras, combining prior large-scale/fine-scale separation approaches. Builds on: Medusa: A Production-Ready Photoreal Facial Performance Capture System
- Correctness
- Reported as in production use at Digital Domain; this is a production-validated engineering solution rather than a benchmarked study, so quality claims rest on artist-facing use, and reconstruction still depends on the training data and marker layout.
- Clarity
- Clear and pragmatic; a first pass conveys the two-scale, per-region design, a second pass covers the RBF formulation.
- How to read it
- Focus on the large-scale vs fine-scale split and the per-region RBF choice; a second pass on the deformation-gradient representation pays off if you work with HMC capture pipelines.
Facial
-
Autodesk walkthrough of two deformation tools added in Maya 2017 Update 3: the Tension deformer for surface-tension effects and the Bake Deformer Tool for converting complex deformers to skin weights for game engine export.
abstract
Autodesk walkthrough of two deformation features in Maya 2017 Update 3 demonstrated on an unskinned character. The tension deformer is applied over a geodesic voxel bind to reduce facet tearing and surface stretching when the arm, collarbone, and chest bones are rotated, with iterations and envelope controls shown. The Bake Deformer Tool, invoked via a script editor command, then transfers the weighted deformation result from a source mesh and skeleton onto a target leather-suit mesh and skeleton by rotating all bones, measuring vertex displacement, and baking it into a plain skin cluster of five influences so the result can be exported to a game engine without the deformer.
Related Autodesk Maya - Features - Delta Mush deformer · Maya 2022: New Features for Rigging · Maya 2020 | Proximity Wrap Deformer · Robust Skin Weights Transfer via Weight Inpainting
how to read this
- Category
- Production talk: DCC tool walkthrough (Maya deformation features)
- Contributions
- Demonstrates the Tension deformer over a geodesic voxel bind to reduce facet tearing and stretching when arm, collarbone, and chest bones rotate, with iterations and envelope controls
- Demonstrates the Bake Deformer Tool transferring a weighted deformation result onto a target mesh and skeleton by measuring vertex displacement across bone rotations
- Shows baking the result into a plain skin cluster of five influences for game-engine export without the deformer
- Context
- A vendor walkthrough of Maya 2017 Update 3 deformation tools, relating to the general DCC-to-game-engine skinning and deformation-baking workflow.
- Correctness
- This is an Autodesk product demonstration, not peer-reviewed work; behavior shown is product-defined and results depend on the specific character, bind, and influence-count settings used in the demo.
- Clarity
- Highly accessible, step-by-step demo; a single viewing conveys what the tools do and how to invoke them.
- How to read it
- Watch once to learn when to reach for the Tension deformer versus Bake Deformer Tool; revisit the baking steps when you actually need to export deformer-driven results to a game engine.
Skinning / Rigging
-
Frame-based retargeting using normalized Euclidean distance matrices of inter-joint distances to transfer motion across differently proportioned skeletons.
abstract
Presents a distance-based approach to motion retargeting that represents human postures using normalized Euclidean Distance Matrices containing all inter-joint distances. Proposes normalization and denormalization procedures based on kinematic chain lengths to adapt distance matrices across different skeletal morphologies. Uses a Distance Geometry Problem solver with spectral gradient optimization to compute retargeted joint positions that best satisfy the adapted distance constraints.
Related Dog Code: Human to Quadruped Embodiment Using Shared Codebooks · Motion Retargeting for Crowd Simulation · Retargeting Motion to New Characters · Real-Time Motion Retargeting to Highly Varied User-Created Morphologies
how to read this
- Category
- Method: a motion retargeting algorithm
- Contributions
- A frame-based posture representation using normalized Euclidean Distance Matrices of all inter-joint distances
- Normalization and denormalization procedures based on kinematic-chain lengths to adapt distance matrices across different skeletal morphologies
- A Distance Geometry Problem solver with spectral gradient optimization to recover retargeted joint positions
- Context
- Belongs to the motion retargeting lineage opened by Gleicher's Retargeting Motion to New Characters, recasting the cross-morphology transfer as a distance-geometry problem. Builds on: Retargeting Motion to New Characters
- Correctness
- The approach is posture-based (per frame) and built on inter-joint distances; readers should keep in mind that a purely per-frame distance formulation may not by itself guarantee temporal smoothness or enforce constraints such as foot contacts unless these are handled separately.
- Clarity
- Moderately technical; a first pass conveys the distance-matrix idea, a second pass is needed for the normalization scheme and the DGP solver.
- How to read it
- Focus on how postures become distance matrices and how chain-length normalization bridges morphologies; a second pass on the DGP solver pays off only if you implement or extend it.
Retargeting
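The distance-matrix representation in the retargeting entry above can be sketched as follows: build all inter-joint distances, divide each by the length of the skeleton path between the two joints, then rescale by the target skeleton's path lengths. The exact normalization used in the paper is not given here, so dividing by the summed bone lengths along the connecting kinematic chain is an assumption, as are all function names. The final step of recovering joint positions from the adapted distances (the Distance Geometry Problem solver) is not shown.

```python
import math


def distance_matrix(joints):
    """All pairwise Euclidean distances between 3D joint positions."""
    n = len(joints)
    return [[math.dist(joints[i], joints[j]) for j in range(n)] for i in range(n)]


def chain_lengths(joints, parents):
    """Summed bone lengths along the skeleton path between every joint pair."""
    n = len(joints)
    # bone[k] is the length of the bone from joint k to its parent (0 for the root).
    bone = [0.0 if parents[i] < 0 else math.dist(joints[i], joints[parents[i]])
            for i in range(n)]

    def path_to_root(i):
        path = [i]
        while parents[path[-1]] >= 0:
            path.append(parents[path[-1]])
        return path

    paths = [path_to_root(i) for i in range(n)]
    lengths = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Joints shared by both root paths lie at or above the common ancestor.
            shared = set(paths[i]) & set(paths[j])
            lengths[i][j] = sum(bone[k] for k in paths[i] + paths[j] if k not in shared)
    return lengths


def normalize(dist, lengths):
    """Divide each distance by its chain length; the diagonal stays zero."""
    n = len(dist)
    return [[dist[i][j] / lengths[i][j] if i != j else 0.0 for j in range(n)]
            for i in range(n)]


def denormalize(norm, lengths):
    """Rescale normalized distances by another skeleton's chain lengths."""
    n = len(norm)
    return [[norm[i][j] * lengths[i][j] for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    parents = [-1, 0, 1]                       # a three-joint chain
    source = [(0, 0, 0), (1, 0, 0), (1, 1, 0)]
    target = [(0, 0, 0), (2, 0, 0), (2, 2, 0)]  # same pose, bones twice as long
    norm = normalize(distance_matrix(source), chain_lengths(source, parents))
    print(denormalize(norm, chain_lengths(target, parents))[0][2])
```

Since a straight-line distance can never exceed the path length along the bones, the normalized entries fall between 0 and 1, which is what makes them comparable across differently proportioned skeletons.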
-
Simulates facial expressions via nonlinear energies modeling passive flesh, active muscles, and rigid bone; supports inertia, skin sliding, and collision handling.
abstract
We present a novel physics-based approach to facial animation. Contrary to commonly used generative methods, our solution computes facial expressions by minimizing a set of non-linear potential energies that model the physical interaction of passive flesh, active muscles, and rigid bone structures. By integrating collision and contact handling into the simulation, our algorithm avoids inconsistent poses commonly observed in generative methods such as blendshape rigs. A novel muscle activation model leads to a robust optimization that faithfully reproduces complex facial articulations. We show how person-specific simulation models can be built from a few expression scans with a minimal data acquisition process and an almost entirely automated processing pipeline. Our method supports temporal dynamics due to inertia or external forces, incorporates skin sliding to avoid unnatural stretching, and offers full control of the simulation parameters, which enables a variety of advanced animation effects. For example, slimming or fattening the face is achieved by simply scaling the volume of the soft tissue elements. We show a series of application demos, including artistic editing of the animation model, simulation of corrective facial surgery, or dynamic interaction with external forces and objects.
Related An Implicit Physical Face Model Driven by Expression and Style · SoftDECA: Computationally Efficient Physics-Based Facial Animations · BlendForces: A Dynamic Framework for Facial Animation · Lessons from the Evolution of an Anatomical Facial Muscle Model
how to read this
- Category
- Method: physics-based face modeling and animation
- Contributions
- Phace, a physics-based facial animation method that computes expressions by minimizing nonlinear potential energies over passive flesh, active muscles, and rigid bone
- A novel muscle activation model plus integrated collision and contact handling that avoids the inconsistent poses common to generative blendshape rigs
- Person-specific simulation models built from a few expression scans via a near-automatic pipeline, with temporal dynamics, skin sliding, and editing effects such as scaling soft-tissue volume
- Context
- Builds on personalized anatomical body modeling (Kadlecek et al., Reconstructing Personalized Anatomical Models for Physics-based Body Animation), bringing a physics-simulation alternative to generative blendshape facial animation. Builds on: Reconstructing Personalized Anatomical Models for Physics-based Body Animation
- Correctness
- Person-specific models are built from only a few expression scans; readers should note the realism depends on the simulation parameters and the quality of those scans, and physics-based solving is generally heavier and harder to direct than blendshape evaluation.
- Clarity
- Conceptually clear but technically dense; a first pass conveys the simulate-instead-of-blend idea, a second and likely third pass are needed for the energy formulation and activation model.
- How to read it
- Read first for why physical simulation avoids blendshape artifacts; do a careful second pass on the energy terms and muscle activation model if you plan to build or evaluate a physics-based face rig.
Facial / Muscles
-
A phase-conditioned network that generates real-time, terrain-adaptive locomotion; an influential early design for neural character controllers.
abstract
This paper presents the Phase-Functioned Neural Network (PFNN), a real-time character control architecture in which the network weights are generated each frame by a cyclic phase function, implemented as a cubic Catmull-Rom spline parameterized by the phase of the motion cycle. The network takes user controls, the previous character state, and the surrounding terrain geometry as input and produces high quality locomotion that adapts to rough terrain, obstacles, and low ceilings through walking, running, jumping, climbing, and crouching. To train the system, motion capture data is fitted to a large database of heightmaps extracted from virtual environments. By explicitly handling the phase, the PFNN avoids the dying-out and blending artifacts of autoregressive RNN models while remaining compact and fast, requiring only milliseconds of execution and a few megabytes of memory even when trained on gigabytes of data.
Related A Deep Learning Framework for Character Motion Synthesis and Editing · Neural State Machine for Character-Scene Interactions · DeepPhase: Periodic Autoencoders for Learning Motion Phase Manifolds · Learned Motion Matching
how to read this
- Category
- Method: a neural network architecture for real-time character control
- Contributions
- The Phase-Functioned Neural Network, whose weights are regenerated each frame by a cyclic phase function (a cubic Catmull-Rom spline over the motion-cycle phase)
- Real-time, terrain-adaptive locomotion (walking, running, jumping, climbing, crouching) from user controls, prior state, and surrounding terrain geometry
- A compact, fast model that avoids autoregressive RNN dying-out and blending artifacts while training on motion fitted to large heightmap databases
- Context
- Bridges data-driven control approaches such as Clavet's Motion Matching and the deep-learning motion synthesis line (Holden et al.), introducing explicit phase conditioning for character controllers. Builds on: Motion Matching and The Road to Next-Gen Animation · A Deep Learning Framework for Character Motion Synthesis and Editing
- Correctness
- Demonstrated on locomotion with explicit cyclic phase; readers should remember the phase assumption suits cyclic motion well, that preparing training data requires fitting captures to terrain heightmaps, and that non-cyclic actions are a less natural fit for this parameterization.
- Clarity
- Well written and influential; a first pass conveys the phase-function idea clearly, a second pass covers the data preparation and network details.
- How to read it
- Focus on the phase-function weight-generation idea and why it beats autoregressive blending; a second pass on data fitting and the spline parameterization pays off given how foundational this work is.
Motion Synthesis
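The phase function in the PFNN entry above can be sketched as a cyclic cubic Catmull-Rom spline that blends four sets of control weights according to the current phase. The version below is a minimal illustration: it treats each weight set as a flat list of floats, and the function and variable names are hypothetical. A real network would apply the same blend to every weight matrix and bias.

```python
import math


def phase_function(phase, control_weights):
    """Blend four control weight sets with a cyclic cubic Catmull-Rom spline.

    phase: motion-cycle phase in radians (any real value; wrapped to [0, 2*pi)).
    control_weights: four equally sized lists of floats.
    """
    t = 4.0 * (phase % (2.0 * math.pi)) / (2.0 * math.pi)
    k1 = int(t) % 4          # control point at or just before the phase
    w = t - int(t)           # position between control points k1 and k1 + 1
    a0, a1, a2, a3 = (control_weights[(k1 + o) % 4] for o in (-1, 0, 1, 2))
    return [
        y1
        + w * (0.5 * y2 - 0.5 * y0)
        + w * w * (y0 - 2.5 * y1 + 2.0 * y2 - 0.5 * y3)
        + w ** 3 * (1.5 * y1 - 1.5 * y2 + 0.5 * y3 - 0.5 * y0)
        for y0, y1, y2, y3 in zip(a0, a1, a2, a3)
    ]


if __name__ == "__main__":
    controls = [[0.0, 1.0], [1.0, 0.0], [0.0, -1.0], [-1.0, 0.0]]
    print(phase_function(0.0, controls))          # equals controls[0]
    print(phase_function(math.pi / 4, controls))  # smooth blend between sets 0 and 1
```

The spline interpolates the control sets exactly at phases 0, π/2, π, and 3π/2 and wraps around at 2π, so the generated weights vary smoothly and periodically over the motion cycle.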
-
Describes Naughty Dog's physics layer on top of gameplay animation in Uncharted 4, enabling cloth and secondary-motion with predictable artist control and minimal visual distortion.
CFX
-
Deep CNN system for production facial performance capture achieving real-time tracking speed with production-quality reconstruction accuracy.
abstract
This paper presents a real-time deep learning framework for video-based facial performance capture that densely tracks an actor's face from a monocular video. A high-end production capture pipeline based on multi-view stereo and artist clean-up is applied to 5 to 10 minutes of footage to generate training data, which is used to train a convolutional neural network that predicts the per-frame positions of roughly 5000 facial mesh vertices from a single grayscale input image. The output layer is initialized with a PCA basis of the target meshes and the network is trained with data augmentation against a mean square error loss. Once trained, the network processes the remaining footage automatically at rates up to 870 frames per second, drastically reducing manual labor while producing plausible inference even in self-occluded regions such as the eyes and lips.
Related Fast and Deep Facial Deformations · Audio-Driven Facial Animation by Joint End-to-End Learning of Pose and Emotion · Monocular Facial Performance Capture via Deep Expression Matching · High Resolution Passive Facial Performance Capture
how to read this
- Category
- Method / system: deep CNN for video-based facial performance capture
- Contributions
- A real-time framework that densely tracks an actor's face from monocular video, predicting per-frame positions of roughly 5000 mesh vertices from a single grayscale image
- Use of a high-end multi-view stereo and artist-cleanup pipeline on a few minutes of footage to bootstrap training data, with a PCA-initialized output layer and data augmentation
- Automatic processing of the remaining footage at high frame rates, with plausible inference in self-occluded regions such as eyes and lips
- Context
- Follows the real-time high-fidelity facial capture line (Cao et al., Real-Time High-Fidelity Facial Performance Capture), replacing hand-crafted tracking with a learned CNN regressor. Builds on: Real-Time High-Fidelity Facial Performance Capture
- Correctness
- The network is trained per actor from that actor's own clean-up-quality footage, so accuracy is tied to the bootstrap pipeline and the trained subject; readers should treat the throughput and quality as demonstrated within that production setup rather than as a general cross-actor model.
- Clarity
- Clear and practical; a first pass conveys the bootstrap-then-infer idea, a second pass covers the network and PCA-output details.
- How to read it
- Focus on the data-bootstrapping strategy and PCA-initialized output that make the CNN production-viable; a second pass on architecture and augmentation pays off if you build a similar capture system.
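The PCA-initialized output layer mentioned in the abstract can be illustrated with a small sketch. It assumes the final layer is a plain linear map whose weights start as the leading principal directions of the training meshes and whose bias starts as the mean mesh; the function name `pca_output_layer`, the toy data sizes, and the use of NumPy are illustrative assumptions, not details from the paper.

```python
import numpy as np

def pca_output_layer(meshes, n_components):
    """Return (weights, bias) for a linear output layer initialized from PCA.

    meshes: array of shape (n_frames, n_vertices * 3), one flattened mesh per row.
    """
    mean = meshes.mean(axis=0)
    # SVD of the centered data; the rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(meshes - mean, full_matrices=False)
    weights = vt[:n_components]  # shape (n_components, n_vertices * 3)
    return weights, mean

# Toy stand-in for tracked training meshes: 20 frames of a 10-vertex mesh.
rng = np.random.default_rng(0)
meshes = rng.normal(size=(20, 30))

# 20 centered frames span at most 19 directions, so 19 components are lossless here.
weights, bias = pca_output_layer(meshes, n_components=19)

# With this initialization the layer maps PCA coefficients back to vertex positions.
coeffs = (meshes - bias) @ weights.T
reconstructed = coeffs @ weights + bias
```

Starting from such a basis means the layers before it only need to predict a compact coefficient vector rather than every vertex coordinate independently; the weights would still be refined during training.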
Facial / ML Deformation
- Regularized Kelvinlets: Sculpting Brushes Based on Fundamental Solutions of Elasticity SIGGRAPH Pixar 25 cites
Closed-form elastic sculpting brushes for volume-preserving grab, scale, twist, and pinch used in character pose-space sculpting workflows.
Abstract
We introduce a new technique for real-time physically based volume sculpting of virtual elastic materials. Our formulation is based on the elastic response to localized force distributions associated with common modeling primitives such as grab, scale, twist, and pinch. The resulting brush-like displacements correspond to the regularization of fundamental solutions of linear elasticity in infinite 2D and 3D media. These deformations thus provide the realism and plausibility of volumetric elasticity, and the interactivity of closed-form analytical solutions. To finely control our elastic deformations, we also construct compound brushes with arbitrarily fast spatial decay. Furthermore, pointwise constraints can be imposed on the displacement field and its derivatives via a single linear solve. We demonstrate the versatility and efficiency of our method with multiple examples of volume sculpting and image editing.
Related Interactive Skeleton-Driven Dynamic Deformations · Physically Based Rigging for Deformable Characters · Sharp Kelvinlets: Elastic Deformations with Cusps and Localized Falloffs · Dynamic Kelvinlets: Secondary Motions Based on Fundamental Solutions of Elastodynamics
How to read this
- Category
- Method: closed-form elastic deformation primitives (sculpting brushes)
- Contributions
- Brush-like grab, scale, twist, and pinch displacements derived as regularized fundamental solutions of linear elasticity in 2D and 3D
- Compound brushes with arbitrarily fast spatial decay for fine local control
- Pointwise constraints on the displacement field and its derivatives via a single linear solve
- Context
- Grounds interactive sculpting in classical linear elasticity (Kelvin's fundamental solutions), offering an analytical alternative to mesh-based or FEM volume deformation for pose-space sculpting workflows.
- Correctness
- Built on linear elasticity in infinite media, so results are physically plausible rather than exact for finite or nonlinear materials; the closed-form, regularized formulation buys real-time interactivity but the demonstrations are sculpting and image-editing examples rather than a quantitative validation.
- Clarity
- Conceptually accessible (brush metaphor); a first pass conveys the idea, a second pass is needed for the elasticity formulation and regularization.
- How to read it
- First pass for the brush vocabulary and the regularization idea; do a careful second pass on the fundamental-solution math and the constraint solve if you intend to implement or extend the brushes.
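The closed-form character of the grab brush is easy to see in code. This is a minimal sketch, assuming the commonly cited form of the 3D regularized Kelvinlet, u(r) = [(a - b)/r_e I + b/r_e^3 r r^T + a e^2/(2 r_e^3) I] f, with r_e = sqrt(|r|^2 + e^2), a = 1/(4 pi mu) and b = a/(4(1 - nu)). The function name, argument names, and default material values are illustrative and do not come from the abstract above.

```python
import math

def kelvinlet_grab(point, center, tip_displacement, radius, mu=1.0, nu=0.45):
    """Displacement at `point` for a grab brush centered at `center`.

    mu is the shear modulus and nu the Poisson ratio (assumed defaults);
    radius is the regularization scale e of the brush.
    """
    a = 1.0 / (4.0 * math.pi * mu)
    b = a / (4.0 * (1.0 - nu))
    # Normalization so that the brush center moves by exactly tip_displacement.
    c = 2.0 / (3.0 * a - 2.0 * b)
    r = [p - q for p, q in zip(point, center)]
    r_eps = math.sqrt(sum(x * x for x in r) + radius * radius)
    r_dot_u = sum(x * u for x, u in zip(r, tip_displacement))
    iso = (a - b) / r_eps + a * radius ** 2 / (2.0 * r_eps ** 3)  # identity terms
    aniso = b * r_dot_u / r_eps ** 3                              # r r^T term
    return [c * radius * (iso * u + aniso * x)
            for u, x in zip(tip_displacement, r)]

def length(v):
    return math.sqrt(sum(x * x for x in v))

center = (0.0, 0.0, 0.0)
pull = (0.0, 1.0, 0.0)
at_center = kelvinlet_grab(center, center, pull, radius=0.5)
near = kelvinlet_grab((1.0, 0.0, 0.0), center, pull, radius=0.5)
far = kelvinlet_grab((10.0, 0.0, 0.0), center, pull, radius=0.5)
```

Each query point is evaluated independently with no mesh connectivity or solve, which is what makes the brushes interactive; the displacement equals the pull at the brush center and falls off with distance.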
Rigging / Skinning
-
Derives skinning weights for skeleton-driven animation from a watershed mesh segmentation, working for characters in T-pose or arbitrary poses.
Abstract
Proposes a skinning approach based on mesh segmentation: a watershed segmentation of the character mesh guides assignment of skinning weights for skeleton-driven animation, working for models in T-pose as well as arbitrary poses, automating and improving the binding step.
Related Real-Time Skeletal Skinning with Optimized Centers of Rotation · Geodesic Voxel Binding for Production Character Meshes · A Statistical Model of Human Pose and Body Shape · Autodesk Maya - Features - Delta Mush deformer
How to read this
- Category
- Method: an automatic skinning-weight algorithm
- Contributions
- Derives skeleton-driven skinning weights from a watershed segmentation of the character mesh
- Works for models in T-pose as well as arbitrary poses
- Automates and improves the binding step of the rigging pipeline
- Context
- Builds on automatic rigging and weight-assignment work such as Baran and Popovic's Automatic Rigging and Animation of 3D Characters, substituting a watershed mesh segmentation for the usual heat-diffusion or distance-based weight computation. Builds on: Automatic Rigging and Animation of 3D Characters
- Correctness
- Assumes a watershed segmentation aligns well with the skeleton's articulated regions; the claimed pose-robustness (T-pose and arbitrary poses) is the central promise, so a reader should check on which mesh types it was demonstrated and watch for artifacts where segment boundaries cross joints.
- Clarity
- Accessible to anyone familiar with skinning; a first pass conveys the pipeline, a second pass clarifies the segmentation-to-weight mapping.
- How to read it
- Focus the first pass on how segmentation maps to weights and the comparison against standard automatic weighting; a second pass pays off only if you need the segmentation details to reproduce it.
Skinning
-
First method to capture dynamic hair and infer physical simulation parameters from video, enabling playback-and-edit of captured hairstyle dynamics.
CFX
-
Optimization method for computing sparse rig parameter values that reproduce animator-specified poses with minimal parameter usage.
Abstract
Proposes retargeting motion into an artist-friendly rig space by optimizing sparse rig parameters that minimize error against the source motion while keeping the result editable. An intermediate object transfers motion from various sources onto production rigs; sparsity regularization activates only the necessary controls, and keyframe extraction keeps editing efficient.
Related MoRig: Motion-Aware Rigging of Character Meshes from Point Clouds · Learning an Inverse Rig Mapping for Character Animation · Mobilizing Mocap, Motion Blending, and Mayhem: Rig Interoperability for Crowd Simulation on Incredibles 2 · Normalized Euclidean Distance Matrices for Human Motion Retargeting
How to read this
- Category
- Method: an optimization for motion retargeting into rig space
- Contributions
- Optimizes sparse rig parameter values that reproduce a target pose or motion with minimal source error
- Uses an intermediate object to transfer motion from varied sources onto production rigs
- Adds sparsity regularization to activate only necessary controls plus keyframe extraction for editable output
- Context
- Relates to learning inverse rig mappings for character animation (e.g. Holden et al.'s Learning an Inverse Rig Mapping), but emphasizes sparsity and editability so the retargeted result stays artist-friendly. Builds on: Learning an Inverse Rig Mapping for Character Animation
- Correctness
- Assumes that a sparse activation of controls can adequately reproduce the source motion while preserving editability; the value depends on the chosen sparsity regularizer and on the rigs tested, so a reader should weigh reconstruction error against the cleanliness of the resulting curves.
- Clarity
- Moderately technical; a first pass conveys the goal (sparse, editable retargeting), a second pass is needed for the optimization formulation.
- How to read it
- First pass for the problem framing (sparsity plus editability); do a second pass on the objective and the sparsity term if you care about clean retargeted keyframes.
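The role of the sparsity term can be shown on a toy problem. This sketch assumes a linear rig (each control adds a fixed displacement to the pose) and an L1 penalty minimized with iterative soft-thresholding (ISTA); both are simplifying stand-ins chosen for illustration, since production rigs are nonlinear and the paper's actual objective and solver are not given in the summary above. All names and values are hypothetical.

```python
def soft_threshold(x, t):
    """Shrink x toward zero by t; values within [-t, t] become exactly zero."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0

def fit_sparse_controls(basis, target, lam=0.05, iters=500):
    """Minimize 0.5 * ||basis @ w - target||^2 + lam * ||w||_1 with ISTA.

    basis: list of rows, one row per pose coordinate, one column per rig control.
    """
    n_controls = len(basis[0])
    # 1 / squared Frobenius norm never exceeds 1 / Lipschitz constant, so it is a safe step.
    step = 1.0 / sum(v * v for row in basis for v in row)
    w = [0.0] * n_controls
    for _ in range(iters):
        residual = [sum(b * x for b, x in zip(row, w)) - t
                    for row, t in zip(basis, target)]
        grad = [sum(row[j] * r for row, r in zip(basis, residual))
                for j in range(n_controls)]
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad)]
    return w

# Three controls; the third overlaps the first two, so it could partly explain the target.
basis = [[1.0, 0.0, 0.5],
         [0.0, 1.0, 0.5],
         [0.0, 0.0, 1.0]]
target = [1.0, 0.0, 0.0]  # pose reachable with the first control alone
weights = fit_sparse_controls(basis, target)
```

In this toy case the penalty leaves only the first control active, at a slightly shrunken value, and zeroes the overlapping one. That is the trade the entry describes: a small reconstruction error in exchange for fewer animated curves.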
Rigging
-
Production pipeline for Moana's complex curly hair with heavy character and environment interaction, introducing a new hair model with overhauled collision handling.
Abstract
This talk describes the hair simulation and technical animation pipeline developed at Walt Disney Animation Studios for the long curly hair of characters in Moana. A new elastic rod hair model based on Discrete Viscous Threads was implemented to capture the bending and twisting modes of curly hair, requiring a new curve-based data structure that embeds twist data and a base frame alongside positions. Hair volume was supported through dynamic hair-hair collision response using edge-edge repulsion springs with stiction and attenuation parameters rather than static connections, giving artists intuitive control over clumping and break-up. The team also developed a force-based grab node for complex hair interactions and a wind shadowing model supporting self and external shadowing for the film's outdoor environments.
Related Scriptable Character FX Solution · Hair Emoting with Style Guides in Turning Red · A Mass Spring Model for Hair Simulation · Art-Directing Asha's Braids in Disney's Wish
How to read this
- Category
- Production talk: a hair simulation pipeline breakdown
- Contributions
- Demonstrates an elastic rod hair model based on Discrete Viscous Threads for curly hair bending and twisting, with a curve data structure embedding twist and a base frame
- Shows dynamic hair-hair collision response via edge-edge repulsion springs with stiction and attenuation for artist-controlled clumping and break-up
- Presents a force-based grab node for complex interactions and a wind shadowing model with self and external shadowing
- Context
- Extends Disney's prior production hair work (e.g. Simulating Rapunzel's Hair in Disney's Tangled) to the longer, curlier, more interactive hair of Moana, swapping static connections for dynamic collision response. Builds on: Simulating Rapunzel's Hair in Disney's Tangled
- Correctness
- Studio practice, not peer-reviewed; the techniques are production-proven on the film but tuned for its specific look and shots, so parameters and trade-offs are art-directed rather than generally validated.
- Clarity
- Accessible breakdown aimed at practitioners; a single read conveys the pipeline and the rationale behind each component.
- How to read it
- Read once for the model choices and the artist-control trade-offs (stiction, attenuation, clumping); a second pass only if you are implementing Discrete Viscous Threads or the collision response.
CFX
-
Describes the grooming, rigging, and simulation pipeline for character-connected hair shared between conjoined twins in Trolls.
Abstract
This DigiPro talk presents the techniques used to create the seamlessly connected, looped hair for the conjoined Fashionista Twins, Satin and Chenille, in the film Trolls. The twins are treated as two instances of a single character asset, and grooming achieves a mirrored look by mirroring the deformable guide hairs and scalp about the vertical axis before relaxing them. A new render-time styler called MatchEnds, built into the in-house Willow hair geometry library, geometrically connects matched follicle curves across the two heads using equal pulling with falloff, Laplacian smoothing of shared CVs, and tangency matching. For motion, a modified Spline IK system drives joint chains spanning the combined hair length with cross-character constraints, and the shot pipeline outputs mirrored guides and render hairs in Satin's reference space to assemble the full renderable hair asset.
Related Hair Emoting with Style Guides in Turning Red · Hair and Fur in an Evolving Pipeline · Patch-based Surface Relaxation · A Biologically-Parameterized Feather Model
How to read this
- Category
- Production talk: a grooming, rigging, and simulation breakdown
- Contributions
- Demonstrates mirrored grooming by mirroring deformable guide hairs and scalp about the vertical axis before relaxing
- Shows a render-time MatchEnds styler that geometrically connects matched follicle curves across two heads via equal pulling with falloff, Laplacian smoothing, and tangency matching
- Presents a modified Spline IK system with cross-character constraints to drive the combined hair length, outputting mirrored guides in one twin's reference space
- Context
- Relates to film hair grooming and simulation pipelines, addressing the unusual requirement of seamlessly conjoined, looped hair shared between two characters treated as instances of a single asset.
- Correctness
- Studio practice, not peer-reviewed; production-proven for the Fashionista Twins but a bespoke solution to a specific conjoined-hair problem, so its techniques are special-purpose rather than broadly general.
- Clarity
- Accessible to hair and rigging practitioners; one read conveys the connection trick and the cross-character rig.
- How to read it
- Read once for the asset-sharing idea and the MatchEnds connection approach; revisit specific sections only if you face a similar shared or looped hair problem.
CFX
- VNect: Real-time 3D Human Pose Estimation with a Single RGB Camera at over 30fps SIGGRAPH Academic 870 cites
Real-time full-body 3D pose estimation from a single RGB camera at over 30fps enabling markerless mocap for interactive applications.
Abstract
We present the first real-time method to capture the full global 3D skeletal pose of a human in a stable, temporally consistent manner using a single RGB camera. Our method combines a new convolutional neural network (CNN) based pose regressor with kinematic skeleton fitting. Our novel fully-convolutional pose formulation regresses 2D and 3D joint positions jointly in real time and does not require tightly cropped input frames. A real-time kinematic skeleton fitting method uses the CNN output to yield temporally stable 3D global pose reconstructions on the basis of a coherent kinematic skeleton. This makes our approach the first monocular RGB method usable in real-time applications such as 3D character control; thus far, the only monocular methods for such applications employed specialized RGB-D cameras. Our method's accuracy is quantitatively on par with the best offline 3D monocular RGB pose estimation methods. Our results are qualitatively comparable to, and sometimes better than, results from monocular RGB-D approaches, such as the Kinect. However, we show that our approach is more broadly applicable than RGB-D solutions, i.e., it works for outdoor scenes, community videos, and low quality commodity RGB cameras.
Related Normalized Euclidean Distance Matrices for Human Motion Retargeting · Robust Marker Trajectory Repair for MOCAP Using Kinematic Reference · Dog Code: Human to Quadruped Embodiment Using Shared Codebooks · Motion Retargeting for Crowd Simulation
How to read this
- Category
- Method: real-time monocular 3D human pose estimation
- Contributions
- First real-time method to capture stable, temporally consistent full global 3D skeletal pose from a single RGB camera at over 30fps
- A fully-convolutional CNN pose regressor that jointly predicts 2D and 3D joint positions without tightly cropped input
- A real-time kinematic skeleton fitting step that yields temporally stable global pose suitable for 3D character control
- Context
- Sits in the lineage of CNN-based 2D and 3D human pose estimation combined with kinematic skeleton fitting, extending such methods to real-time monocular RGB use where prior interactive solutions relied on RGB-D cameras like Kinect.
- Correctness
- Reported as quantitatively on par with the best offline monocular RGB methods and qualitatively comparable to or better than RGB-D, with broader applicability (outdoor and community video); as a single-RGB approach it inherits depth and scale ambiguity, so absolute global accuracy should be read with that caveat.
- Clarity
- Clear system-level paper; a first pass conveys the two-stage architecture, a second pass is needed for the network and the fitting formulation.
- How to read it
- First pass for the regressor-plus-fitting split and the real-time claim; do a second pass on the CNN formulation and the kinematic fitting if implementing or comparing pose estimators.
Retargeting
2016
36
-
Convolutional autoencoder for human motion that enables synthesis and editing of character motion sequences in a compact latent space.
Abstract
We present a framework to synthesize character movements based on high level parameters, such that the produced movements respect the manifold of human motion, trained on a large motion capture dataset. The learned motion manifold, which is represented by the hidden units of a convolutional autoencoder, represents motion data in sparse components which can be combined to produce a wide range of complex movements. To map from high level parameters to the motion manifold, we stack a deep feedforward neural network on top of the trained autoencoder. This network is trained to produce realistic motion sequences from parameters such as a curve over the terrain that the character should follow, or a target location for punching and kicking. The feedforward control network and the motion manifold are trained independently, allowing the user to easily switch between feedforward networks according to the desired interface, without re-training the motion manifold. Once motion is generated it can be edited by performing optimization in the space of the motion manifold. This allows for imposing kinematic constraints, or transforming the style of the motion, while ensuring the edited motion remains natural. As a result, the system can produce smooth, high quality motion sequences without any manual pre-processing of the training data.
Related Phase-Functioned Neural Networks for Character Control · Learning Motion Manifolds with Convolutional Autoencoders · Character Controllers Using Motion VAEs · Neural State Machine for Character-Scene Interactions
How to read this
- Category
- Method: a deep-learning framework for motion synthesis and editing
- Contributions
- Learns a motion manifold via a convolutional autoencoder trained on a large motion-capture dataset, representing motion as sparse combinable components
- Stacks a feedforward network on top of the manifold to map high-level parameters (terrain curves, target locations for punching/kicking) to realistic motion
- Enables editing by optimizing in the manifold space, imposing kinematic constraints or transforming style while keeping motion plausible
- Context
- Extends the authors' earlier convolutional-autoencoder motion manifold work (Holden et al., 2015) into a full synthesis-and-editing framework with high-level control. Builds on: Learning Motion Manifolds with Convolutional Autoencoders
- Correctness
- Assumes the learned manifold captures the space of natural human motion from the training corpus, so synthesized and edited motion stays plausible only within that distribution; out-of-distribution parameters or styles may degrade results.
- Clarity
- Reasonably accessible; a first pass conveys the manifold-plus-control idea, a second pass clarifies the autoencoder architecture and the editing optimization.
- How to read it
- Focus first on the separation of motion manifold from control network (trained independently); a second pass is worth it to understand the convolutional autoencoder and the manifold-space optimization for editing.
Motion Synthesis
- An Anatomically Constrained Local Deformation Model for Monocular Face Capture SIGGRAPH Disney Research 107 cites
Anatomically constrained local deformation model for monocular face capture that prevents physically implausible face reconstructions.
Abstract
We present a new anatomically-constrained local face model and fitting approach for tracking 3D faces from 2D motion data in very high quality. In contrast to traditional global face models, often built from a large set of blendshapes, we propose a local deformation model composed of many small subspaces spatially distributed over the face. Our local model offers far more flexibility and expressiveness than global blendshape models, even with a much smaller model size. This flexibility would typically come at the cost of reduced robustness, in particular during the under-constrained task of monocular reconstruction. However, a key contribution of this work is that we consider the face anatomy and introduce subspace skin thickness constraints into our model, which constrain the face to only valid expressions and help counteract depth ambiguities in monocular tracking. Given our new model, we present a novel fitting optimization that allows 3D facial performance reconstruction from a single view at extremely high quality, far beyond previous fitting approaches. Our model is flexible, and can be applied also when only sparse motion data is available, for example with marker-based motion capture or even face posing from artistic sketches.
Related High Resolution Passive Facial Performance Capture · High Fidelity Facial Animation Capture and Retargeting with Contours · Rigid Stabilization of Facial Expressions · Driving High-Resolution Facial Scans with Video Performance Capture
How to read this
- Category
- Method: an anatomically constrained local face model for monocular capture
- Contributions
- Proposes a local face deformation model made of many small spatially distributed subspaces rather than a single global blendshape basis, giving more flexibility at smaller model size
- Introduces subspace skin-thickness (anatomy) constraints that restrict the face to valid expressions and counteract monocular depth ambiguity
- Presents a fitting optimization for high-quality single-view 3D facial performance reconstruction, applicable also to sparse marker-based motion data
- Context
- Builds on production facial-capture systems such as Medusa (Beeler et al., 2012), trading global blendshape models for a local, anatomy-constrained alternative. Builds on: Medusa: A Production-Ready Photoreal Facial Performance Capture System
- Correctness
- The key assumption is that skin-thickness anatomical priors keep the flexible local model from overfitting implausible shapes under the under-constrained monocular setting; demonstrated on high-quality tracking, but robustness still hinges on the validity of those anatomical constraints and 2D motion data quality.
- Clarity
- Moderately technical; a first pass conveys the local-subspace-plus-anatomy idea, but the fitting optimization needs a second pass.
- How to read it
- Read first for why local subspaces plus skin-thickness constraints beat global blendshapes for monocular capture; do a second pass on the fitting optimization if you intend to implement or compare against it.
Facial
-
Ubisoft Toronto presented motion matching as a practical replacement for manually crafted animation trees, demonstrating high-quality fluid locomotion from dense mocap databases.
Motion Synthesis
- talk Animation Bootcamp: The Animate Button: Mocap Automation Techniques at Ubisoft Montreal GDC Industrial
Presents Ubisoft Montreal's scripted mocap automation toolset that reduces manual cleanup and retargeting labour across large-scale game productions with hundreds of characters.
Retargeting
-
Transfers skeleton structure and skinning weights between characters with different mesh topologies without manual re-rigging.
Abstract
We present a general method for transferring skeletons and skinning weights between characters with distinct mesh topologies. Our pipeline takes as inputs a source character rig (consisting of a mesh, a transformation hierarchy of joints, and skinning weights) and a target character mesh. From these inputs, we compute joint locations and orientations that embed the source skeleton in the target mesh, as well as skinning weights to bind the target geometry to the new skeleton. Our method consists of two key steps. We first compute the geometric correspondence between source and target meshes using a semi‐automatic method relying on a set of markers. The resulting geometric correspondence is then used to formulate attribute transfer as an energy minimization and filtering problem. We demonstrate our approach on a variety of source and target bipedal characters, varying in mesh topology and morphology. Several examples demonstrate that the target characters behave well when animated with either forward or inverse kinematics. Via these examples, we show that our method preserves subtle artistic variations; spatial relationships between geometry and joints, as well as skinning weight details, are accurately maintained. Our proposed pipeline opens up many exciting possibilities to quickly animate novel characters by reusing existing production assets.
Related SkinMixer: Blending 3D Animated Models · Sculpt Processing for Character Rigging · A Statistical Model of Human Pose and Body Shape · Automatic Rigging and Animation of 3D Characters
How to read this
- Category
- Method: skeleton and skinning-weight transfer between characters
- Contributions
- Transfers a source rig (mesh, joint hierarchy, skinning weights) onto a target mesh with distinct topology, computing embedded joint locations/orientations and new skinning weights
- Establishes source-to-target geometric correspondence via a semi-automatic marker-based method
- Formulates attribute transfer as an energy-minimization and filtering problem, validated on varied bipedal characters under both forward and inverse kinematics
- Context
- Continues the automatic rigging lineage of Baran and Popovic's Pinocchio (2007), focusing on reusing an existing rig rather than building one from scratch. Builds on: Automatic Rigging and Animation of 3D Characters
- Correctness
- Relies on a few manually placed markers to seed correspondence and is demonstrated on bipedal characters varying in topology and morphology; a reader should note results are shown for bipeds and that quality depends on marker placement and source-target morphological similarity.
- Clarity
- Accessible; a first pass conveys the two-step (correspondence then transfer) pipeline, a second pass covers the energy formulation.
- How to read it
- First pass for the correspondence-then-transfer structure and what inputs it needs; second pass on the energy-minimization step if you want to reproduce the weight and joint transfer.
Rigging / Skinning / Retargeting
-
Art-directable muscle simulation system for face and body allowing artists to achieve specific looks while respecting physics constraints.
Muscles / Facial
-
First fully automatic single-image hair modeling method using a hierarchical deep network for segmentation and direction estimation, building a 50K-model hairstyle database.
Abstract
We introduce AutoHair, the first fully automatic method for 3D hair modeling from a single portrait image, with no user interaction or parameter tuning. Our method efficiently generates complete and high-quality hair geometries, which are comparable to those generated by the state-of-the-art methods, where user interaction is required. The core components of our method are: a novel hierarchical deep neural network for automatic hair segmentation and hair growth direction estimation, trained over an annotated hair image database; and an efficient and automatic data-driven hair matching and modeling algorithm, based on a large set of 3D hair exemplars. We demonstrate the efficacy and robustness of our method on Internet photos, resulting in a database of around 50K 3D hair models and a corresponding hairstyle space that covers a wide variety of real-world hairstyles. We also show novel applications enabled by our method, including 3D hairstyle space navigation and hair-aware image retrieval.
Related HairNet: Single-View Hair Reconstruction Using Convolutional Neural Networks · NeuralHDHair: Automatic High-Fidelity Hair Modeling from a Single Image Using Implicit Neural Representations · Single-View Hair Modeling Using a Hairstyle Database · Data-Driven Estimation of Cloth Simulation Models
How to read this
- Category
- Method: fully automatic single-image 3D hair modeling
- Contributions
- Presents the first fully automatic single-image 3D hair modeling method, requiring no user interaction or parameter tuning
- Introduces a hierarchical deep network for automatic hair segmentation and growth-direction estimation trained on an annotated hair-image database
- Uses a data-driven matching and modeling algorithm over 3D hair exemplars, producing a ~50K-model database and a navigable hairstyle space, plus hair-aware image retrieval
- Context
- Builds on database-driven single-view hair modeling such as Hu et al. (2015), removing the user interaction those methods required by adding learned segmentation and direction estimation. Builds on: Single-View Hair Modeling Using a Hairstyle Database
- Correctness
- Quality depends on the annotated training database and the coverage of the 3D hair-exemplar set; demonstrated on Internet photos with results stated as comparable to interactive state-of-the-art, so exotic or heavily occluded hairstyles outside the exemplar space are a likely limitation.
- Clarity
- Accessible; a first pass conveys the automatic pipeline, a second pass clarifies the network and matching algorithm.
- How to read it
- Focus first on how the deep segmentation/direction step feeds the exemplar matching; a second pass pays off for the network design and the hairstyle-space construction if hair modeling is your area.
CFX / ML Deformation
- Balanced Shape Space Construction with Applications to Planar Mechanical Systems SCA Disney Research 6 cites
Shape space construction method for planar mechanical design, enabling balanced motion planning for articulated characters and mechanisms.
Rigging
-
Treats blendshapes as force bases rather than shape bases, enabling dynamic and physically plausible facial animation and retargeting.
Abstract
In this paper we present a new paradigm for the generation and retargeting of facial animation. Like a vast majority of the approaches that have addressed these topics, our formalism is built on blendshapes. However, where prior works have generally encoded facial geometry using a low dimensional basis of these blendshapes, we propose to encode facial dynamics by looking at blendshapes as a basis of forces rather than a basis of shapes. We develop this idea into a dynamic model that naturally combines the blendshapes paradigm with physics‐based techniques for the simulation of deforming meshes. Because it escapes the linear span of the shape basis through time‐integration and physics‐inspired simulation, this approach has a wider expressive range than previous blendshape‐based methods. Its inherent physically‐based formulation also enables the simulation of more advanced physical interactions, such as collision responses on lip contacts.
Related Realtime Performance-Driven Physical Simulation for Facial Animation · Direct Manipulation Blendshapes · Phace: Physics-based Face Modeling and Animation · SoftDECA: Computationally Efficient Physics-Based Facial Animations
How to read this
- Category
- Method: a physics-based dynamic framework for facial animation
- Contributions
-
- Reframes blendshapes as a basis of forces rather than a basis of shapes, encoding facial dynamics instead of static geometry
- Develops a dynamic model combining the blendshape paradigm with physics-based deforming-mesh simulation and time integration
- Achieves a wider expressive range than linear blendshape methods and enables physical interactions such as lip-contact collision responses
- Context
- Departs from the standard linear blendshape control of Lewis and Anjyo's direct-manipulation blendshapes (lewis-direct-manipulation-blendshapes-2010) by recasting the basis within a physics simulation. Builds on: Direct Manipulation Blendshapes
- Correctness
- The central assumption is that treating blendshapes as forces driving a simulated mesh yields more plausible dynamics than linear shape blending; the gains rest on the physics model and integration, and a reader should weigh added simulation cost against the expressiveness benefit.
- Clarity
- Conceptually elegant but formally dense; a first pass conveys the forces-not-shapes reframing, the simulation math needs a careful second pass.
- How to read it
- Read first to grasp the forces-as-basis idea and why it escapes the linear span; a second (and likely third) pass is worth it for the dynamic model and time-integration formulation.
Facial
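The forces-not-shapes reframing in the entry above can be illustrated with a toy simulation. This is a minimal sketch under loud assumptions, not the paper's model: each coordinate is treated as an independent damped spring anchored at the neutral shape, the blendshape deltas are scaled by the spring stiffness to act as a driving force, and the function name and all parameters (`stiffness`, `damping`, `mass`, `dt`) are hypothetical.

```python
def simulate_force_blendshapes(neutral, deltas, weights, steps=2000, dt=0.01,
                               stiffness=50.0, damping=8.0, mass=1.0):
    """Integrate a toy mass-spring face where blendshapes act as forces.

    neutral: flat list of rest coordinates.
    deltas:  one flat list of offsets per blendshape (same length as neutral).
    weights: one activation per blendshape.
    All physical parameters are illustrative assumptions.
    """
    n = len(neutral)
    x = list(neutral)          # current positions
    v = [0.0] * n              # current velocities
    # Blendshape activations become a constant driving force per coordinate.
    drive = [stiffness * sum(w * d[i] for w, d in zip(weights, deltas))
             for i in range(n)]
    for _ in range(steps):
        for i in range(n):
            # Driving force, restoring spring toward neutral, and damping.
            f = drive[i] - stiffness * (x[i] - neutral[i]) - damping * v[i]
            # Semi-implicit Euler: update velocity first, then position.
            v[i] += dt * f / mass
            x[i] += dt * v[i]
    return x
```

With constant weights this toy settles onto the ordinary linear blend, so the difference from shape blending lies entirely in the transient: the mesh overshoots and rings before it comes to rest, which is the kind of secondary motion a static shape basis cannot express.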
-
, , ,
System for constructing personalized volumetric facial rigs from scans, combining blendshapes with physics simulation for expressive animation.
abstract ▾ abstract ▴
This paper presents a facial animation system that builds user-specific volumetric face rigs combining the direct control of blendshapes with the physical realism of simulation. From 3D scans of an actor in a neutral pose and around ten facial expressions, an average-human volumetric head template (skull, jaw, flesh tet-mesh, and skin) is registered and deformed using a Projective Dynamics optimization framework with priors such as flesh thickness measurements and a PCA face model. A volumetric extension of Example-Based Facial Rigging produces volumetric blendshapes that are driven by the same weights as traditional blendshapes but blend deformation gradients across all tetrahedra. The resulting rig delivers physics-based effects including inertia and secondary motion, volume preservation, and collision response between lips, teeth, and external objects, and the simulated model is also used to bake quadratic corrective blendshapes for fast game-engine use.
Related SoftDECA: Computationally Efficient Physics-Based Facial Animations · Computational Bodybuilding: Anatomically-Based Modeling of Human Bodies · Art-Directed Muscle Simulation for High-End Facial Animation · Realtime Performance-Driven Physical Simulation for Facial Animation
how to read this ▾ how to read this ▴
- Category
- Method: user-specific volumetric face rigs combining blendshapes and simulation
- Contributions
-
- Builds personalized volumetric face rigs (skull, jaw, flesh tet-mesh, skin) from a neutral scan plus ~ten expression scans, registered with a Projective Dynamics framework and anatomical/PCA priors
- Introduces a volumetric extension of example-based facial rigging that blends deformation gradients across tetrahedra under the same weights as traditional blendshapes
- Delivers physics effects (inertia, secondary motion, volume preservation, lip/teeth/object collisions) and bakes quadratic corrective blendshapes for fast game-engine use
- Context
- Builds on the authors' avatar-from-video work (ichim-dynamic-avatar-2015), extending example-based rigging into a volumetric, physically simulated representation. Builds on: Dynamic 3D Avatar Creation from Hand-Held Video Input
- Correctness
- Assumes an average-human volumetric template can be registered to an actor from a neutral plus ~ten expressions with flesh-thickness and PCA priors; the physical realism depends on those priors and template fit, and full simulation is heavier than the baked corrective approximation it provides for runtime.
- Clarity
- Technical and pipeline-heavy; a first pass conveys the volumetric-rig concept, a second pass is needed for the Projective Dynamics fitting and the volumetric blendshape construction.
- How to read it
- First pass for the rig construction inputs and the simulation-versus-baked-correctives trade; second pass on the Projective Dynamics registration and deformation-gradient blending for implementation detail.
Facial / Rigging / Muscles
- CAMA: Contact-Aware Matrix Assembly with Unified Collision Handling for GPU-based Cloth Simulation CGF Academic 70 cites
, , , ,
GPU cloth pipeline that integrates contact forces into implicit time integration via parallelized sparse matrix assembly for 100-300K triangle garments.
abstract ▾ abstract ▴
We present a novel GPU‐based approach to robustly and efficiently simulate high‐resolution and complexly layered cloth. The key component of our formulation is a parallelized matrix assembly algorithm that can quickly build a large and sparse matrix in a compressed format and accurately solve linear systems on GPUs. We also present a fast and integrated solution for parallel collision handling, including collision detection and response computations, which utilizes spatio‐temporal coherence. We combine these algorithms as part of a new cloth simulation pipeline that incorporates contact forces into implicit time integration for collision avoidance. The entire pipeline is implemented on GPUs, and we evaluate its performance on complex benchmarks consisting of 100-300K triangles. In practice, our system takes a few seconds to simulate one frame of a complex cloth scene, which represents significant speedups over prior CPU and GPU‐based cloth simulation systems.
Related Robust Treatment of Collisions, Contact and Friction for a Skinned Cloth Simulation · A Safe and Fast Repulsion Method for GPU-based Cloth Self Collisions · Better Collisions and Faster Cloth for Pixar's Coco · Smoothed Aggregation Multigrid for Cloth Simulation
how to read this ▾ how to read this ▴
- Category
- Method: a GPU cloth-simulation pipeline with unified collision handling
- Contributions
-
- Introduces a parallelized contact-aware matrix assembly that builds a large sparse matrix in compressed format and solves the linear system on the GPU
- Presents an integrated parallel collision-handling solution (detection plus response) exploiting spatio-temporal coherence
- Combines these into a pipeline that folds contact forces into implicit time integration for collision avoidance, evaluated on 100-300K-triangle layered cloth
- Context
- Advances robust simultaneous-collision treatment in cloth (e.g. Harmon et al., harmon-simultaneous-collisions-2008) by mapping contact-aware implicit integration onto the GPU. Builds on: Robust Treatment of Simultaneous Collisions
- Correctness
- The contributions target throughput on high-resolution layered garments and the paper reports per-frame times of a few seconds with speedups over prior CPU/GPU systems; robustness claims rest on the unified collision handling, so results should be read as performance-and-robustness on the stated benchmark scale rather than guaranteed for arbitrary scenes.
- Clarity
- Systems-and-numerics heavy; a first pass conveys the pipeline and where the speedups come from, a second pass is required for the matrix-assembly and collision details.
- How to read it
- Read first for the pipeline structure and the contact-aware-integration idea; a second and third pass pay off for the GPU sparse-assembly and parallel collision algorithms if you build cloth solvers.
CFX
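The matrix-assembly step in the entry above can be pictured with a small sequential sketch. The abstract only says "compressed format", so compressed sparse row (CSR) is an assumption here, and the paper's GPU parallelization is not reproduced. The function names are hypothetical. The sketch shows how elastic and contact contributions that land on the same matrix entry are summed during assembly.

```python
def assemble_csr(n_rows, triplets):
    """Build CSR arrays from (row, col, value) triplets, summing duplicates.

    Sequential illustration only; a GPU version would parallelize this step.
    """
    rows = [dict() for _ in range(n_rows)]
    for r, c, v in triplets:
        # Contributions to the same entry (e.g. elastic + contact) accumulate.
        rows[r][c] = rows[r].get(c, 0.0) + v
    indptr, indices, data = [0], [], []
    for row in rows:
        for c in sorted(row):
            indices.append(c)
            data.append(row[c])
        indptr.append(len(indices))   # end offset of this row
    return indptr, indices, data


def csr_matvec(indptr, indices, data, x):
    """Multiply a CSR matrix by a dense vector."""
    return [sum(data[k] * x[indices[k]] for k in range(indptr[i], indptr[i + 1]))
            for i in range(len(indptr) - 1)]
```

For example, a spring of stiffness 2 between two particles contributes the triplets `(0,0,2), (0,1,-2), (1,0,-2), (1,1,2)`, and a contact penalty of 5 on particle 1 adds `(1,1,5)`; after assembly the diagonal entry for particle 1 holds their sum.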
-
, ,
Pipeline for building a personalized facial rig directly from performance-capture data, fitting blendshape bases to actor-specific motion.
abstract ▾ abstract ▴
Creating a high quality blendshape rig usually involves a large amount of effort from skilled artists. Although current 3D reconstruction technologies are able to capture accurate facial geometry of the actor, it is still very difficult to build a production-ready blendshape rig from unorganized scans. Removing rigid head motion and separating mixed expressions from the captures are two of the major challenges in this process. We present a technique that creates a facial blendshape rig based on performance capture and a generic face rig. The customized rig accurately captures actor-specific face details while producing a semantically meaningful FACS basis. The resulting rig faithfully serves both artist friendly keyframe animation and high quality facial motion retargeting in production.
Related Facial Retargeting with Automatic Range of Motion Alignment · RigAnyFace: Scaling Neural Facial Mesh Auto-Rigging with Unlabeled Data · Example-Based Facial Rigging · Realtime Facial Animation with On-the-fly Correctives
how to read this ▾ how to read this ▴
- Category
- Method: building an actor-specific facial rig from performance capture
- Contributions
-
- Creates a customized facial blendshape rig from performance capture plus a generic face rig, capturing actor-specific detail while producing a semantically meaningful FACS basis
- Addresses the two main challenges of removing rigid head motion and separating mixed expressions from unorganized scans
- Produces a rig that serves both artist-friendly keyframe animation and high-quality facial motion retargeting in production
- Context
- Builds on the authors' artist-friendly facial retargeting work (seol-facial-retargeting-2011), turning captured performance into a production-ready, FACS-aligned personalized rig. Builds on: Artist Friendly Facial Animation Retargeting
- Correctness
- Assumes a generic face rig can be specialized to the actor once rigid motion is removed and mixed expressions are separated; presented as a production pipeline (DigiPro), so the reader should view it as a practitioner-validated workflow rather than a broadly benchmarked method.
- Clarity
- Accessible and practitioner-oriented; a first pass conveys the pipeline, a second pass clarifies the motion-separation and basis-fitting steps.
- How to read it
- First pass for the workflow and the two challenges it solves; a second pass is worth it on the rigid-motion removal and expression-separation if you build production facial rigs.
Facial / Rigging
-
Covered MPC techniques for photorealistic digital humans including skin shader behaviour, facial animation, and underlying muscle simulation.
Facial / Muscles / Skinning
-
,
Low-rank helper bone framework efficiently captures pose-dependent skinning deformations by learning auxiliary joint transforms from examples.
abstract ▾ abstract ▴
Dynamic skin deformation is vital for creating life-like characters, and its real-time computation is in great demand in interactive applications. We propose a practical method to synthesize plausible and dynamic skin deformation based on a helper bone rig. This method builds helper bone controllers for the deformations caused not only by skeleton poses but also secondary dynamics effects. We introduce a state-space model for a discrete time linear time-invariant system that efficiently maps the skeleton motion to the dynamic movement of the helper bones. Optimal transfer of nonlinear, complicated deformations, including the effect of soft-tissue dynamics, is obtained by learning the training sequence consisting of skeleton motions and corresponding skin deformations. Our approximation method for a dynamics model is highly accurate and efficient owing to its low-rank property obtained by a sparsity-oriented nuclear norm optimization. The resulting linear model is simple enough to easily implement in the existing workflows and graphics pipelines. We demonstrate the superior performance of our method compared to conventional dynamic skinning in terms of computational efficiency including LOD controls, stability in interactive controls, and flexible expression in deformations.
Related Delta Mush: Smoothing Deformations While Preserving Detail · NeuroSkinning: Automatic Skin Binding for Production Characters with Deep Graph Networks · Data-Driven Physics for Human Soft Tissue Animation · A Statistical Model of Human Pose and Body Shape
how to read this ▾ how to read this ▴
- Category
- Method: dynamic skinning via low-rank helper-bone controllers
- Contributions
-
- Builds helper-bone controllers that reproduce skin deformation caused by both skeleton poses and secondary soft-tissue dynamics
- Introduces a state-space linear time-invariant model mapping skeleton motion to dynamic helper-bone movement, learned from training sequences of motions and deformations
- Obtains an efficient, accurate approximation through a low-rank, sparsity-oriented nuclear-norm optimization that drops into existing skinning workflows
- Context
- Extends pose-dependent deformation in the spirit of Lewis et al.'s Pose Space Deformation (lewis-psd-2000) by adding learned, dynamic helper bones with a low-rank state-space model. Builds on: Pose Space Deformation: A Unified Approach to Shape Interpolation and Skeleton-Driven Deformation
- Correctness
- Assumes complex nonlinear soft-tissue dynamics can be well approximated by a low-rank linear time-invariant helper-bone system learned from examples; accuracy and efficiency are reported as superior to conventional dynamic skinning, but the linear approximation may limit fidelity for highly nonlinear deformations outside the training data.
- Clarity
- Moderately technical; a first pass conveys the helper-bone-plus-state-space idea, a second pass covers the nuclear-norm optimization.
- How to read it
- Read first for why helper bones in a linear state-space model give real-time dynamic skinning; a second pass on the low-rank/nuclear-norm optimization if you plan to implement or extend it.
Skinning
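The state-space model named in the entry above has a simple runtime form, sketched below. The sketch assumes the standard discrete linear time-invariant update x[t+1] = A x[t] + B u[t] with output y[t] = C x[t], where u would be skeleton motion and y helper-bone parameters. The matrices in the paper are learned with a nuclear-norm optimization; any values passed to this sketch are made-up placeholders, and the function names are hypothetical.

```python
def matvec(M, v):
    """Dense matrix-vector product on nested lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def run_lti(A, B, C, inputs, x0):
    """Step a discrete LTI system and collect one output per input sample.

    A: state transition, B: input map, C: output map (all nested lists).
    inputs: sequence of input vectors u[t]; x0: initial hidden state.
    """
    x = list(x0)
    outputs = []
    for u in inputs:
        outputs.append(matvec(C, x))              # y[t] = C x[t]
        ax, bu = matvec(A, x), matvec(B, u)
        x = [a + b for a, b in zip(ax, bu)]       # x[t+1] = A x[t] + B u[t]
    return outputs
```

Because the hidden state carries history from frame to frame, a single input impulse keeps producing output on later frames, which is how a linear model of this kind can reproduce lagging, jiggling secondary motion at the cost of a few small matrix products per frame.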
-
, , , ,
Real-time face reenactment system transferring facial expressions from a source actor to a target in monocular video using dense tracking.
abstract ▾ abstract ▴
Face2Face is a method for real-time facial reenactment that transfers the expressions of a source actor, captured live with a commodity webcam, onto a monocular target video such as a YouTube clip, and re-renders the result photo-realistically. Actor identities are recovered from monocular video using a dense global non-rigid model-based bundling approach, while expressions of both source and target are tracked at runtime with a dense photometric analysis-by-synthesis formulation over a multilinear PCA face prior. Expressions are transferred via a sub-space deformation transfer technique that operates directly in the blendshape expression space, and a realistic mouth interior is synthesized by retrieving and warping the best matching mouth frame from the target sequence rather than copying the source mouth or using a generic teeth proxy. The optimization runs in real time on the GPU using a data-parallel iteratively reweighted least squares solver.
Related Reconstruction of Personalized 3D Face Rigs from Monocular Video · Example-Based Facial Rigging · 3D Morphable Face Models: Past, Present and Future · Neural Head Avatars from Monocular RGB Videos
how to read this ▾ how to read this ▴
- Category
- Method: real-time monocular face reenactment system
- Contributions
-
- Real-time expression transfer from a webcam source actor onto a monocular target video, re-rendered photo-realistically
- Dense analysis-by-synthesis tracking over a multilinear PCA face prior, with non-rigid model-based bundling to recover identity from monocular video
- Subspace deformation transfer in blendshape space plus a data-driven mouth-interior synthesis that retrieves and warps the best matching target mouth frame
- Context
- Builds on parametric face modeling in the lineage of Blanz and Vetter's morphable model, extending it to live monocular tracking and reenactment. Builds on: A Morphable Model for the Synthesis of 3D Faces
- Correctness
- Demonstrated on commodity webcam input and existing monocular videos; the realism leans on the multilinear prior and the mouth-retrieval step, so readers should keep in mind it assumes a recoverable target mouth appearance and a face that fits the learned subspace.
- Clarity
- Accessible at a high level; a first pass conveys the pipeline and the reenactment idea, but a second pass is needed for the analysis-by-synthesis optimization and the IRLS solver.
- How to read it
- First pass for the four-stage pipeline (tracking, bundling, transfer, mouth synthesis); do a second pass on the photometric energy formulation and GPU IRLS solver if you care about the real-time tracking math.
Facial
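The iteratively reweighted least squares (IRLS) solver mentioned in the entry above is easiest to see on a scalar toy problem. This sketch is not the Face2Face photometric energy or its GPU solver: it only shows the reweighting loop, assuming an L1-style weight of 1/|residual| and a hypothetical function name. Each iteration solves a weighted least-squares problem whose weights come from the previous residuals, so outliers lose influence.

```python
def irls_location(samples, iterations=100, eps=1e-8):
    """Robustly estimate a single value from noisy samples with IRLS.

    Starts from the ordinary mean, then repeatedly re-solves a weighted
    least-squares problem with weights 1 / max(|residual|, eps).
    """
    mu = sum(samples) / len(samples)          # plain least-squares start
    for _ in range(iterations):
        weights = [1.0 / max(abs(s - mu), eps) for s in samples]
        # Closed-form weighted least-squares solution for one unknown.
        mu = sum(w * s for w, s in zip(weights, samples)) / sum(weights)
    return mu
```

On samples clustered near 1 with one gross outlier at 10, the plain mean is dragged toward the outlier while the reweighted estimate stays with the cluster.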
-
,
Workshop showing DI4D surface capture pipelines for Remedy's Quantum Break and Blur Studio's Agent Six virtual-Angela-Bassett character.
Facial / Retargeting
-
, , , , ,
Describes the simulation rig for Finding Dory's septapus Hank, covering flesh simulation, sucker dynamics, and mantle behavior built on a complex animation rig.
abstract ▾ abstract ▴
Bringing the complex octopus character Hank to screen required specialized meshes for body, suckers, and mantle along with volumetric simulation for flesh and skin dynamics. The team developed a new modular meshing pipeline enabling independent refinement and conforming operations to handle high curvature and complex features. Sucker behavior combined art-directed shapes with physics-based simulation for stiction and mutual dynamics between tentacle flesh and individual suckers. Novel pre-roll techniques procedurally unfolded the character's convoluted rigging state to initialize simulation cleanly.
Related Dynamic Deformables: Implementation and Production Practicalities · Clean Cloth Inputs: Removing Character Self-Intersections with Volume Simulation · Smeat: ADMM Based Tools for Character Deformation · A Deep Emulator for Secondary Motion of 3D Characters
how to read this ▾ how to read this ▴
- Category
- Production talk / character simulation breakdown
- Contributions
-
- A modular meshing pipeline for body, suckers, and mantle allowing independent refinement and conforming operations on high-curvature features
- Combined art-directed and physics-based sucker behavior, including stiction and mutual dynamics between tentacle flesh and individual suckers
- Procedural pre-roll techniques that unfold the character's convoluted rigging state to cleanly initialize simulation
- Context
- Relates to skinned cloth and contact simulation in the lineage of Bridson et al.'s collision and friction treatment, applied to a soft-bodied octopus character (Finding Dory's Hank). Builds on: Robust Treatment of Collisions, Contact and Friction for a Skinned Cloth Simulation
- Correctness
- Studio practice, not peer-reviewed; results are production-proven on a single complex character, so techniques are tuned to that creature rather than presented as a general benchmark.
- Clarity
- Accessible as a war-story breakdown; one pass conveys the problems and the chosen solutions without heavy formalism.
- How to read it
- Read once for the meshing, sucker-dynamics, and pre-roll ideas; treat it as practical inspiration for soft-creature rigs rather than a reproducible algorithm.
CFX / Rigging
-
,
Ubisoft replaced reactive foot IK with a predictive biomechanical model for Assassin's Creed, producing more fluid and accurate foot placement that preserves source animation detail.
Rigging / Skinning
-
, , , , , , , ,
Character soft-tissue simulation system for Zootopia scaling from hero to crowd characters, capturing subtle animal flesh and fur dynamics with artistic controls.
abstract ▾ abstract ▴
This SIGGRAPH talk describes the character flesh simulation techniques developed for Disney's Zootopia to give anthropomorphic mammal characters a sense of weight, jiggle, volume preservation, and self-collision while still allowing 2D-animator art direction. The studio productized the grid-based PhysGrid flesh simulator, which embeds a surface mesh in a volumetric grid rather than requiring tetrahedral meshing, and added SIMD vectorization, generalized-alpha integration, and triangle-mesh methods replacing level sets. A deforming body bone bound to the animated surface mesh controls how closely the simulation follows animation, and a delta-based process combines simulation with skinning and pose space deformation for sensitive areas such as facial cheeks. For a savage jaguar, anatomically plausible bones, muscles, a fascia cloth layer, and a dermal fat layer were simulated to reveal creature-style anatomy with the same toolset.
Related How to Build a Human: Practical Physics-Based Character Animation · Muscle and Fascia Simulation with Extended Position Based Dynamics · Efficient Elasticity for Character Skinning with Contact and Collisions · Fast Corotated FEM using Operator Splitting
how to read this ▾ how to read this ▴
- Category
- Production talk / character flesh-simulation system
- Contributions
-
- Productized the grid-based PhysGrid flesh simulator that embeds a surface mesh in a volumetric grid, avoiding tetrahedral meshing
- Added SIMD vectorization, generalized-alpha integration, and triangle-mesh methods replacing level sets, with a deforming body bone to control how closely simulation follows animation
- A delta-based process combining simulation with skinning and pose-space deformation, and an anatomical layered setup (bones, muscles, fascia cloth, dermal fat) for a savage jaguar
- Context
- Relates to anatomical and musculoskeletal simulation in the lineage of Teran et al.'s skeletal muscle work, adapted for stylized anthropomorphic mammals scaling from hero to crowd (Disney's Zootopia). Builds on: Creating and Simulating Skeletal Muscle from the Visible Human Data Set
- Correctness
- Studio practice, not peer-reviewed; results are production-proven across hero and crowd characters, with the explicit constraint that simulation must preserve 2D-animator art direction rather than chase pure physical accuracy.
- Clarity
- Accessible as a systems talk; one pass conveys the architecture and the art-direction trade-offs, with details kept at the production-overview level.
- How to read it
- Read once for the grid-based simulation choice and the simulation-vs-animation control scheme; revisit the delta/PSD blending section if you are building a similar art-directable flesh pipeline.
Muscles / CFX / Skinning
-
, , , ,
Production overview of physics-based character animation combining FEM muscles, skin, and collision for anatomically driven deformation.
abstract ▾ abstract ▴
This work presents a physics-based character animation workflow for generating realistic anatomical motion of muscles, fat, and skin using Finite Element Method simulation. The approach uses MRI scans for anatomical accuracy, assigns material properties from medical literature, and employs muscle fiber fields to enable contraction, producing physically plausible deformation without manual sculpting.
Related A Neural Network Model for Efficient Musculoskeletal-Driven Skin Deformation · EMU: Efficient Muscle Simulation in Deformation Space · Data-driven Modeling of Skin and Muscle Deformation · Efficient and Robust Skin Slide Simulation
how to read this ▾ how to read this ▴
- Category
- Production overview: physics-based anatomical character animation
- Contributions
-
- An FEM-based workflow simulating muscles, fat, and skin to produce realistic anatomical motion
- Use of MRI scans for anatomical accuracy and material properties drawn from medical literature
- Muscle fiber fields enabling contraction-driven, physically plausible deformation without manual sculpting
- Context
- Builds on physically based muscle and soft-tissue simulation in the lineage of Teran et al.'s skeletal muscle work, framed as a practical production pipeline. Builds on: Creating and Simulating Skeletal Muscle from the Visible Human Data Set
- Correctness
- Presented as a production workflow; correctness rests on MRI-derived anatomy and literature material parameters, so readers should keep in mind that fidelity depends on scan quality and on how well published material values transfer to the character.
- Clarity
- Accessible at the overview level; one pass conveys the pipeline, while FEM and fiber-field specifics are kept practical rather than fully derived.
- How to read it
- Read once for the anatomy-to-simulation pipeline (MRI to materials to fiber-driven contraction); go to the cited FEM-muscle literature for the underlying formulation.
Muscles / Skinning
-
GDC talk introducing the IK Rig architecture for procedural pose correction and layered animation, foundational to Unreal Engine's IK framework.
Rigging
-
, ,
Uses formal grammar rewriting rules to synthesize structured motion sequences obeying behavioral rules for multiple characters.
abstract ▾ abstract ▴
The behavioral structure of human movements is imposed by multiple sources, such as rules, regulations, choreography, habits, and emotion. Our goal is to identify the behavioral structure in a specific application domain and create a novel sequence of movements that abide by structure‐building rules. To do so, we exploit ideas from formal languages, such as rewriting rules and grammar parsing, and adapt those ideas to synthesize the three‐dimensional animation of multiple characters. The structured motion synthesis using motion grammars is formulated in two layers. The upper layer is a symbolic description that relates the semantics of each individual's movements and the interaction among them. The lower layer provides spatial and temporal contexts to the animation. Our multi‐level MCMC (Markov Chain Monte Carlo) algorithm deals with the syntax, semantics, and spatiotemporal context of human motion to produce highly‐structured, animated scenes. The power and effectiveness of motion grammars are demonstrated in animating basketball games from drawings on a tactic board. Our system allows the user to position players and draw out tactical plans, which are animated automatically in virtual environments with three‐dimensional, full‐body characters.
Related Motion Graphs · A Deep Learning Framework for Character Motion Synthesis and Editing · Physically Based Motion Transformation · Character Motion Synthesis by Topology Coordinates
how to read this ▾ how to read this ▴
- Category
- Method: grammar-based structured motion synthesis
- Contributions
-
- Adapts formal-language rewriting rules and grammar parsing to synthesize 3D animation of multiple interacting characters
- A two-layer formulation separating an upper symbolic layer (semantics and interaction) from a lower spatiotemporal context layer
- A multi-level MCMC algorithm handling syntax, semantics, and spatiotemporal context, demonstrated by animating basketball games from tactic-board drawings
- Context
- Builds on data-driven motion synthesis in the lineage of Kovar et al.'s Motion Graphs, layering formal grammars on top to impose behavioral structure. Builds on: Motion Graphs
- Correctness
- Demonstrated on the structured domain of basketball tactics; the approach assumes the target behavior can be captured by grammar rules, so it suits rule-governed scenarios more than free-form or unstructured motion.
- Clarity
- Conceptually approachable via the language analogy, but the two-layer formulation and multi-level MCMC need a careful second pass.
- How to read it
- First pass for the grammar-as-motion-structure idea and the basketball demo; do a second pass on the two-layer model and the MCMC sampler if you intend to apply it to another structured domain.
Motion Synthesis
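The rewriting-rule idea in the entry above can be sketched with a tiny context-free grammar over motion labels. This covers only the symbolic upper layer, and only loosely: the paper's multi-level MCMC sampler and its spatiotemporal layer are not modelled, and the rule set, symbol names, and `expand` function are hypothetical. The sketch assumes the first production of every nonterminal is non-recursive, so it can serve as a fallback once a depth limit is hit.

```python
def expand(symbol, rules, choose, depth=0, max_depth=20):
    """Rewrite a symbol into a flat list of terminal motion labels.

    rules:  dict mapping each nonterminal to a list of productions,
            where a production is a list of symbols.
    choose: callable that picks one production from the list.
    Symbols absent from `rules` are treated as terminals (motion clips).
    """
    if symbol not in rules:
        return [symbol]
    options = rules[symbol]
    # Past the depth limit, fall back to the first (assumed non-recursive) rule.
    production = options[0] if depth >= max_depth else choose(options)
    sequence = []
    for child in production:
        sequence.extend(expand(child, rules, choose, depth + 1, max_depth))
    return sequence


BASKETBALL_RULES = {
    "Play": [["Pass", "Shoot"], ["Dribble", "Play"]],
    "Pass": [["pass_ball", "catch_ball"]],
    "Shoot": [["jump", "release"]],
    "Dribble": [["dribble"]],
}
```

Passing `random.choice` as `choose` yields varied sequences that all obey the rules; passing a function that always takes the first production gives one fixed, reproducible play.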
-
Ubisoft's nearest-neighbour search through raw mocap that replaced state-machine locomotion in AAA games (For Honor).
Motion Synthesis
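The nearest-neighbour search described in the entry above reduces, at its core, to picking the database frame whose feature vector is closest to a query built from the character's current state and desired trajectory. The sketch below is a brute-force version under assumptions: the feature layout, the squared-Euclidean cost, and the function name are illustrative, not the shipped system.

```python
def best_matching_frame(database, query):
    """Return the index of the database frame closest to the query.

    database: list of feature vectors, one per mocap frame.
    query:    feature vector for the current pose and desired motion.
    Uses an exhaustive squared-Euclidean search; ties go to the first frame.
    """
    def cost(features):
        return sum((a - b) ** 2 for a, b in zip(features, query))

    return min(range(len(database)), key=lambda i: cost(database[i]))
```

At runtime a system of this kind would repeat the search every few frames and blend toward the chosen frame, so transitions come from the data rather than from a hand-authored state machine.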
-
,
Multiple FEM subspace models are precomputed around representative poses and blended at runtime, adding millisecond-cost secondary dynamics to rigged characters.
abstract ▾ abstract ▴
We enrich character animations with secondary soft-tissue Finite Element Method (FEM) dynamics computed under arbitrary rigged or skeletal motion. Our method optionally incorporates pose-space deformation (PSD). It runs at milliseconds per frame for complex characters, and fits directly into standard character animation pipelines. Our simulation method does not require any skin data capture; hence, it can be applied to humans, animals, and arbitrary (real-world or fictional) characters. In standard model reduction of three-dimensional nonlinear solid elastic models, one builds a reduced model around a single pose, typically the rest configuration. We demonstrate how to perform multi-model reduction of Finite Element Method (FEM) nonlinear elasticity, where separate reduced models are precomputed around a representative set of object poses, and then combined at runtime into a single fast dynamic system, using subspace interpolation. While time-varying reduction has been demonstrated before for offline applications, our method is fast and suitable for hard real-time applications in games and virtual reality. Our method supports self-contact, which we achieve by computing linear modes and derivatives under contact constraints.
Related CUDA Deformers for Model Reduction · Simulation of Hand Anatomy Using Medical Imaging · Complementary Dynamics · A Deep Emulator for Secondary Motion of 3D Characters
how to read this ▾ how to read this ▴
- Category
- Method: real-time secondary soft-tissue dynamics for rigged characters
- Contributions
-
- Adds secondary FEM soft-tissue dynamics under arbitrary rigged or skeletal motion at milliseconds per frame, fitting standard animation pipelines
- Multi-model reduction: separate reduced FEM models precomputed around representative poses and combined at runtime via subspace interpolation
- Self-contact support via linear modes, and optional integration with pose-space deformation, requiring no skin capture so it applies to humans, animals, and fictional characters
- Context
- Builds on pose-space deformation in the lineage of Lewis et al.'s PSD, extending single-pose model reduction to multiple pose-anchored reduced models. Builds on: Pose Space Deformation: A Unified Approach to Shape Interpolation and Skeleton-Driven Deformation
- Correctness
- Targets hard real-time use; the key trade-off is that fidelity depends on the chosen representative pose set and on the reduced-subspace size, so motions far from the sampled poses are approximated by interpolation.
- Clarity
- Moderately technical; a first pass conveys the multi-model-reduction idea, but the subspace interpolation and contact handling reward a second pass.
- How to read it
- First pass for the multi-pose reduced-model concept and where it slots into a rig; second pass on the subspace interpolation and self-contact modes if implementing real-time secondary dynamics.
ML Deformation / Muscles
-
,
Real-time hair simulation on the hair mesh representation, adding a volumetric force model and a position correction step for stable collision handling.
abstract ▾ abstract ▴
This paper presents a robust real-time hair simulation method built on the hair mesh representation, which describes the hair volume as a coarse polygonal mesh. Building on sheet-based cloth models, it introduces a volumetric force model that captures hair interactions inside the hair mesh volume, together with a position correction method that minimizes local deformation during collision handling. The technique remains stable under large time steps and fast motion, and recovers the initial hair shape even after substantial deformation.
Related Continuum-based Strain Limiting · GPU-Based Simulation of Cloth Wrinkles at Submillimeter Levels · Wrapped Clothing on Disney's Raya and the Last Dragon · Efficient Simulation of Inextensible Cloth
how to read this ▾ how to read this ▴
- Category
- Method: real-time hair simulation on a coarse mesh representation
- Contributions
-
- A robust real-time hair simulation built on the hair mesh representation, treating the hair volume as a coarse polygonal mesh
- A volumetric force model extending sheet-based cloth models to capture interactions inside the hair-mesh volume
- A position correction method that minimizes local deformation during collision handling, staying stable under large time steps and fast motion and recovering the initial shape after large deformation
- Context
- Builds directly on Yuksel et al.'s Hair Meshes representation and borrows from sheet-based cloth simulation models. Builds on: Hair Meshes
- Correctness
- Aimed at real-time use; stability and shape recovery are the headline claims, with the caveat that the coarse-mesh abstraction trades per-strand fidelity for speed and robustness.
- Clarity
- Reasonably accessible if you know cloth simulation; a first pass conveys the representation and goals, with the force model and correction step needing a second pass.
- How to read it
- First pass to grasp the hair-mesh-as-cloth-volume idea; second pass on the volumetric force model and position-correction step if you need stable real-time collision handling.
CFX
-
Precomputes an optimized center of rotation per vertex for skeletal skinning, significantly reducing candy-wrapper and collapsing-joint artifacts while staying real-time.
Abstract
Skinning algorithms that work across a broad range of character designs and poses are crucial to creating compelling animations. Currently, linear blend skinning (LBS) and dual quaternion skinning (DQS) are the most widely used, especially for real-time applications. Both techniques are efficient to compute and are effective for many purposes. However, they also have many well-known artifacts, such as collapsing elbows, candy wrapper twists, and bulging around the joints. Due to the popularity of LBS and DQS, it would be of great benefit to reduce these artifacts without changing the animation pipeline or increasing the computational cost significantly. In this paper, we introduce a new direct skinning method that addresses this problem. Our key idea is to pre-compute the optimized center of rotation for each vertex from the rest pose and skinning weights. At runtime, these centers of rotation are used to interpolate the rigid transformation for each vertex. Compared to other direct skinning methods, our method significantly reduces the artifacts of LBS and DQS while maintaining real-time performance and backwards compatibility with the animation pipeline.
Related Skinning: Real-time Shape Deformation · Direct Delta Mush Skinning and Variants · Segmentation-Based Skinning · NeuroSkinning: Automatic Skin Binding for Production Characters with Deep Graph Networks
How to read this
- Category
- Method: a real-time direct skinning algorithm
- Contributions
-
- A direct skinning method that precomputes an optimized center of rotation per vertex from the rest pose and skinning weights
- Runtime interpolation of a rigid transformation per vertex from these centers, significantly reducing LBS and DQS artifacts (collapsing elbows, candy-wrapper twists, joint bulging)
- Maintains real-time performance and backwards compatibility with existing animation pipelines
- Context
- Sits alongside the standard direct-skinning family (linear blend skinning and Kavan et al.'s dual quaternion skinning), aiming to fix their well-known artifacts without changing the pipeline. Builds on: Skinning with Dual Quaternions
- Correctness
- Validated as an artifact-reduction drop-in across a range of characters and poses; the trade-off is a precomputation step tied to the rest pose and skinning weights, and it remains a direct (non-physical) method so it approximates rather than simulates tissue.
- Clarity
- Very accessible; the central idea is intuitive and a first pass conveys it well, with the optimization details reserved for a second pass.
- How to read it
- First pass for the optimized-center-of-rotation idea and the artifact comparisons; a second pass on the precomputation formulation is worth it if you plan to implement or integrate it.
Skinning
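The runtime step summarized in this entry can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each bone is taken to be a rigid transform stored as a unit quaternion plus a translation, the per-vertex center of rotation `cor` is assumed to have been precomputed offline from the rest pose and weights, the rotation is blended as a normalized weighted quaternion sum, and the center itself is moved by plain LBS. All function names are hypothetical.

```python
import math

def quat_from_axis_angle(axis, angle):
    """Unit quaternion (w, x, y, z) for a rotation of `angle` radians about a unit axis."""
    s = math.sin(angle / 2.0)
    return (math.cos(angle / 2.0), axis[0] * s, axis[1] * s, axis[2] * s)

def quat_rotate(q, v):
    """Rotate vector v by unit quaternion q."""
    w, x, y, z = q
    # t = 2 * cross(q.xyz, v)
    tx = 2.0 * (y * v[2] - z * v[1])
    ty = 2.0 * (z * v[0] - x * v[2])
    tz = 2.0 * (x * v[1] - y * v[0])
    # v' = v + w * t + cross(q.xyz, t)
    return (v[0] + w * tx + (y * tz - z * ty),
            v[1] + w * ty + (z * tx - x * tz),
            v[2] + w * tz + (x * ty - y * tx))

def blend_quats(quats, weights):
    """Normalized weighted quaternion sum, sign-aligned to the first quaternion."""
    ref = quats[0]
    acc = [0.0, 0.0, 0.0, 0.0]
    for q, w in zip(quats, weights):
        sign = 1.0 if sum(a * b for a, b in zip(q, ref)) >= 0.0 else -1.0
        for k in range(4):
            acc[k] += sign * w * q[k]
    norm = math.sqrt(sum(a * a for a in acc))
    return tuple(a / norm for a in acc)

def lbs_point(bones, weights, p):
    """Linear blend skinning of point p; each bone is (quaternion, translation)."""
    out = [0.0, 0.0, 0.0]
    for (q, t), w in zip(bones, weights):
        r = quat_rotate(q, p)
        for k in range(3):
            out[k] += w * (r[k] + t[k])
    return tuple(out)

def cor_skin(bones, weights, v, cor):
    """Rotate v rigidly about its center of rotation, then place the center with LBS."""
    q = blend_quats([b[0] for b in bones], weights)
    moved_cor = lbs_point(bones, weights, cor)
    offset = quat_rotate(q, tuple(v[k] - cor[k] for k in range(3)))
    return tuple(moved_cor[k] + offset[k] for k in range(3))
```

For a vertex weighted equally between an unrotated bone and one twisted about the bone axis, `lbs_point` pulls the vertex toward the axis (the candy-wrapper collapse), whereas `cor_skin` keeps its distance from a center placed on that axis, because the vertex is moved by a single rigid transform.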
- Reconstructing Personalized Anatomical Models for Physics-based Body Animation SIGGRAPH Asia Academic 76 cites
Reconstructs internal anatomical structures from surface scans to drive physics-based soft-tissue deformation personalized to individual subjects.
Abstract
We present a method to create personalized anatomical models ready for physics-based animation, using only a set of 3D surface scans. We start by building a template anatomical model of an average male which supports deformations due to both 1) subject-specific variations: shapes and sizes of bones, muscles, and adipose tissues and 2) skeletal poses. Next, we capture a set of 3D scans of an actor in various poses. Our key contribution is formulating and solving a large-scale optimization problem where we compute both subject-specific and pose-dependent parameters such that our resulting anatomical model explains the captured 3D scans as closely as possible. Compared to data-driven body modeling techniques that focus only on the surface, our approach has the advantage of creating physics-based models, which provide realistic 3D geometry of the bones and muscles, and naturally supports effects such as inertia, gravity, and collisions according to Newtonian dynamics.
Related Computational Bodybuilding: Anatomically-Based Modeling of Human Bodies · How to Build a Human: Practical Physics-Based Character Animation · Data-Driven Physics for Human Soft Tissue Animation · A Neural Network Model for Efficient Musculoskeletal-Driven Skin Deformation
How to read this
- Category
- Method: personalized anatomical model reconstruction for physics-based body animation
- Contributions
-
- Builds personalized physics-ready anatomical models from only a set of 3D surface scans, starting from an average-male template
- A large-scale optimization that jointly solves for subject-specific parameters (bone, muscle, adipose shape) and pose-dependent parameters so the model explains the captured scans
- Produces realistic internal bone and muscle geometry that naturally supports inertia, gravity, and collisions under Newtonian dynamics
- Context
- Builds on anatomy-template fitting in the lineage of Dicko et al.'s Anatomy Transfer, going beyond surface-only data-driven body modeling toward physics-based internal structure. Builds on: Anatomy Transfer
- Correctness
- Validated by how closely the reconstructed model explains the captured multi-pose scans; readers should keep in mind it starts from an average-male template and infers internal anatomy from surface evidence, so internal structures are plausible estimates rather than measured ground truth.
- Clarity
- Technical; a first pass conveys the scan-to-anatomy goal and the joint-optimization framing, while the optimization details need a careful second pass.
- How to read it
- First pass for the surface-scan-to-physics-model pipeline and the joint subject/pose optimization idea; second pass on the optimization formulation if you need to reproduce or extend the fitting.
Muscles / Skinning
-
Builds personalized 3D facial rigs from monocular video, recovering identity-specific blendshapes for downstream animation and editing.
Abstract
We present a novel approach for the automatic creation of a personalized high-quality 3D face rig of an actor from just monocular video data (e.g., vintage movies). Our rig is based on three distinct layers that allow us to model the actor’s facial shape as well as capture his person-specific expression characteristics at high fidelity, ranging from coarse-scale geometry to fine-scale static and transient detail on the scale of folds and wrinkles. At the heart of our approach is a parametric shape prior that encodes the plausible subspace of facial identity and expression variations. Based on this prior, a coarse-scale reconstruction is obtained by means of a novel variational fitting approach. We represent person-specific idiosyncrasies, which cannot be represented in the restricted shape and expression space, by learning a set of medium-scale corrective shapes. Fine-scale skin detail, such as wrinkles, are captured from video via shading-based refinement, and a generative detail formation model is learned. Both the medium- and fine-scale detail layers are coupled with the parametric prior by means of a novel sparse linear regression formulation. Once reconstructed, all layers of the face rig can be conveniently controlled by a low number of blendshape expression parameters, as widely used by animation artists.
Related FaceLab: Scalable Facial Performance Capture for Visual Effects · Vdub: Modifying Face Video of Actors for Plausible Visual Alignment to a Dubbed Audio Track · It's a UVN Face Rig, Charlie Brown: Facial Techniques for Peanuts · Face2Face: Real-Time Face Capture and Reenactment of RGB Videos
How to read this
- Category
- Method: personalized 3D face rig reconstruction from monocular video
- Contributions
-
- Automatic creation of a high-quality personalized 3D face rig of an actor from only monocular video, including vintage footage
- A three-layer rig: a coarse parametric shape/expression prior fit via a variational approach, learned medium-scale corrective shapes for person-specific idiosyncrasies, and fine-scale wrinkle detail from shading-based refinement with a generative detail model
- A sparse linear regression that couples the medium- and fine-scale detail layers to the parametric prior for downstream animation and editing
- Context
- Builds on parametric face modeling in the lineage of Blanz and Vetter's morphable model, adding corrective and detail layers to produce an editable, identity-specific rig. Builds on: A Morphable Model for the Synthesis of 3D Faces
- Correctness
- Demonstrated on monocular and vintage video; the layered approach explicitly assumes the coarse face fits a restricted shape/expression subspace, with idiosyncrasies and wrinkles recovered as learned corrective and shading-based layers, so fine detail quality depends on the video's lighting and resolution.
- Clarity
- Accessible in structure thanks to the three-layer framing; a first pass conveys the layering, but the variational fit, detail model, and sparse regression reward a second pass.
- How to read it
- First pass for the coarse/medium/fine layered-rig concept and what each layer captures; second pass on the variational fitting and the sparse-regression coupling if you need the reconstruction details.
Facial / Rigging
-
Learns low-dimensional latent spaces from a corpus of keyframed hand animation and moves a simulated particle through them to animate interactive characters.
Abstract
This paper describes a method for automatically animating interactive characters from an existing corpus of keyframed hand-animation. The method learns separate low-dimensional embeddings for subsets of the animation corresponding to different semantic labels, using the Gaussian Process Latent Variable Model to map high-dimensional rig control parameters to a three-dimensional latent space. By moving a simulated particle within these latent spaces it generates novel animations, and bridges linking similar poses across spaces allow smooth transitions between semantic labels. The approach is demonstrated by interactively animating the face of the dragon Toothless from How to Train Your Dragon 2 as it plays a game with the user.
Related Neural Face Rigging for Animating and Retargeting Facial Meshes in the Wild · Semi-Supervised Video-Driven Facial Animation Transfer for Production · Performance-Driven Facial Animation · Transferring the Rig and Animations from a Character to Different Face Models
How to read this
- Category
- Method: data-driven animation synthesis from an authored corpus
- Contributions
-
- Learns separate low-dimensional embeddings of a keyframed hand-animation corpus per semantic label using a Gaussian Process Latent Variable Model mapping rig controls to a 3D latent space
- Generates novel motion by moving a simulated particle through the latent spaces, with bridges between similar poses enabling smooth transitions across semantic labels
- Demonstrates interactive character animation on the face of Toothless from How to Train Your Dragon 2 playing a game with the user
- Context
- Sits in the data-driven motion synthesis lineage, applying the Gaussian Process Latent Variable Model to repurpose existing keyframed rig animation for real-time interactive control.
- Correctness
- Validated by a single qualitative interactive demo (a production dragon face), so generality across rigs and richer interactions is suggested rather than measured, and quality depends on the coverage of the input corpus.
- Clarity
- Application idea is accessible on a first pass; the GPLVM latent-space formulation and bridging warrant a second pass.
- How to read it
- First pass for the corpus-to-latent-space framing and the bridge-transition idea; do a second pass on the GPLVM mapping and particle dynamics if you intend to reimplement or assess fidelity.
Rigging / Retargeting
-
Production cloth technique for selectively smoothing folds and resolving collisions to improve visual quality of dynamic garment animation.
Abstract
This work presents techniques to selectively and dynamically detect and smooth folds in a simulated cloth mesh after simulation, giving artists controls to emphasize or de-emphasize folds and to clean up crumpling errors. Folds are detected per vertex using a position based method that compares the original mesh to an internally smoothed mesh, or an angle based method that measures angular change between faces sharing an edge, producing fold weights that are spread and spatially smoothed. These weights drive selective Laplacian, Taubin, or Delta Mush smoothing that blends between the original and fully smoothed mesh. A collision resolution pass then pushes interpenetrating vertices out of the body and smooths the offsets, and the method is implemented as a multi-threaded custom node added after the cloth simulation step.
Related Directing Cloth Draping through Blended UVs · Untangling Cloth · Art-Directed Costumes at Pixar: Design, Tailoring, and Simulation in Production · Fast Cloth Simulation on Moving Humanoids
How to read this
- Category
- Production technique: post-simulation cloth fold smoothing and collision cleanup
- Contributions
-
- Per-vertex fold detection via a position-based method (original vs internally smoothed mesh) or an angle-based method (angular change across shared edges), producing spread and spatially smoothed fold weights
- Selective Laplacian, Taubin, or Delta Mush smoothing driven by those weights to blend between the original and a fully smoothed mesh, giving artists emphasis and de-emphasis control
- A collision resolution pass that pushes interpenetrating vertices out of the body and smooths offsets, implemented as a multi-threaded custom node after the cloth sim step
- Context
- Builds on the simulation-of-folds-and-wrinkles tradition (Bridson et al., Simulation of Clothing with Folds and Wrinkles), operating as an artist-controllable post-process rather than a new solver. Builds on: Simulation of Clothing with Folds and Wrinkles
- Correctness
- Presented as a DigiPro production technique demonstrated qualitatively in a pipeline node; assumes a usable simulated mesh exists and that selective smoothing will not erase desired detail, so results are tuned by artists rather than benchmarked.
- Clarity
- Practical and accessible; a first pass conveys the detect-weight-smooth-resolve workflow, with details in the smoothing-operator and collision sections.
- How to read it
- First pass for the pipeline-stage role and the fold-detection-to-weight idea; revisit the smoothing operators (Laplacian/Taubin/Delta Mush) and the collision pass if integrating into a cloth pipeline.
CFX
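The detect, weight, and smooth stages of this entry can be sketched as below. This is a minimal illustration of the position-based variant only, under assumptions that are not from the talk: the mesh is given as points plus a per-vertex neighbor list, the internal smoothing is a uniform Laplacian, and the fold weight is a linear ramp of the displacement clamped at a hypothetical `scale`. Weight spreading, Taubin and Delta Mush operators, and the collision pass are omitted.

```python
import math

def laplacian_smooth(points, neighbors, iterations=1):
    """Uniform Laplacian smoothing; vertices with no listed neighbors stay pinned."""
    pts = [tuple(p) for p in points]
    for _ in range(iterations):
        nxt = []
        for i, p in enumerate(pts):
            nbrs = neighbors[i]
            if nbrs:
                nxt.append(tuple(sum(pts[j][k] for j in nbrs) / len(nbrs)
                                 for k in range(3)))
            else:
                nxt.append(p)
        pts = nxt
    return pts

def fold_weights(points, smoothed, scale):
    """Per-vertex weight in [0, 1]: how far the vertex sits from its smoothed position."""
    return [min(1.0, math.dist(p, s) / scale) for p, s in zip(points, smoothed)]

def selective_smooth(points, neighbors, scale=0.1, strength=1.0, iterations=5):
    """Blend each vertex toward the smoothed mesh in proportion to its fold weight."""
    smoothed = laplacian_smooth(points, neighbors, iterations)
    weights = fold_weights(points, smoothed, scale)
    return [tuple(p[k] + strength * w * (s[k] - p[k]) for k in range(3))
            for p, s, w in zip(points, smoothed, weights)]
```

Here `strength` plays the role of the artist control: 0 leaves the simulated mesh untouched, and 1 fully smooths vertices whose fold weight has saturated.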
-
Details how the real-time characters Senua, Siren, and Meetmike were built using motion-capture solving and Cubic Motion's performance-transfer pipeline.
Facial / Retargeting
-
Sketch-based interface for editing articulated character motion, enabling intuitive spatial control over limb trajectories and poses.
Abstract
SketchiMo presents a sketch-based interface for expressive editing of articulated character motion, introducing sketch targets and sketch spaces as dynamic, view-dependent pictorial representations. The system solves for motion from projective constraints relating sketch inputs to unknown 3D poses, enabling seamless editing of diverse properties from joint trajectories to abstract coordinated motions through a unified optimization framework.
Related How the Rig Design Impacts the Animation Process · Retargeting Motion to New Characters · Pose and Skeleton-aware Neural IK for Pose and Motion Editing · Physically Based Motion Transformation
How to read this
- Category
- Method: a sketch-based interface for editing articulated character motion
- Contributions
-
- SketchiMo, a sketch-based interface for expressive editing of articulated character motion
- Introduces sketch targets and sketch spaces as dynamic, view-dependent pictorial representations of motion
- A unified optimization that solves for 3D poses from projective constraints relating sketch input to unknown poses, spanning joint trajectories to abstract coordinated motions
- Context
- Relates to sketch-based and trajectory-based animation editing, framing motion editing as recovering 3D poses from view-dependent 2D sketch constraints.
- Correctness
- Demonstrated on a range of editing tasks via the projective-constraint optimization; results depend on viewpoint and the under-constrained 2D-to-3D mapping, so edits may need disambiguation and the breadth of motions handled is shown by example rather than exhaustively.
- Clarity
- Interaction concept is accessible; a first pass conveys the idea, while the projective-constraint formulation rewards a second pass.
- How to read it
- First pass to grasp sketch targets and sketch spaces; do a second pass on the unified optimization and constraint setup if you want to understand or reproduce the solver.
Rigging / Retargeting
-
Deep reinforcement learning with a mixture of actor-critic experts (MACE) for terrain-adaptive dynamic locomotion of planar simulated characters.
Abstract
Reinforcement learning offers a promising methodology for developing skills for simulated characters, but typically requires working with sparse hand-crafted features. Building on recent progress in deep reinforcement learning (DeepRL), we introduce a mixture of actor-critic experts (MACE) approach that learns terrain-adaptive dynamic locomotion skills using high-dimensional state and terrain descriptions as input, and parameterized leaps or steps as output actions. MACE learns more quickly than a single actor-critic approach and results in actor-critic experts that exhibit specialization. Additional elements of our solution that contribute towards efficient learning include Boltzmann exploration and the use of initial actor biases to encourage specialization. Results are demonstrated for multiple planar characters and terrain classes.
Related Learning Locomotion Skills Using DeepRL: Does the Choice of Action Space Matter? · DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills · Physics-based Motion Capture Imitation with Deep Reinforcement Learning · SFV: Reinforcement Learning of Physical Skills from Video
How to read this
- Category
- Method: deep reinforcement learning for terrain-adaptive locomotion control
- Contributions
-
- A mixture of actor-critic experts (MACE) approach that learns terrain-adaptive dynamic locomotion from high-dimensional state and terrain descriptions, outputting parameterized leaps or steps
- Shows MACE learns faster than a single actor-critic and yields specialized experts
- Uses Boltzmann exploration and initial actor biases to encourage specialization and efficient learning, demonstrated across multiple planar characters and terrain classes
- Context
- Extends physics-based locomotion control (such as SIMBICON) and the evolved-controller tradition (Sims, Evolving Virtual Creatures) using deep reinforcement learning to remove hand-crafted feature dependence. Builds on: SIMBICON: Simple Biped Locomotion Control · Evolving Virtual Creatures
- Correctness
- Results are demonstrated on planar (2D) simulated characters over several terrain classes, so generalization to 3D characters and the realism of motion remain to be judged; the action space is restricted to parameterized leaps or steps.
- Clarity
- Conceptually accessible if you know actor-critic RL; a first pass conveys the MACE idea, with training details in later passes.
- How to read it
- First pass for the MACE experts-and-specialization idea and the input/output design; second pass on exploration, biasing, and the planar-only scope before assuming it transfers to your setting.
Motion Synthesis
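Boltzmann exploration over a set of experts, as mentioned in this entry, can be sketched as follows. It is a minimal illustration that assumes each expert's critic yields one scalar value for the current state and that an expert is sampled with probability proportional to exp(value / temperature). The function names and the temperature values are hypothetical, and the networks themselves are left out.

```python
import math
import random

def boltzmann_probs(values, temperature):
    """Softmax over critic values; lower temperature means greedier selection."""
    top = max(values)  # subtract the max for numerical stability
    exps = [math.exp((v - top) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def select_expert(critic_values, temperature, rng=random):
    """Sample the index of the expert whose actor will produce the action."""
    probs = boltzmann_probs(critic_values, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

At a high temperature every expert is tried often, which gives each one data to specialize on. As the temperature drops, selection approaches picking the expert with the highest predicted value.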
-
Pixar open-sources USD, a scalable system for authoring, reading, and interchanging time-sampled scene descriptions across DCC tools and production pipelines.
Abstract
Universal Scene Description is an open-source technology providing a declarative ASCII file format and runtime system for representing 3D scenes using prims, schemas, and attributes. USD serves as the central conduit in Pixar's production pipeline, integrating with multiple DCCs and tools like Maya, Houdini, and Katana through the Hydra rendering architecture.
Related A Deep Dive into Universal Scene Description and Hydra · Combining the Benefits of Nodes and Layers in a USD World · USD and Scene Interoperability: Demystifying the State of the Art · Demystifying OpenUSD: End-to-End Data Pipelines and Workflows
How to read this
- Category
- System / open-source release: a scene-description format and runtime
- Contributions
-
- Open-sources Universal Scene Description, a declarative ASCII format and runtime for representing 3D scenes via prims, schemas, and attributes
- Serves as the central conduit of Pixar's production pipeline for authoring, reading, and interchanging time-sampled scene descriptions
- Integrates with DCCs and tools (Maya, Houdini, Katana) and the Hydra rendering architecture
- Context
- Addresses scene interchange and composition across heterogeneous DCC tools, positioned as the pipeline backbone connecting authoring applications to rendering via Hydra.
- Correctness
- This is a technical note describing a production system rather than an evaluated method; its claims are about pipeline integration and scalability proven in Pixar production, not benchmarked comparisons, and adoption cost in other pipelines is left to the reader.
- Clarity
- Accessible as a system overview; a first pass conveys the data model and role, with deeper composition semantics requiring the full documentation.
- How to read it
- First pass for the prims/schemas/attributes data model and where USD sits in a pipeline; go deeper into composition, layering, and Hydra only if you are adopting or integrating USD.
Rigging
- Vivace: A Practical Gauss-Seidel Method for Stable Soft Body Dynamics SIGGRAPH Asia Academic 81 cites
Parallel randomized Gauss-Seidel via graph coloring for PBD and Projective Dynamics constraints, achieving millisecond per-frame cloth and soft-body solve on GPU.
Abstract
The solution of large sparse systems of linear constraints is at the base of most interactive solvers for physically-based animation of soft body dynamics. We focus on applications with hard and tight per-frame resource budgets, such as video games, where the solution of soft body dynamics needs to be computed in a few milliseconds. Linear iterative methods are preferred in these cases since they provide approximate solutions within a given error tolerance and in a short amount of time. We present a parallel randomized Gauss-Seidel method which can be effectively employed to enable the animation of 3D soft objects discretized as large and irregular triangular or tetrahedral meshes. At the beginning of each frame, we partition the set of equations governing the system using a randomized graph coloring algorithm. The unknowns in the equations belonging to the same partition are independent of each other. Then, all the equations belonging to the same partition are solved at the same time in parallel. Our algorithm runs completely on the GPU and can support changes in the constraints topology. We tested our method as a solver for soft body dynamics within the Projective Dynamics and Position Based Dynamics frameworks.
Related Clean Cloth Inputs: Removing Character Self-Intersections with Volume Simulation · Wrinkle Meshes · GPU-Based Simulation of Cloth Wrinkles at Submillimeter Levels · Nonlinear Cloth Simulation with Isogeometric Analysis
How to read this
- Category
- Method: a parallel GPU solver for soft-body dynamics constraints
- Contributions
-
- A parallel randomized Gauss-Seidel method for large sparse constraint systems in soft-body animation under tight per-frame budgets
- Per-frame partitioning of equations via a randomized graph-coloring algorithm so equations within a partition are independent and solvable in parallel
- A fully GPU-resident implementation that supports changes in constraint topology, used within Projective Dynamics and Position-Based solvers
- Context
- Builds on fast constraint-based simulation, in particular Projective Dynamics (Bouaziz et al.), accelerating its linear solve with a colored parallel Gauss-Seidel scheme. Builds on: Projective Dynamics: Fusing Constraint Projections for Fast Simulation
- Correctness
- Targeted at millisecond budgets (for example games) and validated on triangular and tetrahedral meshes; it is an iterative approximate solver within a tolerance, so accuracy trades against iteration count and coloring quality affects parallel efficiency.
- Clarity
- Approach is accessible if you know PBD/Projective Dynamics; a first pass conveys the coloring-then-parallel-solve idea, with convergence details in later passes.
- How to read it
- First pass for the graph-coloring partitioning and the GPU Gauss-Seidel scheme; second pass on convergence behavior and topology-change handling if you plan to implement or benchmark it.
CFX
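The coloring-then-solve idea can be sketched on the CPU as below. This is a hedged stand-in rather than the paper's GPU algorithm: it uses a sequential greedy coloring visited in random order in place of the parallel randomized coloring, PBD-style distance constraints with equal masses, and hypothetical function names. The point is only that constraints sharing a color touch disjoint particles, so each batch could be dispatched in parallel.

```python
import math
import random

def color_constraints(constraints, rng):
    """Greedy coloring in random order; constraints sharing a particle get different colors."""
    order = list(range(len(constraints)))
    rng.shuffle(order)
    colors_at_particle = {}  # particle index -> set of colors already used there
    color_of = {}
    for c in order:
        i, j = constraints[c][0], constraints[c][1]
        taken = colors_at_particle.get(i, set()) | colors_at_particle.get(j, set())
        color = 0
        while color in taken:
            color += 1
        color_of[c] = color
        colors_at_particle.setdefault(i, set()).add(color)
        colors_at_particle.setdefault(j, set()).add(color)
    n_colors = max(color_of.values()) + 1 if color_of else 0
    batches = [[] for _ in range(n_colors)]
    for c, color in color_of.items():
        batches[color].append(c)
    return batches

def solve_colored(positions, constraints, batches, iterations):
    """Gauss-Seidel sweeps, one color at a time; positions are updated in place."""
    for _ in range(iterations):
        for batch in batches:
            # Constraints in one batch are independent: this inner loop is the
            # part a GPU implementation could run as one parallel dispatch.
            for c in batch:
                i, j, rest = constraints[c]
                dist = math.dist(positions[i], positions[j])
                if dist == 0.0:
                    continue
                corr = 0.5 * (dist - rest) / dist
                for k in range(3):
                    delta = corr * (positions[j][k] - positions[i][k])
                    positions[i][k] += delta
                    positions[j][k] -= delta
```

Recoloring at the start of each frame, as the abstract describes, is what lets the scheme tolerate changes in constraint topology. In this sketch that would mean calling `color_constraints` again whenever the constraint list changes.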
-
Extended PBD formulation with compliant constraints providing physically consistent stiffness control independent of timestep size.
Abstract
This paper introduces XPBD, an extension to position-based dynamics (PBD) that removes PBD's well-known dependence of constraint stiffness on time step and iteration count. The method derives from an implicit position-level time discretization and introduces the concept of a total Lagrange multiplier, giving constraints a direct correspondence to well-defined elastic and dissipative energy potentials and providing accurate constraint force estimates useful for force-dependent effects such as breakable joints and haptic devices. Constraints are solved at the position level in a Gauss-Seidel or Jacobi fashion using a compliance matrix corresponding to inverse stiffness, with an additional Rayleigh dissipation term for damping. The authors validate XPBD against a reference non-linear Newton solver on harmonic oscillators, hanging chains, cantilever beams, cloth, and inflatable balloons, showing visually indistinguishable results while requiring only a single extra scalar stored per constraint.
Related Cloth and Skin Deformation with a Triangle Mesh Based Convolutional Neural Network · Position Based Dynamics · Projective Dynamics: Fusing Constraint Projections for Fast Simulation · Small Steps in Physics Simulation
How to read this
- Category
- Method: a constraint-dynamics formulation extending Position-Based Dynamics
- Contributions
-
- XPBD, an extension of PBD that removes the dependence of constraint stiffness on time step and iteration count
- Derives from an implicit position-level discretization and introduces a total Lagrange multiplier giving constraints a correspondence to elastic and dissipative energy potentials, plus accurate constraint-force estimates
- Solves constraints at the position level (Gauss-Seidel or Jacobi) via a compliance matrix (inverse stiffness) with a Rayleigh dissipation damping term
- Context
- Directly extends Position Based Dynamics (Müller et al.) and the Unified Particle Physics framework (Macklin et al.), grounding stiffness in physically meaningful compliance. Builds on: Position Based Dynamics · Unified Particle Physics for Real-Time Applications
- Correctness
- Validated against a reference non-linear Newton solver on harmonic oscillators, hanging chains, cantilever beams, cloth, and inflatable balloons, showing visually indistinguishable results with one extra scalar per constraint; it remains a real-time-oriented approximation, not a guaranteed-accurate Newton solve.
- Clarity
- Clear and well-motivated; a first pass conveys why and how it fixes PBD stiffness, with the derivation rewarding a second pass.
- How to read it
- First pass for the stiffness-independence problem and the compliance/total-multiplier fix; second pass on the implicit derivation and per-constraint update if you implement it or need force estimates.
CFX
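The compliance and total-multiplier update can be sketched for a single distance constraint as follows. This is a minimal illustration, not the paper's reference code: it assumes the constraint C = |xi - xj| - rest, inverse masses `wi` and `wj`, and a time-step-scaled compliance of compliance / dt². The Rayleigh damping term is omitted and the function name is hypothetical. The caller is assumed to reset `lam` to zero at the start of every time step and carry it across solver iterations within that step.

```python
import math

def xpbd_distance_step(xi, xj, wi, wj, rest, compliance, dt, lam):
    """One XPBD iteration for a distance constraint; returns (xi, xj, lam)."""
    d = [a - b for a, b in zip(xi, xj)]
    length = math.sqrt(sum(c * c for c in d))
    if length == 0.0 or wi + wj == 0.0:
        return list(xi), list(xj), lam
    n = [c / length for c in d]            # gradient of C with respect to xi
    c_val = length - rest                   # constraint value C(x)
    alpha_tilde = compliance / (dt * dt)    # time-step-scaled compliance
    # Multiplier increment; the alpha_tilde * lam term is what distinguishes
    # this from the plain PBD projection.
    dlam = (-c_val - alpha_tilde * lam) / (wi + wj + alpha_tilde)
    new_xi = [a + wi * dlam * c for a, c in zip(xi, n)]
    new_xj = [b - wj * dlam * c for b, c in zip(xj, n)]
    return new_xi, new_xj, lam + dlam
```

With `compliance` set to 0 the update reduces to the usual rigid PBD projection. With a nonzero compliance, repeating the iteration within a step leaves an already converged constraint where it is instead of stiffening it further, which is the iteration-count independence the entry describes.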
2015
27
-
Multi-domain subspace simulation efficiently animates the deformation of a large deformable body by constraining each domain's deformation to a separate subspace, with the key challenge being how to couple multiple domains without gaps or locking artifacts.
Abstract
Multi-domain subspace simulation efficiently animates the deformation of a large deformable body by constraining each domain's deformation to a separate subspace, with the key challenge being how to couple multiple domains without gaps or locking artifacts. This work introduces a domain decomposition framework that connects disjoint domains through coupling elements and solves the subspace deformations and rigid motions of all domains in a single linear system. Because the coupling elements are part of the deformable body and share its elastic properties, the system avoids manual stiffness parameter tuning.
Related FEM Simulation of 3D Deformable Solids: A Practitioner's Guide to Theory, Discretization and Model Reduction · Rig-Space Physics · Robust Treatment of Degenerate Elements in Interactive Corotational FEM Simulations · Subspace Clothing Simulation Using Adaptive Bases
How to read this
- Category
- Method: unified multi-domain subspace simulation of deformable bodies
- Contributions
-
- A domain decomposition framework that connects disjoint subspace domains through coupling elements
- A single linear system solving the subspace deformations and rigid motions of all domains together to avoid gaps or locking
- Coupling elements that share the body's elastic properties, removing manual stiffness parameter tuning
- Context
- Builds on multi-domain reduced simulation such as 'Physics-based Character Skinning using Multi-Domain Subspace Deformations', focusing on a coupling scheme that joins per-domain subspaces cleanly. Builds on: Physics-based Character Skinning using Multi-Domain Subspace Deformations
- Correctness
- Central premise is that domains constrained to separate subspaces can be coupled through shared-property elements solved jointly, which addresses gap and locking artifacts without stiffness tuning; readers should keep in mind that overall fidelity still depends on each domain's subspace quality, an inherent reduced-order limitation.
- Clarity
- Moderately technical; a first pass conveys the coupling idea, a second pass clarifies the combined linear system.
- How to read it
- First pass for the decomposition-and-coupling concept and why it avoids tuning; do a second pass on the unified linear system and coupling elements if you need to integrate multiple subspace domains.
CFX
-
Auto-rigs single body scans into skinned virtual characters and lets users reshape them along attributes such as height and weight, using a SCAPE-based deformable model.
Abstract
This work presents a system for generating animatable virtual avatars from a single 3D human body scan by fitting a SCAPE-based morphable model and transferring high quality rigging onto the scan. A simplified SCAPE model with linear blend skinning is fit to the input scan via an ICP-style pose and shape optimization, after which skeleton joints are transferred using mean-value coordinates and skin binding weights are propagated using harmonic interpolation for smooth deformation. The system also lets users interactively reshape and resize the scan along semantic body attributes such as height and weight by exploring the body shape space and applying deformation transfer. The resulting rigs show superior skin deformation quality compared to generic single-mesh auto-rigging methods such as Pinocchio.
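The abstract above relies on linear blend skinning, which can be sketched in 2D: each bone transforms the rest-pose vertex, and the results are averaged by the skin weights. This is a generic illustration, not the paper's fitting or weight-transfer code; `rigid_2d`, `linear_blend_skin`, and the two-bone setup are hypothetical.

```python
import math

def rigid_2d(angle, tx, ty):
    """2x3 rigid transform: rotation by `angle` followed by translation."""
    c, s = math.cos(angle), math.sin(angle)
    return ((c, -s, tx), (s, c, ty))

def apply(T, p):
    x, y = p
    return (T[0][0] * x + T[0][1] * y + T[0][2],
            T[1][0] * x + T[1][1] * y + T[1][2])

def linear_blend_skin(p, weights, transforms):
    """Blend the positions produced by each bone transform by its weight."""
    assert abs(sum(weights) - 1.0) < 1e-9, "skin weights should sum to one"
    x = y = 0.0
    for w, T in zip(weights, transforms):
        px, py = apply(T, p)
        x += w * px
        y += w * py
    return (x, y)

# Two hypothetical bones: one at rest, one rotated 90 degrees about the origin.
bones = [rigid_2d(0.0, 0.0, 0.0), rigid_2d(math.pi / 2, 0.0, 0.0)]
rest_vertex = (1.0, 0.0)
skinned = linear_blend_skin(rest_vertex, [0.5, 0.5], bones)
```

With equal weights the vertex lands at roughly (0.5, 0.5), closer to the joint than either bone alone would place it. Effects like this are one reason the quality of the transferred weights matters.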
Related Automatic Rigging and Animation of 3D Characters · Geodesic Voxel Binding for Production Character Meshes · Animation Setup Transfer for 3D Characters · Dyna: A Model of Dynamic Human Shape in Motion
How to read this
- Category
- System: automatic rigging and reshaping of body scans into animatable avatars
- Contributions
- Fitting a simplified SCAPE-based morphable model with linear blend skinning to a single 3D body scan via ICP-style pose and shape optimization
- Skeleton joint transfer using mean-value coordinates and skin weight propagation by harmonic interpolation
- Interactive reshaping along semantic attributes such as height and weight via body-shape-space exploration and deformation transfer
- Context
- Combines morphable-model fitting with auto-rigging in the lineage of 'Automatic Rigging and Animation of 3D Characters' (Pinocchio), using a SCAPE-based deformable model as the prior. Builds on: Automatic Rigging and Animation of 3D Characters
- Correctness
- Quality of rig and reshaping rests on how well the SCAPE-based template fits the input scan; the work reports superior skin deformation versus generic single-mesh auto-rigging like Pinocchio, but it targets human bodies from clean single scans, so results outside that regime are not claimed.
- Clarity
- Accessible as a system description; a first pass conveys the pipeline, a second pass details each fitting and transfer step.
- How to read it
- Read as a pipeline: first pass for the fit then transfer then reshape flow; do a second pass on the joint and weight transfer if you want to reproduce the rigging quality.
Rigging / Skinning
-
Simulates hand soft tissue and tendons as a coupled tendon-routing system driven by muscle activations, producing anatomically plausible finger motion.
Abstract
The tendons of the hand and other biomechanical systems form a complex network of sheaths, pulleys, and branches. By modeling these anatomical structures, we obtain realistic simulations of coordination and dynamics that were previously not possible. First, we introduce Eulerian-on-Lagrangian discretization of tendon strands, with a new selective quasistatic formulation that eliminates unnecessary degrees of freedom in the longitudinal direction, while maintaining the dynamic behavior in transverse directions. This formulation also allows us to take larger time steps. Second, we introduce two control methods for biomechanical systems: first, a general-purpose learning-based approach requiring no previous system knowledge, and a second approach using data extracted from the simulator. We use various examples to compare the performance of these controllers.
Related Physical Based Motion Reconstruction From Videos Using Musculoskeletal Model · Anatomy-Based Modeling of the Human Musculature · A Muscle Model for Animating Three-Dimensional Facial Expression · Reusable Facial Rigging and Animation: Create Once, Use Many
How to read this
- Category
- Method: biomechanical simulation and control of hands and tendinous systems
- Contributions
- An Eulerian-on-Lagrangian discretization of tendon strands with a selective quasistatic formulation that removes longitudinal DOFs while keeping transverse dynamics and allows larger time steps
- Anatomical modeling of tendon sheaths, pulleys, and branches for realistic coordination and dynamics
- Two control approaches for biomechanical systems: a general learning-based method needing no prior system knowledge and a method using data extracted from the simulator
- Context
- Extends musculoskeletal soft-tissue simulation, in the lineage of 'Creating and Simulating Skeletal Muscle from the Visible Human Data Set', toward the hand's complex tendon-routing network. Builds on: Creating and Simulating Skeletal Muscle from the Visible Human Data Set
- Correctness
- Plausibility hinges on the anatomical accuracy of the modeled sheaths, pulleys, and branches and on the selective quasistatic assumption that longitudinal tendon DOFs can be dropped; the two controllers are compared on example tasks, so readers should view control performance as illustrative rather than benchmarked broadly.
- Clarity
- Dense; a first pass conveys the anatomy-driven goal, but the Eulerian-on-Lagrangian formulation needs careful multi-pass reading.
- How to read it
- First pass for the anatomical modeling and the control split; reserve a focused second and third pass for the Eulerian-on-Lagrangian discretization and the selective quasistatic formulation, which carry the technical weight.
Muscles / Rigging
-
Naughty Dog examines the motion capture pipeline for The Last of Us, covering data acquisition tools, large dataset management, raw data cleanup, and retargeting animation onto character skeletons.
Retargeting
-
Physics-based anatomical modeling framework that grows muscles and subcutaneous fat and edits bone proportions to create a wide range of human body shapes from a single anatomy template.
Abstract
We propose a method to create a wide range of human body shapes from a single input 3D anatomy template. Our approach is inspired by biological processes responsible for human body growth. In particular, we simulate growth of skeletal muscles and subcutaneous fat using physics-based models which combine growth and elasticity. Together with a tool to edit proportions of the bones, our method allows us to achieve a desired shape of the human body by directly controlling hypertrophy (or atrophy) of every muscle and enlargement of fat tissues. We achieve near-interactive run times by utilizing a special quasi-statics solver (Projective Dynamics) and by crafting a volumetric discretization which results in accurate deformations without an excessive number of degrees of freedom. Our system is intuitive to use and the resulting human body models are ready for simulation using existing physics-based animation methods, because we deform not only the surface, but also the entire volumetric model.
Related Steklov-Poincare Skinning · Data-Driven Physics for Human Soft Tissue Animation · Reconstructing Personalized Anatomical Models for Physics-based Body Animation · Hand Modeling and Simulation Using Stabilized Magnetic Resonance Imaging
How to read this
- Category
- Method: anatomically-based modeling of human body shapes
- Contributions
- A growth-inspired framework that creates a range of body shapes from a single 3D anatomy template by simulating muscle and subcutaneous fat growth with combined growth and elasticity models
- Direct artistic control over per-muscle hypertrophy or atrophy and fat enlargement, plus bone-proportion editing
- Near-interactive run times via a Projective Dynamics quasi-statics solver and a tailored volumetric discretization, producing simulation-ready volumetric (not just surface) models
- Context
- Sits in the anatomy-based body modeling line, related to 'Anatomy Transfer', using physics-based growth and elasticity instead of pure geometric transfer to reshape internal structures. Builds on: Anatomy Transfer
- Correctness
- Outputs are physically plausible and simulation-ready by construction, but rest on a single input anatomy template and on growth-plus-elasticity as a proxy for real biological development, so the shapes are art-directable approximations rather than anatomically validated bodies.
- Clarity
- Accessible with good intuition from the growth analogy; a first pass conveys the idea, a second pass covers the growth-elasticity model and discretization.
- How to read it
- First pass for the growth-based control metaphor and the volumetric, simulation-ready output; do a second pass on the combined growth and elasticity formulation and the Projective Dynamics solver if you need the modeling details.
Muscles / Skinning
- Dataflow: ILM's Framework for Procedural Geometry Generation, Simulation Authoring, Crowds, and More DigiPro ILM 2 cites
Presents ILM's node-based procedural graph system used for hair instancing, geometry, and character crowd pipelines across dozens of productions.
Abstract
Dataflow is a node-based graph-evaluation system at Industrial Light & Magic for procedural geometry generation, particle and volume simulations, field authoring, hair instancing with attribute control, and crowd simulation. The system integrates a C++ core with Python interfaces and operates as a plugin within host applications including ILM's in-house Zeno and the commercial packages Maya, Katana, and Houdini. It has been deployed on dozens of major motion pictures over eight years.
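As a rough illustration of what node-based graph evaluation means, the sketch below builds a tiny pull-based graph in which each node evaluates its inputs on demand and caches the result. It does not reflect Dataflow's actual API or architecture, which the abstract does not detail; the `Node` class and the example nodes are hypothetical.

```python
class Node:
    """A graph node that pulls its inputs, computes once, and caches the result."""

    def __init__(self, name, func, *inputs):
        self.name = name
        self.func = func
        self.inputs = inputs
        self._cache = None
        self._dirty = True
        self.eval_count = 0  # counts how often func actually ran

    def evaluate(self):
        if self._dirty:
            args = [node.evaluate() for node in self.inputs]
            self._cache = self.func(*args)
            self._dirty = False
            self.eval_count += 1
        return self._cache

# Hypothetical graph: one source feeding two operators that are merged downstream.
points = Node("points", lambda: [0.0, 1.0, 2.0])
scaled = Node("scale", lambda pts: [2.0 * p for p in pts], points)
offset = Node("offset", lambda pts: [p + 1.0 for p in pts], points)
merged = Node("merge", lambda a, b: a + b, scaled, offset)

result = merged.evaluate()
```

The shared `points` node is computed only once even though two downstream nodes read it. Avoiding that kind of recomputation is a common motivation for a graph evaluator.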
Related Clean Cloth Inputs: Removing Character Self-Intersections with Volume Simulation · Hair and Fur in an Evolving Pipeline · Appearance Modeling of Iridescent Feathers with Diverse Nanostructures · Avatar: The Last Airbender | Framestore | Houdini Connect
How to read this
- Category
- Production system / pipeline architecture paper
- Contributions
- A node-based graph-evaluation system (Dataflow) for procedural geometry, particle and volume simulation, field authoring, hair instancing with attribute control, and crowd simulation
- A C++ core exposed through Python interfaces that runs as a plugin inside host applications (ILM's in-house Zeno, plus the commercial DCC tools Maya, Katana, and Houdini)
- Evidence of long-term, cross-production deployment at ILM over roughly eight years and dozens of films
- Context
- An in-house procedural graph framework in the lineage of node-based DCC dataflow systems, generalizing the approach to span geometry, simulation, hair, and crowds within one engine.
- Correctness
- This is a studio engineering report, not a peer-reviewed method paper; its value is demonstrated by broad production use across many films rather than by controlled comparison, so read it for architectural decisions rather than benchmarks.
- Clarity
- Accessible at the systems level; a single first pass conveys the architecture and scope without heavy math.
- How to read it
- Skim once for the system diagram and the breakdown of what each domain (geometry, sim, hair, crowds) plugs into; a second pass only pays off if you are designing a similar plugin-based procedural framework.
Rigging / CFX
-
Overview of DreamWorks Animation's facial motion and deformation system for Premo, covering its Featureline control set, curve-based pose interpolation, layered deformers, and production use.
Abstract
This paper presents DreamWorks Animation's facial motion and deformation system built for the Premo animation platform, which supports direct manipulation of controls on character geometry in real time. The approach combines a free-form Featureline-based facial control set with a procedural pose space motion system and a curve based pose interpolation scheme that breaks the multi-dimensional pose space into simpler single-dimensional spaces interpolated with cubic splines, avoiding the overshoot and undershoot of scattered data methods such as thin-plate splines and radial basis functions. A highly layered deformation system driven by a curve based deformer transfers this motion into the face with coarse to fine control. The system has been used on productions including How to Train Your Dragon 2 and HOME, giving animators more artistic control, reducing rigging time, and improving reuse across characters.
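The interpolation scheme described above, which splits a multi-dimensional pose space into single-dimensional curves, can be sketched as follows. This is an assumption-laden illustration, not the studio's implementation: it uses a cubic Hermite spline with finite-difference tangents, and the control names and key values are invented.

```python
def hermite_1d(keys, x):
    """Cubic Hermite interpolation through sorted (position, value) keys.

    Values are clamped outside the keyed range; tangents are finite differences.
    """
    if x <= keys[0][0]:
        return keys[0][1]
    if x >= keys[-1][0]:
        return keys[-1][1]
    i = max(k for k in range(len(keys) - 1) if keys[k][0] <= x)
    (x0, y0), (x1, y1) = keys[i], keys[i + 1]

    def slope(k):
        lo, hi = max(k - 1, 0), min(k + 1, len(keys) - 1)
        return (keys[hi][1] - keys[lo][1]) / (keys[hi][0] - keys[lo][0])

    h = x1 - x0
    t = (x - x0) / h
    m0, m1 = slope(i) * h, slope(i + 1) * h
    h00 = 2 * t**3 - 3 * t**2 + 1
    h10 = t**3 - 2 * t**2 + t
    h01 = -2 * t**3 + 3 * t**2
    h11 = t**3 - t**2
    return h00 * y0 + h10 * m0 + h01 * y1 + h11 * m1

def pose_offset(controls, curves):
    """Sum independent 1D curve evaluations, one per control axis."""
    return sum(hermite_1d(curves[name], value) for name, value in controls.items())

# Hypothetical controls, each keyed along its own axis.
curves = {
    "jaw_open": [(0.0, 0.0), (0.5, 0.2), (1.0, 1.0)],
    "smile": [(0.0, 0.0), (1.0, 0.3)],
}
total = pose_offset({"jaw_open": 0.5, "smile": 1.0}, curves)
```

Each axis is evaluated on its own and the contributions are summed. That independence is the simplifying assumption the reading guide flags for strongly interacting expression dimensions.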
Related Direct Manipulation Blendshapes · Pose Space Deformation: A Unified Approach to Shape Interpolation and Skeleton-Driven Deformation · Reusable Facial Rigging and Animation: Create Once, Use Many · An Empirical Rig for Jaw Animation
How to read this
- Category
- Production system paper (facial rig and deformation pipeline)
- Contributions
- A facial motion and deformation system for DreamWorks' Premo platform supporting real-time direct manipulation of controls on character geometry
- A free-form Featureline-based control set combined with a procedural pose-space motion system using per-dimension cubic-spline interpolation to avoid scattered-data overshoot/undershoot
- A highly layered, curve-based deformer that transfers motion into the face with coarse-to-fine control, improving artistic control, rig reuse, and rigging time
- Context
- Builds on pose-space and blendshape facial rigging practice, explicitly positioning its 1D cubic-spline interpolation against scattered-data methods such as thin-plate splines and radial basis functions.
- Correctness
- Validated by production use (How to Train Your Dragon 2, HOME) rather than formal evaluation; the decomposition into single-dimensional pose spaces is the key simplifying assumption, so consider how it behaves where expression dimensions strongly interact.
- Clarity
- Readable for a rigging-literate audience; a first pass conveys the design, a second clarifies the pose-interpolation and layering scheme.
- How to read it
- Focus on the pose-space-to-1D-spline reasoning and the layered deformer order; do a second pass on the interpolation choice if you are comparing it against RBF/TPS rigs.
Facial / Rigging
-
Drives high-quality static facial scans with performance video by turning confidence-weighted optical-flow correspondences into a drift-free 3D geometry solve.
Abstract
We present a process for rendering a realistic facial performance with control of viewpoint and illumination. The performance is based on one or more high-quality geometry and reflectance scans of an actor in static poses, driven by one or more video streams of a performance. We compute optical flow correspondences between neighboring video frames, and a sparse set of correspondences between static scans and video frames. The latter are made possible by leveraging the relightability of the static 3D scans to match the viewpoint(s) and appearance of the actor in videos taken in arbitrary environments. As optical flow tends to compute proper correspondence for some areas but not others, we also compute a smoothed, per-pixel confidence map for every computed flow, based on normalized cross-correlation. These flows and their confidences yield a set of weighted triangulation constraints among the static poses and the frames of a performance. Given a single artist-prepared face mesh for one static pose, we optimally combine the weighted triangulation constraints, along with a shape regularization term, into a consistent 3D geometry solution over the entire performance that is drift free by construction. In contrast to previous work, even partial correspondences contribute to drift minimization, for example, where a successful match is found in the eye region but not the mouth.
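The confidence measure named in the abstract, normalized cross-correlation, can be sketched for a pair of small image patches. The paper's smoothing and per-pixel windowing are not reproduced here; the `confidence` mapping that clamps negative correlation to zero is an assumption of this sketch.

```python
import math

def ncc(patch_a, patch_b):
    """Normalized cross-correlation of two equally sized patches, in [-1, 1]."""
    a = [v for row in patch_a for v in row]
    b = [v for row in patch_b for v in row]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0.0:
        return 0.0  # flat patch: no texture, so the match is uninformative
    return sum(x * y for x, y in zip(da, db)) / denom

def confidence(patch_a, patch_b):
    """Map NCC to [0, 1]; clamping negatives to zero is an assumption here."""
    return max(0.0, ncc(patch_a, patch_b))

base = [[1.0, 2.0], [3.0, 4.0]]
brighter = [[7.0, 9.0], [11.0, 13.0]]  # same pattern with gain and offset
inverted = [[4.0, 3.0], [2.0, 1.0]]    # opposite pattern
```

Because NCC ignores gain and offset, `base` and `brighter` correlate fully, while the inverted patch gets zero confidence. A flat patch also returns zero, which matches the reading guide's caution about low-texture regions.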
Related FaceLab: Scalable Facial Performance Capture for Visual Effects · Real-Time High-Fidelity Facial Performance Capture · High Resolution Passive Facial Performance Capture · Facial Retargeting with Automatic Range of Motion Alignment
How to read this
- Category
- Method: video-driven facial performance reconstruction
- Contributions
- A pipeline that drives high-quality static geometry and reflectance scans of an actor from one or more performance video streams, with control over viewpoint and illumination
- Optical-flow correspondences between video frames plus sparse scan-to-video correspondences, the latter enabled by relighting the static scans to match arbitrary video appearance
- A per-pixel flow confidence map (from normalized cross-correlation) feeding weighted triangulation constraints, solved with shape regularization into a drift-free 3D geometry over the whole performance
- Context
- Builds on light-stage reflectance-field acquisition (Debevec et al., Acquiring the Reflectance Field of a Human Face) to make static scans relightable and matchable to in-the-wild performance video. Builds on: Acquiring the Reflectance Field of a Human Face
- Correctness
- The drift-free claim rests on combining many weighted, confidence-gated correspondences with regularization; reliability therefore depends on flow quality and confidence estimation, which can be weak in low-texture or occluded face regions.
- Clarity
- Moderately technical; a first pass conveys the relight-and-correspond idea, but the constraint formulation needs a second pass.
- How to read it
- Read first for the relightable-scan-to-video matching trick and the confidence map; second pass on the weighted triangulation/regularization solve if you need the math.
Facial
-
Data-driven model of soft-tissue dynamics learned from 4D body scans, capturing pose-dependent deformations beyond standard skinning.
Abstract
To look human, digital full-body avatars need to have soft-tissue deformations like those of real people. We learn a model of soft-tissue deformations from examples using a high-resolution 4D capture system and a method that accurately registers a template mesh to sequences of 3D scans. Using over 40,000 scans of ten subjects, we learn how soft-tissue motion causes mesh triangles to deform relative to a base 3D body model. Our Dyna model uses a low-dimensional linear subspace to approximate soft-tissue deformation and relates the subspace coefficients to the changing pose of the body. Dyna uses a second-order auto-regressive model that predicts soft-tissue deformations based on previous deformations, the velocity and acceleration of the body, and the angular velocities and accelerations of the limbs. Dyna also models how deformations vary with a person's body mass index (BMI), producing different deformations for people with different shapes. Dyna realistically represents the dynamics of soft tissue for previously unseen subjects and motions. We provide tools for animators to modify the deformations and apply them to new stylized characters.
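The second-order autoregressive structure described above can be sketched as a recurrence: the new subspace coefficients depend on the two previous coefficient vectors plus motion features. The matrices `A1`, `A2`, and `B` below are invented for illustration and are not learned from data as in the paper; the single feature stands in for the body and limb velocities and accelerations.

```python
def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def ar2_step(d_prev, d_prev2, features, A1, A2, B):
    """One step: new coefficients from the two previous ones plus motion features."""
    terms = (matvec(A1, d_prev), matvec(A2, d_prev2), matvec(B, features))
    return [sum(values) for values in zip(*terms)]

# Made-up matrices for a 2-coefficient subspace and 1 motion feature.
A1 = [[1.6, 0.0], [0.0, 1.2]]
A2 = [[-0.8, 0.0], [0.0, -0.5]]
B = [[0.1], [0.05]]

def rollout(accelerations):
    """Run the recurrence over a sequence of scalar acceleration features."""
    d_prev, d_prev2 = [0.0, 0.0], [0.0, 0.0]
    trajectory = []
    for accel in accelerations:
        d = ar2_step(d_prev, d_prev2, [accel], A1, A2, B)
        trajectory.append(d)
        d_prev2, d_prev = d_prev, d
    return trajectory

# A single acceleration impulse followed by rest.
traj = rollout([1.0] + [0.0] * 39)
```

With these values the impulse produces a damped oscillation in the coefficients. A jiggle that rings and settles is the kind of behavior a second-order model can express and a purely pose-driven one cannot.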
Related Real-Time Weighted Pose-Space Deformation on the GPU · Learning Skeletal Articulations with Neural Blend Shapes · STAR: Sparse Trained Articulated Human Body Regressor · Avatar Reshaping and Automatic Rigging Using a Deformable Model
How to read this
- Category
- Method: data-driven soft-tissue dynamics model
- Contributions
- Dyna, a model of pose-dependent soft-tissue deformation learned from a large set of 4D body scans registered to a template mesh
- A low-dimensional linear subspace for soft-tissue deformation tied to body pose via a second-order autoregressive model using prior deformations plus body and limb velocity and acceleration
- Modeling of how deformations vary with body mass index, with tools for animators to edit deformations and retarget them to stylized characters
- Context
- Closely tied to the SMPL skinned body model (Loper et al.) from the same group and year, adding learned dynamic soft-tissue behavior on top of a pose-dependent body model. Builds on: SMPL: A Skinned Multi-Person Linear Model
- Correctness
- Learned from many scans of a limited subject set; generalization to unseen subjects/motions is claimed but bounded by training coverage, and the linear subspace plus autoregressive predictor may smooth out fast or extreme dynamics.
- Clarity
- Clearly written SIGGRAPH paper; a first pass conveys the model, a second pass clarifies the autoregressive formulation.
- How to read it
- Focus on how pose kinematics map to subspace coefficients and the role of the second-order autoregressive term; second pass for the BMI conditioning and the registration pipeline.
Skinning / ML Deformation
-
System for creating personalized, fully rigged 3D face avatars from hand-held video by adapting a blendshape template and learning a regressor for fine-scale detail synthesis.
Abstract
We present a complete pipeline for creating fully rigged, personalized 3D facial avatars from hand-held video. Our system faithfully recovers facial expression dynamics of the user by adapting a blendshape template to an image sequence of recorded expressions using an optimization that integrates feature tracking, optical flow, and shape from shading. Fine-scale details such as wrinkles are captured separately in normal maps and ambient occlusion maps. From this user- and expression-specific data, we learn a regressor for on-the-fly detail synthesis during animation to enhance the perceptual realism of the avatars. Our system demonstrates that the use of appropriate reconstruction priors yields compelling face rigs even with a minimalistic acquisition system and limited user assistance. This facilitates a range of new applications in computer animation and consumer-level online communication based on personalized avatars. We present realtime application demos to validate our method.
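For readers new to blendshape templates, the rig that the pipeline adapts can be thought of as a neutral mesh plus weighted per-expression offsets. The sketch below shows only that evaluation step, not the paper's tracking, optical-flow, or shape-from-shading optimization; the two-vertex mesh and its deltas are invented.

```python
def evaluate_blendshapes(neutral, deltas, weights):
    """Return neutral + sum_i weights[i] * deltas[i] for every vertex."""
    result = [list(vertex) for vertex in neutral]
    for weight, delta in zip(weights, deltas):
        for vi, offset in enumerate(delta):
            for axis in range(3):
                result[vi][axis] += weight * offset[axis]
    return result

# Hypothetical two-vertex "face" with two expression deltas.
neutral = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
deltas = [
    [(0.0, 1.0, 0.0), (0.0, 0.0, 0.0)],  # expression 0 moves vertex 0 up
    [(0.0, 0.0, 0.0), (0.0, 0.0, 2.0)],  # expression 1 moves vertex 1 forward
]
posed = evaluate_blendshapes(neutral, deltas, [0.5, 0.25])
```

Fitting a rig to video then amounts to estimating the weights per frame and adjusting the shapes themselves so that they match the recorded person.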
Related Example-Based Facial Rigging · Reusable Facial Rigging and Animation: Create Once, Use Many · Animatomy: An Animator-Centric, Anatomically Inspired System for 3D Facial Modeling, Animation and Transfer · Neural Face Rigging for Animating and Retargeting Facial Meshes in the Wild
How to read this
- Category
- Method / system: personalized facial avatar creation from video
- Contributions
- A complete pipeline producing fully rigged, personalized 3D facial avatars from hand-held video
- Blendshape-template adaptation via an optimization integrating feature tracking, optical flow, and shape from shading, with wrinkle-scale detail captured in normal and ambient-occlusion maps
- A learned regressor for on-the-fly detail synthesis during animation, demonstrated in real-time application demos
- Context
- Sits in the consumer-level facial-capture and blendshape-rig lineage, emphasizing reconstruction priors that let a minimalist acquisition setup still yield compelling rigs.
- Correctness
- Quality hinges on appropriate reconstruction priors and on the recorded expression range; with a minimalistic acquisition setup and limited user assistance, results depend on capture coverage and may degrade for expressions or details not observed in the input.
- Clarity
- Readable end-to-end system description; one pass conveys the pipeline, a second covers the optimization and detail-regressor stages.
- How to read it
- Read for the staged optimization (tracking plus flow plus shape-from-shading) and the offline-detail vs runtime-synthesis split; second pass if you need to reproduce the regressor.
Facial / Rigging
- Efficient and Stable Approach to Elasticity and Collisions for Hair Animation DigiPro DreamWorks 15 cites
Hybrid direct and projective iterative algorithm for hair simulation with a comprehensive elasticity model and stable collision handling.
Abstract
Presents a hybrid direct and projective iterative algorithm for physically-based hair simulation that combines elasticity with collision and friction response. The method uses a novel treatment of bending and twisting forces in discrete elastic rods with a well-defined continuum limit, and integrates collision response implicitly without introducing artificial strain. Timesteps can be as large as one frame of 24 Hz animation (about 1/24 s).
Related Small Steps in Physics Simulation · Robust Treatment of Collisions, Contact and Friction for Cloth Animation · Better Collisions and Faster Cloth for Pixar's Coco · Adaptive Nonlinearity for Collisions in Complex Rod Assemblies
How to read this
- Category
- Method: physically-based hair simulation algorithm
- Contributions
- A hybrid direct-and-projective iterative algorithm for physically-based hair that couples elasticity with collision and friction response
- A novel treatment of bending and twisting forces for discrete elastic rods with a well-defined continuum limit
- Implicit collision response integrated without introducing artificial strain, allowing timesteps as large as a single 24 Hz frame
- Context
- Advances discrete-elastic-rod hair simulation and follows the mass-spring hair lineage (Selle et al., A Mass Spring Model for Hair Simulation) with a more complete elasticity and collision treatment. Builds on: A Mass Spring Model for Hair Simulation
- Correctness
- The stability and large-timestep claims rest on the hybrid solver and implicit, strain-free collision handling; as a DigiPro production paper, the demonstration is qualitative and production-oriented, so judge robustness on the shown cases rather than on formal error analysis.
- Clarity
- Technically dense around the rod forces and collision coupling; expect a second pass for the formulation.
- How to read it
- First pass for the hybrid direct/projective scheme and why it stays stable at a frame-size step; second and possibly third pass on the bending/twisting force model and implicit collision integration.
CFX
-
Post-production tool allowing directors to blend between different takes of a facial performance, enabling new emotion transitions via image-space blending.
Abstract
FaceDirector is a method for continuously blending between multiple recorded video takes of an actor performing the same scene with different facial expressions or emotional states, enabling a director to specify arbitrary weighted combinations and smooth transitions in post-production. The approach contributes a robust nonlinear audio-visual synchronization technique that combines normalized facial landmarks and MFCC audio cues in a graph-based cost matrix, removing self-similar ambiguous regions through local cost matrix collapsing to obtain dense frame correspondences between takes. A seamless spatio-temporal blending stage then interpolates timing, facial expression, and local appearance using optical flow warping guided by landmark priors and mask-based compositing back into one source video. The method operates entirely in 2D image space without 3D facial reconstruction, and is demonstrated on emotion transition, performance correction, and timing control applications.
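The synchronization step above can be approximated with classic dynamic time warping over a cost matrix that mixes a visual and an audio distance. This is a swapped-in, simpler technique: the paper describes a graph-based formulation with local cost-matrix collapsing, which is not reproduced here. The frame dictionaries, the `audio_weight` parameter, and the one-dimensional features are hypothetical.

```python
import math

def frame_cost(a, b, audio_weight=0.5):
    """Blend landmark distance and audio-feature distance for one frame pair."""
    visual = math.dist(a["landmarks"], b["landmarks"])
    audio = math.dist(a["audio"], b["audio"])
    return (1.0 - audio_weight) * visual + audio_weight * audio

def dtw_path(take_a, take_b, audio_weight=0.5):
    """Return the minimum-cost monotonic frame alignment between two takes."""
    n, m = len(take_a), len(take_b)
    inf = float("inf")
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = frame_cost(take_a[i - 1], take_b[j - 1], audio_weight)
            acc[i][j] = cost + min(acc[i - 1][j - 1], acc[i - 1][j], acc[i][j - 1])
    # Backtrack from the end, preferring the diagonal step on ties.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = {
            (i - 1, j - 1): acc[i - 1][j - 1],
            (i - 1, j): acc[i - 1][j],
            (i, j - 1): acc[i][j - 1],
        }
        i, j = min(steps, key=steps.get)
    path.reverse()
    return path

def make_take(values):
    """Build toy frames whose landmark and audio features are one number each."""
    return [{"landmarks": (v,), "audio": (v,)} for v in values]

take_a = make_take([0.0, 1.0, 2.0, 3.0])
take_b = make_take([0.0, 1.0, 1.0, 2.0, 3.0])  # same content, one frame held longer
alignment = dtw_path(take_a, take_b)
```

The alignment maps frame 1 of the first take to both held frames of the second. This is the dense correspondence needed before any blending can happen.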
Related FaceLab: Scalable Facial Performance Capture for Visual Effects · 3D Shape Regression for Real-Time Facial Animation · It's a UVN Face Rig, Charlie Brown: Facial Techniques for Peanuts · Rigid Stabilization of Facial Expressions
How to read this
- Category
- Method: video-based facial performance blending tool
- Contributions
- FaceDirector, a method to continuously blend between multiple recorded takes of an actor performing the same scene, with director-specified weights and smooth transitions in post-production
- A robust nonlinear audio-visual synchronization technique combining normalized facial landmarks and MFCC audio cues in a graph-based cost matrix, with local cost-matrix collapsing to remove self-similar ambiguities
- A seamless spatio-temporal blending stage using landmark-guided optical-flow warping and mask-based compositing, working entirely in 2D image space without 3D reconstruction
- Context
- Relates to image-based facial editing and performance retiming, deliberately avoiding 3D face reconstruction by operating purely in 2D with audio-visual alignment.
- Correctness
- Demonstrated on emotion transition, performance correction, and timing control; because it is 2D optical-flow and compositing based, results depend on take similarity and landmark/flow accuracy and can break under large pose or appearance differences between takes.
- Clarity
- Approachable; a first pass conveys the align-then-blend idea, a second clarifies the synchronization cost matrix.
- How to read it
- Focus on the audio-visual synchronization (landmarks plus MFCC, cost-matrix collapsing) and the warping/compositing pipeline; second pass on synchronization if you care about robustness across takes.
Facial
-
Fast contact determination for intersecting deformable solids with detailed surfaces, using a barycentric BVH and a topological flood to find contact regions.
Abstract
Presents a fast contact determination method for intersecting deformable solids with detailed surface geometry. A barycentric bounding-volume hierarchy and a topological flood process discover contact points and regions efficiently, enabling robust collision response for soft bodies in interactive simulation.
Related Robust Treatment of Degenerate Elements in Interactive Corotational FEM Simulations · Robust Treatment of Collisions, Contact and Friction for a Skinned Cloth Simulation · Untangling Cloth · Fast Cloth Simulation on Moving Humanoids
How to read this
- Category
- Method: contact determination for deformable solids
- Contributions
- A fast contact-determination method for intersecting deformable solids with detailed surface geometry
- A barycentric bounding-volume hierarchy plus a topological flood process that efficiently discovers contact points and regions
- Robust collision response enabling soft-body interaction at interactive rates
- Context
- Builds on robust collision/contact handling for deformables (Bridson et al., Robust Treatment of Collisions, Contact and Friction for Cloth Animation), targeting volumetric solids rather than cloth. Builds on: Robust Treatment of Collisions, Contact and Friction for Cloth Animation
- Correctness
- The approach assumes intersections can be resolved by flooding contact regions over a barycentric BVH; performance and correctness depend on mesh detail and intersection depth, and as an MIG method it is shown on interactive cases rather than exhaustively stress-tested.
- Clarity
- Focused and technical; the BVH plus flood idea reads quickly, the geometric details need a second pass.
- How to read it
- Read first for what the barycentric BVH buys you and how the topological flood identifies contact regions; second pass on the region-discovery details if implementing.
CFX
-
Fast fully-automatic morphing algorithm creates simulatable flesh and muscle models for human and humanoid faces from a target surface mesh alone.
Abstract
Presents a fast, fully automatic method for creating simulatable facial models from target surface meshes. Uses a high-fidelity anatomical template with muscles and skeleton, automatically detects 17 landmarks and feature curves on the target, and morphs the template to match using Poisson equation-based deformation. The resulting models contain complete internal anatomy including 41 facial muscles and can be simulated to generate a wide range of expressions.
Related Lessons from the Evolution of an Anatomical Facial Muscle Model · Art-Directed Muscle Simulation for High-End Facial Animation · Building Accurate Physics-based Face Models from Data · Automatic Determination of Facial Muscle Activations from Sparse Motion Capture Marker Data
How to read this
- Category
- Method: automatic anatomical face-model generation
- Contributions
- A fast, fully automatic method to create simulatable facial models from only a target surface mesh
- Automatic detection of 17 landmarks and feature curves plus Poisson-equation-based morphing of a high-fidelity anatomical template (skeleton and 41 facial muscles) to the target
- Resulting models carry complete internal anatomy and can be simulated to generate a wide range of expressions
- Context
- Builds on anatomical/physics-based facial simulation (Sifakis et al., automatic determination of facial muscle activations), automating the otherwise manual construction of the flesh-and-muscle model. Builds on: Automatic Determination of Facial Muscle Activations from Sparse Motion Capture Marker Data
- Correctness
- Relies on a single high-fidelity template morphed to each target, so fidelity for human and humanoid faces depends on landmark/curve detection and how well target anatomy matches the template; faces far from the template's anatomy may be approximated rather than truly individualized.
- Clarity
- Clear for a simulation audience; a first pass conveys the template-morph pipeline, a second covers the Poisson morphing and muscle setup.
- How to read it
- Focus on the landmark/feature-curve detection and the Poisson-based template morph; second pass on the anatomical template and simulation step if you intend to drive expressions.
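To make the Poisson-style morph more concrete, here is a minimal sketch that spreads known landmark displacements to unconstrained vertices by solving a discrete Laplace equation on a mesh graph. The graph, the `harmonic_displacements` helper, the scalar displacements, and the Gauss-Seidel solver are all illustrative assumptions; the paper's actual formulation and solver may differ.

```python
def harmonic_displacements(neighbors, landmarks, iterations=500):
    """Spread landmark displacements over a graph by solving Laplace's equation.

    neighbors: dict vertex -> list of adjacent vertices (hypothetical mesh graph)
    landmarks: dict vertex -> fixed scalar displacement (Dirichlet constraints)
    """
    disp = {v: landmarks.get(v, 0.0) for v in neighbors}
    for _ in range(iterations):
        for v, adj in neighbors.items():
            if v in landmarks:
                continue  # landmark displacements stay fixed
            # Gauss-Seidel update: a free vertex takes the mean of its neighbours
            disp[v] = sum(disp[n] for n in adj) / len(adj)
    return disp


# A five-vertex chain standing in for a strip of template vertices.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
result = harmonic_displacements(chain, {0: 0.0, 4: 1.0})
```

With landmarks pinned at both ends of the chain, the interior vertices settle on a smooth ramp between the two constrained values, which is the behaviour a template morph wants between detected landmarks.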
Facial / Muscles
-
UVN-space facial rig for The Peanuts Movie letting features slide along curved head volumes while preserving 2D shape language, developed with Autodesk for Maya.
Abstract
The Peanuts Movie required Blue Sky Studios to bring Charles Schulz's iconic characters into 3D while preserving the clean profiles, smooth round head shapes, and extreme facial expressions of the original comic. Traditional blend-shape methods produced surface distortions and volume loss from linear interpolation, so the team transformed head mesh vertices into a UVN coordinate space defined by a NURBS surface built into each head, letting linear translations in UVN space become volume-preserving curved paths in Cartesian space. A procedural Peanuts factory reshaped Charlie Brown's NURBS surface to fit each kid's head so his sculpted expressions could be reused, and art-directed character views provided per-viewing-angle, on-model starting poses to match Schulz's 2D aesthetic.
Related Reconstruction of Personalized 3D Face Rigs from Monocular Video · Interactive Sculpting of Digital Faces Using an Anatomical Modeling Paradigm · FaceLab: Scalable Facial Performance Capture for Visual Effects · Creating an Actor-Specific Facial Rig from Performance Capture
How to read this
- Category
- Production talk / rigging technique breakdown
- Contributions
- A UVN-space facial rig (built with Autodesk for Maya) that maps head-mesh vertices into a NURBS-defined UV plus normal coordinate space so linear UVN translations become volume-preserving curved paths in Cartesian space
- Preservation of the Peanuts 2D shape language (clean profiles, round heads, extreme expressions) while moving the characters into 3D, avoiding the surface distortion and volume loss of linear blendshapes
- A procedural 'Peanuts factory' that reshapes the NURBS surface per character to reuse Charlie Brown's sculpted expressions, with art-directed per-viewing-angle on-model poses
- Context
- Extends direct-manipulation blendshape rigging (Lewis & Anjyo, Direct Manipulation Blendshapes) by reparameterizing facial motion into a curved UVN volume to keep stylized silhouettes intact. Builds on: Direct Manipulation Blendshapes
- Correctness
- Studio practice for a specific stylized show, not peer-reviewed; results are production-proven on The Peanuts Movie, and the UVN approach is tailored to round, simple head topology rather than a general claim about all face rigs.
- Clarity
- Very readable as a craft talk; a single pass conveys the UVN idea and motivation.
- How to read it
- Read once for the UVN reparameterization intuition (why curved paths preserve volume) and the per-angle on-model trick; no deep second pass needed unless adapting it to your own stylized character.
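The intuition that a straight line in UVN space becomes a curved, surface-hugging path in Cartesian space can be checked with a toy example. The sketch below uses a sphere as a stand-in for the NURBS head surface; `RADIUS`, `uvn_to_cartesian`, and the chosen endpoints are assumptions for illustration, not the studio's implementation.

```python
import math

RADIUS = 1.0  # hypothetical head radius; a sphere stands in for the NURBS surface


def uvn_to_cartesian(u, v, n):
    """Map (u, v, n) to 3D: u is longitude, v is latitude, n is offset along the normal."""
    r = RADIUS + n
    return (r * math.cos(v) * math.cos(u),
            r * math.cos(v) * math.sin(u),
            r * math.sin(v))


def lerp(a, b, t):
    return tuple(x + (y - x) * t for x, y in zip(a, b))


def length(p):
    return math.sqrt(sum(x * x for x in p))


start_uvn = (0.0, 0.0, 0.0)         # feature at the front of the head
end_uvn = (math.pi / 2, 0.0, 0.0)   # same feature slid a quarter turn around

# Blend in UVN space, then map to Cartesian: the feature stays on the surface.
mid_uvn = uvn_to_cartesian(*lerp(start_uvn, end_uvn, 0.5))
# Blend the Cartesian endpoints directly, as a linear blendshape would.
mid_xyz = lerp(uvn_to_cartesian(*start_uvn), uvn_to_cartesian(*end_uvn), 0.5)

uvn_radius = length(mid_uvn)   # remains at RADIUS
xyz_radius = length(mid_xyz)   # cuts through the head, so it is smaller than RADIUS
```

The halfway point of the Cartesian blend sits well inside the sphere, which is the volume loss the talk attributes to linear blendshapes, while the UVN blend keeps the feature on the surface.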
Facial / Rigging
-
Gaussian process regression inverts the rig function, mapping low-level joint positions or surface geometry back to animator-friendly rig control values in real time.
Abstract
This paper presents a real-time method for inverting the rig function, the mapping from a character's animation rig controls to its underlying skeleton or mesh. Treating the rig as a black box, the approach uses Gaussian Process Regression with a multiquadric kernel to learn an offline approximation of the inverse mapping from sparse animator-constructed example poses, and additionally learns the Jacobian of the rig function so results can be refined via gradient descent to accurately match target joint positions. Because the mapping is learned from animator data, the predicted rig parameters resemble settings an animator would naturally choose rather than drifting to undesirable configurations. The method is general and applies to quadruped, biped, deformable mesh, and facial rigs, enabling motion capture, motion editing, and full-body inverse kinematics to be applied to rigged characters for immediate editing.
Related Sketch-based Motion Editing for Articulated Characters · Retargeting Motion to New Characters · Normalized Euclidean Distance Matrices for Human Motion Retargeting · Dog Code: Human to Quadruped Embodiment Using Shared Codebooks
How to read this
- Category
- Method: learned inverse-rig mapping for animation
- Contributions
- Treats the rig as a black box and learns an offline approximation of its inverse with Gaussian Process Regression using a multiquadric kernel.
- Learns the Jacobian of the rig function so predictions can be refined by gradient descent to match target joint positions.
- Generalizes across quadruped, biped, deformable-mesh and facial rigs, enabling mocap, motion editing and full-body IK on rigged characters.
- Context
- A data-driven take on rig inversion and inverse kinematics, learning the mapping from sparse animator-built example poses rather than hand-deriving it.
- Correctness
- Quality hinges on having representative animator example poses and a learnable inverse; the GPR fit plus Jacobian refinement is what keeps predictions animator-like, so coverage of the pose space and rig nonlinearity are the things to watch.
- Clarity
- Accessible at a high level; a first pass conveys the inverse-rig idea, do a second pass for the GPR kernel and Jacobian-refinement formulation.
- How to read it
- First pass for the black-box-inverse framing and why animator-data priors matter; second pass on the regression plus gradient-descent refinement if you intend to reimplement.
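The regress-then-refine idea can be sketched on a toy rig. Below, a single control angle drives one joint position, the inverse map is fitted by kernel interpolation with a multiquadric kernel (the mean of a noise-free Gaussian process), and the estimate is refined by gradient descent. The `rig` function, the example poses, the kernel width, and the use of finite differences in place of a learned Jacobian are all assumptions made for illustration.

```python
import math


def rig(c):
    """Toy rig function (assumption): one control angle drives one joint position."""
    return (math.cos(c), math.sin(c))


def multiquadric(a, b, eps=1.0):
    r2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.sqrt(r2 + eps * eps)


def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x


# Offline: fit the inverse map from example poses (joint position -> control value).
controls = [0.0, 0.4, 0.8, 1.2, 1.6]
poses = [rig(c) for c in controls]
K = [[multiquadric(p, q) for q in poses] for p in poses]
weights = solve(K, controls)


def predict(target):
    return sum(w * multiquadric(target, p) for w, p in zip(weights, poses))


def refine(c, target, steps=50, lr=0.5, h=1e-5):
    """Gradient descent on squared joint error; finite differences stand in for a learned Jacobian."""
    for _ in range(steps):
        pos = rig(c)
        fwd = rig(c + h)
        jac = [(f - p) / h for f, p in zip(fwd, pos)]
        grad = sum(j * (p - t) for j, p, t in zip(jac, pos, target))
        c -= lr * grad
    return c


target = rig(1.0)                   # a pose that is not among the examples
initial = predict(target)           # regression estimate of the control value
refined = refine(initial, target)   # pulled toward the exact inverse
```

The regression step supplies a starting point shaped by the example poses, and the refinement step only has to close the remaining gap to the target joint position.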
Rigging / Retargeting
-
Convolutional autoencoder learning a compact motion manifold from raw motion capture data for motion cleanup, gap filling, and interpolation.
Abstract
This technical brief presents a method for learning a manifold of human motion data using a deep convolutional autoencoder trained on the complete CMU motion capture database. Motion is represented as a time series of joint positions, and the network learns forward and inverse projection operators that map data onto a bounded motion manifold using one-dimensional temporal convolution, max pooling, and denoising autoencoding. The learned manifold acts as a prior over valid human motion and supports applications such as fixing corrupted or noisy Kinect captures, filling in missing marker data, interpolating between motions without blending artefacts, and computing temporally invariant distances between motions. The approach scales to large datasets with little preprocessing and projects motion in a fraction of a millisecond at runtime.
Related A Deep Learning Framework for Character Motion Synthesis and Editing · Neural Animation Layering for Synthesizing Martial Arts Movements · Data-Driven Autocompletion for Keyframe Animation · Dynamic Hair Modeling from Monocular Videos Using Deep Neural Networks
How to read this
- Category
- Method: learned motion manifold (convolutional autoencoder)
- Contributions
- Trains a deep convolutional autoencoder on the full CMU mocap database to learn a bounded motion manifold acting as a prior over valid human motion.
- Learns forward and inverse projection operators using 1D temporal convolution, max pooling and denoising autoencoding.
- Supports cleanup of corrupted/noisy Kinect captures, missing-marker fill-in, artefact-free interpolation, and temporally invariant motion distances at sub-millisecond runtime.
- Context
- An early deep-learning approach to data-driven motion modelling, representing motion as joint-position time series and using the manifold as a learned motion prior.
- Correctness
- The manifold is only as broad as the CMU data it is trained on, so out-of-distribution or highly stylized motion may project poorly; presented as a technical brief, so treat the application demos as illustrative rather than exhaustively evaluated.
- Clarity
- Accessible as a brief; a first pass conveys the manifold-prior idea, a second pass clarifies the temporal-convolution architecture.
- How to read it
- First pass for the manifold-as-prior concept and its applications; second pass on the autoencoder structure and projection operators if building motion-synthesis tooling.
Motion Synthesis
-
Presents real-time motion retargeting and interactive motion blending within the Golaem Crowd plugin, reducing asset count for production crowd shots.
Abstract
Presents motion retargeting for crowd simulation using an intermediate skeleton representation (Golaem Skeleton) based on hierarchical bone chains. The system retargets animations across different character morphologies in real time by defining skeleton-motion mappings and solving inverse kinematics (an analytic solution and FABRIK), with caching optimizations to support thousands of characters in production crowds.
Related Mobilizing Mocap, Motion Blending, and Mayhem: Rig Interoperability for Crowd Simulation on Incredibles 2 · Normalized Euclidean Distance Matrices for Human Motion Retargeting · Robust Marker Trajectory Repair for MOCAP Using Kinematic Reference · Dog Code: Human to Quadruped Embodiment Using Shared Codebooks
How to read this
- Category
- Production system: crowd motion retargeting (Golaem)
- Contributions
- Introduces an intermediate skeleton representation (Golaem Skeleton) of hierarchical bone chains for mapping motion across differing character morphologies.
- Performs real-time retargeting using analytic and FABRIK inverse-kinematics solvers with caching to scale to thousands of characters.
- Reduces asset count for production crowd shots via interactive motion blending inside the Golaem Crowd plugin.
- Context
- An applied, crowd-scale extension of Gleicher's Retargeting Motion to New Characters, adapting per-character retargeting to thousands of agents in production. Builds on: Retargeting Motion to New Characters
- Correctness
- An engineering/production paper: validated by use in crowd shots rather than formal benchmarks, so the intermediate-skeleton mapping and IK choices trade some per-character fidelity for the throughput needed at crowd scale.
- Clarity
- Accessible and practical; a single pass conveys the pipeline, with a second pass only needed for the skeleton-mapping and caching details.
- How to read it
- First pass for the intermediate-skeleton and caching strategy; revisit the IK-solver specifics only if implementing a crowd retargeting pipeline.
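For readers unfamiliar with FABRIK, one of the two IK solvers named above, the following is a minimal 2D sketch of the algorithm on a single bone chain. It is a generic textbook-style version, not Golaem's implementation; the `fabrik` and `place` helpers, the tolerance, and the example chain are assumptions, and caching and the intermediate skeleton are left out.

```python
import math


def place(anchor, toward, length):
    """Return the point at `length` from `anchor` in the direction of `toward`.

    Assumes the two points are distinct.
    """
    t = length / math.dist(anchor, toward)
    return tuple(a + (b - a) * t for a, b in zip(anchor, toward))


def fabrik(joints, target, tolerance=1e-4, max_iterations=50):
    """Solve one bone chain; `joints` is a list of (x, y) points, root first."""
    joints = [tuple(j) for j in joints]
    lengths = [math.dist(a, b) for a, b in zip(joints, joints[1:])]
    root = joints[0]
    for _ in range(max_iterations):
        if math.dist(joints[-1], target) <= tolerance:
            break
        # Backward pass: pin the end effector to the target and walk to the root.
        joints[-1] = tuple(target)
        for i in range(len(joints) - 2, -1, -1):
            joints[i] = place(joints[i + 1], joints[i], lengths[i])
        # Forward pass: pin the root back in place and walk to the end effector.
        joints[0] = root
        for i in range(1, len(joints)):
            joints[i] = place(joints[i - 1], joints[i], lengths[i - 1])
    return joints


arm = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
goal = (1.5, 1.5)
solved = fabrik(arm, goal)
```

Each pass only repositions points along straight lines, with no matrix solves, which is one reason this style of solver is attractive when the same chain has to be solved for many characters per frame.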
Retargeting
-
Two-layer skinning that applies position-based dynamics constraints on top of LBS to achieve volume preservation and soft jiggle in real time.
Abstract
In this paper, we introduce a two‐layered approach addressing the problem of creating believable mesh‐based skin deformation. For each frame, the skin is first deformed with a classic linear blend skinning approach, which usually leads to unsightly artefacts such as the well‐known candy‐wrapper effect and volume loss. Then we enforce some geometric constraints which displace the positions of the vertices to mimic the behaviour of the skin and achieve effects like volume preservation and jiggling. We allow the artist to control the amount of jiggling and the area of the skin affected by it. The geometric constraints are solved using a position‐based dynamics (PBDs) schema. We employ a graph colouring algorithm for parallelizing the computation of the constraints. Being based on PBDs guarantees efficiency and real‐time performances while enduring robustness and unconditional stability. We demonstrate the visual quality and the performance of our approach with a variety of skeleton‐driven soft body characters.
Related NeuroSkinning: Automatic Skin Binding for Production Characters with Deep Graph Networks · Fast Complementary Dynamics via Skinning Eigenmodes · Geodesic Voxel Binding for Production Character Meshes · Learning Skeletal Articulations with Neural Blend Shapes
How to read this
- Category
- Method: a skinning algorithm (two-layer LBS + PBD)
- Contributions
- Two-layered skin deformation that first applies linear blend skinning, then corrects it with geometric constraints solved via position-based dynamics.
- Achieves volume preservation and jiggling while reducing classic LBS artefacts such as the candy-wrapper effect and volume loss.
- Parallelizes constraint solving with a graph-colouring algorithm and gives the artist control over jiggle amount and affected skin region, in real time.
- Context
- Builds skeleton-driven soft-body skinning on top of Mueller et al.'s Position Based Dynamics, layering PBD constraints over standard linear blend skinning. Builds on: Position Based Dynamics
- Correctness
- Demonstrated on a variety of skeleton-driven soft-body characters with real-time, unconditionally stable PBD; results are geometric/plausible rather than physically rigorous, so it targets believable jiggle and volume, not accurate soft-tissue mechanics.
- Clarity
- Accessible if you know LBS and PBD; a first pass conveys the two-layer idea, a second pass for the constraint formulation and graph-colouring parallelism.
- How to read it
- First pass for the layering concept and artist controls; second pass on the PBD constraints and parallel solver if integrating into a real-time rig.
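The second layer can be illustrated with the simplest PBD constraint. The sketch below starts from made-up vertex positions standing in for a linear blend skinning result, groups distance constraints with a greedy colouring so that no two constraints in a group share a vertex, and then projects them iteratively. The helper names, equal particle masses, and the choice of distance constraints are assumptions; the paper's constraint set for volume and jiggle is richer than this.

```python
import math


def project_distance(positions, i, j, rest_length, stiffness=1.0):
    """Move particles i and j (equal mass assumed) toward their rest distance."""
    pi, pj = positions[i], positions[j]
    delta = [b - a for a, b in zip(pi, pj)]
    dist = math.sqrt(sum(d * d for d in delta))
    if dist == 0.0:
        return  # direction undefined; skip this constraint
    corr = 0.5 * stiffness * (dist - rest_length) / dist
    positions[i] = [a + corr * d for a, d in zip(pi, delta)]
    positions[j] = [b - corr * d for b, d in zip(pj, delta)]


def colour_constraints(constraints):
    """Greedy colouring: constraints that share a particle never share a colour."""
    groups = []  # constraints inside one group touch disjoint particles
    for c in constraints:
        for group in groups:
            if all(not set(c[:2]) & set(other[:2]) for other in group):
                group.append(c)
                break
        else:
            groups.append([c])
    return groups


# Hypothetical output of the skinning layer: a chain whose vertices ended up
# too close together, as happens when skinning collapses a region.
skinned = [[0.0, 0.0], [0.6, 0.0], [1.2, 0.0], [1.8, 0.0]]
constraints = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)]  # (i, j, rest length)

groups = colour_constraints(constraints)
for _ in range(100):                  # solver iterations
    for group in groups:              # colours are processed one after another
        for i, j, rest in group:      # within a colour, order does not matter
            project_distance(skinned, i, j, rest)
```

Because constraints in the same colour touch disjoint vertices, a parallel implementation could process each group concurrently without write conflicts; here they simply run in a loop.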
Skinning
-
Real-time method adding plausible wrinkle details to coarse, noisy game cloth animation by estimating per-triangle stretch tensors and tracing wrinkle paths across compressed regions.
Abstract
We present a real-time method for adding dynamic, believable wrinkles to coarse, low triangle count cloth animation typical of video games, where no reference shape exists and the input animation is noisy. The approach uses a two stage stretch tensor estimation process: a graph-cut segmentation first detects spatially and temporally coherent compressing, stable, and stretching surface patches, and these motion patterns are then used to build a per-triangle temporally adaptive reference shape and stretch tensor. Wrinkle paths are traced across areas of high compression on the piecewise constant tensor field and evolved with a global optimization for temporal persistence. Wrinkle geometry and normals are generated on the fly using the GPU tessellation unit and fragment shader, sustaining 60 frames per second on current console and PC hardware.
Related Adaptive Anisotropic Remeshing for Cloth Simulation · Gravity Preloading for Maintaining Hair Shape Using the Simulator as a Closed-Box Function · Simulating Cloth Using Bilinear Elements · Dynamic Deformables: Implementation and Production Practicalities
How to read this
- Category
- Method: real-time wrinkle synthesis for coarse cloth
- Contributions
- Adds dynamic, believable wrinkles to coarse, low-triangle game cloth animation with no reference shape and noisy input.
- Uses a two-stage stretch-tensor estimation: graph-cut segmentation of coherent compressing/stable/stretching patches, then a per-triangle temporally adaptive reference shape and stretch tensor.
- Traces wrinkle paths across high-compression regions, enforces temporal persistence via global optimization, and generates geometry/normals on the GPU at 60 fps on console and PC hardware.
- Context
- A real-time, reference-free counterpart to example-based wrinkle methods such as Wang et al.'s Example-Based Wrinkle Synthesis for Clothing Animation. Builds on: Example-Based Wrinkle Synthesis for Clothing Animation
- Correctness
- Targets the hard case of noisy keyframed/coarse cloth with no rest reference, so wrinkles are plausible and temporally coherent rather than physically derived; quality depends on the stretch-tensor segmentation tracking the true compression regions.
- Clarity
- Moderately dense; a first pass conveys the pipeline stages, a second pass for the tensor estimation and wrinkle-path optimization.
- How to read it
- First pass for the two-stage tensor-to-wrinkle pipeline; second pass on the graph-cut segmentation and temporal optimization if implementing a real-time cloth-detail shader.
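As a small illustration of what a per-triangle stretch measure looks like, the sketch below computes the principal stretches of a 2D triangle relative to a reference shape and labels it as compressing, stretching, or stable. It assumes the reference triangle is already known and uses a plain threshold; the paper instead builds a temporally adaptive reference and segments patches with graph cuts, neither of which is reproduced here. All function names and the tolerance are hypothetical.

```python
import math


def principal_stretches(rest, deformed):
    """Return (min, max) principal stretch of a 2D triangle relative to `rest`."""
    def edges(tri):
        (x0, y0), (x1, y1), (x2, y2) = tri
        # Entries of the 2x2 matrix whose columns are the two edge vectors.
        return (x1 - x0, x2 - x0, y1 - y0, y2 - y0)

    a, b, c, d = edges(rest)        # Dm = [[a, b], [c, d]]
    p, q, r, s = edges(deformed)    # Ds = [[p, q], [r, s]]
    det = a * d - b * c
    ia, ib, ic, id_ = d / det, -b / det, -c / det, a / det   # inverse of Dm
    # Deformation gradient F = Ds * Dm^-1
    f00, f01 = p * ia + q * ic, p * ib + q * id_
    f10, f11 = r * ia + s * ic, r * ib + s * id_
    # Symmetric tensor C = F^T F; its eigenvalues are the squared stretches.
    c00 = f00 * f00 + f10 * f10
    c01 = f00 * f01 + f10 * f11
    c11 = f01 * f01 + f11 * f11
    mean = 0.5 * (c00 + c11)
    radius = math.sqrt((0.5 * (c00 - c11)) ** 2 + c01 * c01)
    return math.sqrt(max(mean - radius, 0.0)), math.sqrt(mean + radius)


def classify(rest, deformed, tolerance=0.05):
    """Label a triangle by its principal stretches (simple thresholds, an assumption)."""
    low, high = principal_stretches(rest, deformed)
    if low < 1.0 - tolerance:
        return "compressing"
    if high > 1.0 + tolerance:
        return "stretching"
    return "stable"


rest_tri = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
squashed = [(0.0, 0.0), (0.5, 0.0), (0.0, 1.0)]   # halved along x
low, high = principal_stretches(rest_tri, squashed)
label = classify(rest_tri, squashed)
```

A minimum stretch below one marks the direction in which material has been compressed, which is where wrinkles would be expected to form.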
CFX
-
Real-time system for capturing high-fidelity facial performance from a monocular video camera by layering local wrinkle-detail regressors on a global blendshape tracker.
Abstract
We present the first real-time high-fidelity facial capture method. The core idea is to enhance a global real-time face tracker, which provides a low-resolution face mesh, with local regressors that add in medium-scale details, such as expression wrinkles. Our main observation is that although wrinkles appear in different scales and at different locations on the face, they are locally very self-similar and their visual appearance is a direct consequence of their local shape. We therefore train local regressors from high-resolution capture data in order to predict the local geometry from local appearance at runtime. We propose an automatic way to detect and align the local patches required to train the regressors and run them efficiently in real-time. Our formulation is particularly designed to enhance the low-resolution global tracker with exactly the missing expression frequencies, avoiding superimposing spatial frequencies in the result. Our system is generic and can be applied to any real-time tracker that uses a global prior, e.g. blend-shapes. Once trained, our online capture approach can be applied to any new user without additional training, resulting in high-fidelity facial performance reconstruction with person-specific wrinkle details from a monocular video camera in real-time.
Related FaceScape: A Large-Scale High Quality 3D Face Dataset and Detailed Riggable 3D Face Prediction · Driving High-Resolution Facial Scans with Video Performance Capture · FaceWarehouse: A 3D Facial Expression Database for Visual Computing · Dynamic 3D Avatar Creation from Hand-Held Video Input
How to read this
- Category
- Capture system: real-time high-fidelity facial capture
- Contributions
- Presents a real-time high-fidelity facial capture method that augments a low-resolution global face tracker with medium-scale detail such as expression wrinkles.
- Trains local regressors on high-resolution capture data to predict local geometry from local appearance at runtime, exploiting the self-similarity of wrinkles.
- Formulated to add exactly the frequencies missing from the global tracker (avoiding superimposed frequencies), is tracker-agnostic, and generalizes to new users without extra training.
- Context
- Extends real-time blendshape-based facial tracking, building directly on Cao et al.'s Displaced Dynamic Expression Regression by adding a local-detail regression layer. Builds on: Displaced Dynamic Expression Regression for Real-Time Facial Tracking and Animation
- Correctness
- Relies on the assumption that wrinkle appearance is locally self-similar and a direct consequence of local shape; demonstrated on monocular video input on top of a global blendshape tracker, so detail fidelity depends on the training data and the global tracker's quality.
- Clarity
- Accessible in concept (global tracker plus local detail); a first pass conveys the idea, a second pass for the patch detection/alignment and regressor training.
- How to read it
- First pass for the global-plus-local-detail decomposition and the missing-frequency argument; second pass on the local regressor training if reproducing the detail layer.
Facial
-
Data-driven pipeline reconstructing complete 3D hairstyles from a single photograph using USC-HairSalon database retrieval and user-guided strand growing.
Abstract
Human hair presents highly convoluted structures and spans an extraordinarily wide range of hairstyles, which is essential for the digitization of compelling virtual avatars but also one of the most challenging to create. Cutting-edge hair modeling techniques typically rely on expensive capture devices and significant manual labor. We introduce a novel data-driven framework that can digitize complete and highly complex 3D hairstyles from a single-view photograph. We first construct a large database of manually crafted hair models from several online repositories. Given a reference photo of the target hairstyle and a few user strokes as guidance, we automatically search for multiple best matching examples from the database and combine them consistently into a single hairstyle to form the large-scale structure of the hair model. We then synthesize the final hair strands by jointly optimizing for the projected 2D similarity to the reference photo, the physical plausibility of each strand, as well as the local orientation coherency between neighboring strands. We demonstrate the effectiveness and robustness of our method on a variety of hairstyles and challenging images, and compare our system with state-of-the-art hair modeling algorithms.
Related Structure-Aware Hair Capture · A Data-Driven Approach to Four-View Image-Based Hair Modeling · Robust Hair Capture Using Simulated Examples · NeuralHDHair: Automatic High-Fidelity Hair Modeling from a Single Image Using Implicit Neural Representations
How to read this
- Category
- Method: data-driven single-view hair modeling
- Contributions
- Digitizes complete, complex 3D hairstyles from a single-view photograph plus a few user strokes, avoiding expensive capture rigs and heavy manual labor.
- Builds a large database of hand-crafted hair models and retrieves/combines multiple best-matching examples into the large-scale hair structure.
- Synthesizes final strands by jointly optimizing 2D similarity to the photo, per-strand physical plausibility, and local orientation coherency between neighboring strands.
- Context
- A database-retrieval approach to hair digitization that extends prior capture work such as Hu et al.'s Robust Hair Capture Using Simulated Examples to the single-photo setting. Builds on: Robust Hair Capture Using Simulated Examples
- Correctness
- Quality is bounded by database coverage and the user-provided strokes/photo; the result is a plausible, coherent match to the reference rather than a true measurement of the subject's hair, so unusual styles outside the database are a limitation.
- Clarity
- Accessible; a first pass conveys the retrieve-then-synthesize pipeline, a second pass for the joint strand-optimization terms.
- How to read it
- First pass for the database-retrieval-plus-strand-synthesis structure; second pass on the optimization objectives if building a hair-reconstruction pipeline.
CFX
-
Stroke-based posing tool that snaps multiple joints to a drawn silhouette, used heavily on Inside Out for tuning character poses and cloth-simulation results.
Abstract
This Pixar talk presents a sketch-based interface in the Presto animation system that lets animators pose multiple joints at once by snapping them to a user-drawn stroke. A multiple-joint guide depicts joints as dots connected by a centerline that is implicitly generated from the character rig and transformed to match the drawn stroke. Special rig-aware solvers detect which guides are displayed and write only into the same parameters those guides would use, so animators can freely switch between sketching and direct manipulation. Three stroke conversion schemes are provided, including flattening the model into a camera-parallel plane, moving each articulation point to touch the stroke in screen space while preserving world-space distance, and interpreting strokes within an automatically chosen XY, YZ or XZ guide plane.
Related Wig Refitting in Pixar's Inside Out 2 · SkinMixer: Blending 3D Animated Models · Stable and Efficient Differential IK · Using Deep Learning to Approximate Joint Placement in 3D Bipedal Characters
How to read this
- Category
- Production talk: sketch-based posing in Pixar Presto
- Contributions
- Demonstrates a stroke-based interface in Presto that poses multiple joints at once by snapping them to an animator-drawn stroke.
- Shows a multiple-joint guide (dots on an implicitly generated centerline) plus rig-aware solvers that write only into the parameters those guides use, allowing free switching between sketching and direct manipulation.
- Presents three stroke-conversion schemes: camera-parallel flattening, screen-space touch with world-space distance preserved, and interpretation within an auto-chosen XY/YZ/XZ guide plane.
- Context
- A production application of sketch-based posing, conceptually related to learned-prior posing such as Grochow et al.'s Style-Based Inverse Kinematics but driven by rig-aware solvers, used heavily on Inside Out. Builds on: Style-Based Inverse Kinematics
- Correctness
- Studio practice rather than peer-reviewed; the approach is production-proven on a feature film, so judge it by workflow value and integration with the rig, not formal evaluation.
- Clarity
- Highly accessible; a single read conveys the interaction model, with little need for a deeper formulation pass.
- How to read it
- Read once for the interaction design and the three stroke-conversion schemes; useful as inspiration for posing UX rather than as an algorithm to reimplement.
Rigging
-
Algebraic multigrid method for accelerating cloth simulation on unstructured meshes with varying material properties and collision constraints.
Abstract
Existing multigrid methods for cloth simulation are based on geometric multigrid. While good results have been reported, geometric methods are problematic for unstructured grids, widely varying material properties, and varying anisotropies, and they often have difficulty handling constraints arising from collisions. This paper applies the algebraic multigrid method known as smoothed aggregation to cloth simulation. This method is agnostic to the underlying tessellation, which can even vary over time, and it only requires the user to provide a fine-level mesh. To handle contact constraints efficiently, a prefiltered preconditioned conjugate gradient method is introduced. For highly efficient preconditioners, like the ones proposed here, prefiltering is essential, but, even for simple preconditioners, prefiltering provides significant benefits in the presence of many constraints. Numerical tests of the new approach on a range of examples confirm 6–8× speedups on a fully dressed character with 371k vertices, and even larger speedups on synthetic examples.
Related CAMA: Contact-Aware Matrix Assembly with Unified Collision Handling for GPU-based Cloth Simulation · Continuum-based Strain Limiting · Projective Dynamics with Dry Frictional Contact · GPU-Based Simulation of Cloth Wrinkles at Submillimeter Levels
How to read this
- Category
- Method: algebraic multigrid solver for cloth
- Contributions
- Applies smoothed-aggregation algebraic multigrid to cloth simulation, agnostic to tessellation (which may even vary over time) and requiring only a fine-level mesh.
- Introduces a prefiltered preconditioned conjugate gradient method to handle contact constraints efficiently.
- Reports 6 to 8x speedups on a fully dressed 371k-vertex character, with larger speedups on synthetic examples.
- Context
- Replaces geometric multigrid with algebraic multigrid for the implicit cloth solver in the lineage of Baraff and Witkin's Large Steps in Cloth Simulation. Builds on: Large Steps in Cloth Simulation
- Correctness
- Validated by numerical tests across a range of examples including a full garment; speedups assume the AMG plus prefiltered PCG setup suits the problem, and the prefiltering is reported as essential for efficient preconditioners under many constraints.
- Clarity
- Numerically dense; a first pass conveys why algebraic beats geometric multigrid here, with a careful second/third pass needed for the solver and prefiltering math.
- How to read it
- First pass for the motivation (unstructured meshes, varying materials, constraints); plan a thorough second pass on the AMG and prefiltered-PCG formulation if implementing the solver.
CFX
-
SMPL body model factoring shape and pose-dependent deformations for efficient synthesis of realistic human body shapes and animations.
Abstract
We present a learned model of human body shape and pose-dependent shape variation that is more accurate than previous models and is compatible with existing graphics pipelines. Our Skinned Multi-Person Linear model (SMPL) is a skinned vertex-based model that accurately represents a wide variety of body shapes in natural human poses. The parameters of the model are learned from data including the rest pose template, blend weights, pose-dependent blend shapes, identity-dependent blend shapes, and a regressor from vertices to joint locations. Unlike previous models, the pose-dependent blend shapes are a linear function of the elements of the pose rotation matrices. This simple formulation enables training the entire model from a relatively large number of aligned 3D meshes of different people in different poses. We quantitatively evaluate variants of SMPL using linear or dual-quaternion blend skinning and show that both are more accurate than a Blend-SCAPE model trained on the same data. We also extend SMPL to realistically model dynamic soft-tissue deformations. Because it is based on blend skinning, SMPL is compatible with existing rendering engines and we make it available for research purposes.
Related ClothCap: Seamless 4D Clothing Capture and Retargeting · Expressive Body Capture: 3D Hands, Face, and Body from a Single Image · STAR: Sparse Trained Articulated Human Body Regressor · 3D Mesh Pose Transfer Based on Skeletal Deformation
How to read this
- Category
- Method / model: learned parametric body model (SMPL)
- Contributions
- Presents a skinned vertex-based human body model that factors identity-dependent and pose-dependent shape variation and is compatible with existing graphics pipelines.
- Makes pose-dependent blend shapes a linear function of the elements of the pose rotation matrices, enabling training the whole model (template, blend weights, blend shapes, joint regressor) from many aligned 3D meshes.
- Shows SMPL with linear or dual-quaternion blend skinning is more accurate than a Blend-SCAPE model on the same data, and extends to dynamic soft-tissue deformation.
- Context
- A blend-skinning successor to deformation-based body models such as Anguelov et al.'s SCAPE, designed to stay compatible with standard rendering engines. Builds on: SCAPE: Shape Completion and Animation of People
- Correctness
- Accuracy claims are relative to a Blend-SCAPE baseline trained on the same data and depend on the quality and coverage of the aligned mesh training set; the linear pose-blendshape formulation is a deliberate simplification that trades some expressiveness for trainability and pipeline compatibility.
- Clarity
- Accessible given a skinning background; a first pass conveys the factored model, a careful second pass for the blend-shape parameterization and training.
- How to read it
- First pass for the shape/pose factorization and why linear pose blendshapes matter; second pass on the formulation and training if using or extending SMPL.
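The claim that pose-dependent blend shapes are linear in the rotation-matrix elements can be made concrete for one vertex and one joint. The sketch below forms the nine-element feature R minus identity, applies a made-up corrective that is linear in that feature, and then runs linear blend skinning. The blend-shape values, skinning weights, and single-joint setup are illustrative assumptions; the identity blend shapes, joint regressor, and learned parameters of the real model are omitted.

```python
import math


def rotation_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


IDENTITY = rotation_z(0.0)


def pose_feature(rotation):
    """Flatten R - I: nine numbers that are all zero in the rest pose."""
    return [rotation[i][j] - IDENTITY[i][j] for i in range(3) for j in range(3)]


def posed_vertex(template, pose_blendshape, weights, theta):
    """template: rest vertex; pose_blendshape: nine offset vectors; weights: (root, joint)."""
    R = rotation_z(theta)
    feat = pose_feature(R)
    # Pose-dependent corrective: a linear function of the rotation-matrix elements.
    shaped = [template[k] + sum(f * pose_blendshape[n][k] for n, f in enumerate(feat))
              for k in range(3)]
    # Linear blend skinning between a fixed root and one joint rotating about the origin.
    rotated = [sum(R[i][k] * shaped[k] for k in range(3)) for i in range(3)]
    w_root, w_joint = weights
    return [w_root * shaped[i] + w_joint * rotated[i] for i in range(3)]


template = [1.0, 0.0, 0.0]
# Made-up corrective: only the first rotation element moves the vertex, along y.
blendshape = [[0.0, -0.1, 0.0]] + [[0.0, 0.0, 0.0] for _ in range(8)]

rest = posed_vertex(template, blendshape, (0.5, 0.5), 0.0)
bent = posed_vertex(template, blendshape, (0.5, 0.5), math.pi / 2)
```

Because the feature vanishes in the rest pose, the corrective leaves the template untouched there, and the whole pipeline stays a sequence of linear operations that standard skinning engines can evaluate.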
Skinning / Retargeting
- talk Ubisoft Cloth Simulation: Performance Postmortem and Journey from C++ to Compute Shaders GDC Industrial
Ubisoft analyzed cloth simulation performance in Assassin's Creed Unity and Far Cry 4, then ported the system to GPU compute shaders, achieving a performance improvement of over 10x.
CFX
- Vdub: Modifying Face Video of Actors for Plausible Visual Alignment to a Dubbed Audio Track Eurographics Academic 135 cites
Monocular face modification system for dubbing, aligning lip motion to dubbed audio by warping the actor's mouth region.
abstract
In many countries, foreign movies and TV productions are dubbed, i.e., the original voice of an actor is replaced with a translation that is spoken by a dubbing actor in the country's own language. Dubbing is a complex process that requires specific translations and accurately timed recitations such that the new audio at least coarsely adheres to the mouth motion in the video. However, since the sequence of phonemes and visemes in the original and the dubbing language are different, the video‐to‐audio match is never perfect, which is a major source of visual discomfort. In this paper, we propose a system to alter the mouth motion of an actor in a video, so that it matches the new audio track. Our paper builds on high‐quality monocular capture of 3D facial performance, lighting and albedo of the dubbing and target actors, and uses audio analysis in combination with a space‐time retrieval method to synthesize a new photo‐realistically rendered and highly detailed 3D shape model of the mouth region to replace the target performance. We demonstrate plausible visual quality of our results compared to footage that has been professionally dubbed in the traditional way, both qualitatively and through a user study.
Related Reconstruction of Personalized 3D Face Rigs from Monocular Video · FaceLab: Scalable Facial Performance Capture for Visual Effects · Monocular Facial Performance Capture via Deep Expression Matching · Face2Face: Real-Time Face Capture and Reenactment of RGB Videos
how to read this
- Category
- Method: a monocular face-video editing system for visual dubbing
- Contributions
- Alters an actor's mouth motion in a video so it matches a newly dubbed audio track for plausible lip alignment
- Combines high-quality monocular 3D facial-performance capture (shape, lighting, albedo) with audio analysis and a space-time retrieval method to synthesize a new detailed mouth-region model
- Demonstrates photo-realistic rendering of the edited mouth and compares quality against traditionally, professionally dubbed footage
- Context
- Builds on the 3D-face modeling lineage rooted in Blanz and Vetter's morphable model, applying monocular performance capture plus audio-driven viseme retrieval to the dubbing problem. Builds on: A Morphable Model for the Synthesis of 3D Faces
- Correctness
- Operates from monocular footage and assumes reliable single-view capture of geometry, lighting and albedo; results are shown qualitatively against professional dubs, so a reader should keep in mind that mouth-region synthesis quality depends on capture accuracy and the audio-to-viseme retrieval match.
- Clarity
- Accessible at a high level; a first pass conveys the pipeline idea, a second pass pays off for the capture and space-time retrieval formulation.
- How to read it
- First pass for the overall pipeline (capture, audio analysis, mouth retrieval, compositing); do a second pass on the retrieval and rendering stages if you care how the new mouth is synthesized and blended.
Facial
2014
27-
Guide-strand reduced model with optimal interpolation relationships simulates 150K hair strands in real time with a hair-correction collision pass.
abstract
Realistic hair animation is a crucial component in depicting virtual characters in interactive applications. While much progress has been made in high-quality hair simulation, the overwhelming computation cost hinders similar fidelity in realtime simulations. To bridge this gap, we propose a data-driven solution. Building upon precomputed simulation data, our approach constructs a reduced model to optimally represent hair motion characteristics with a small number of guide hairs and the corresponding interpolation relationships. At runtime, utilizing such a reduced model, we only simulate guide hairs that capture the general hair motion and interpolate all rest strands. We further propose a hair correction method that corrects the resulting hair motion with a position-based model to resolve hair collisions and thus captures motion details. Our hair simulation method enables a simulation of a full head of hairs with over 150K strands in realtime. We demonstrate the efficacy and robustness of our method with various hairstyles and driven motions (e.g., head movement and wind force), and compared against full simulation results that does not appear in the training data.
Related Physics-Inspired Upsampling for Cloth Simulation in Games · Artistic Simulation of Curly Hair · Fast Automatic Skinning Transformations · Hair Modeling and Simulation by Style
how to read this
- Category
- Method: data-driven reduced model for real-time hair simulation
- Contributions
- A reduced model, built from precomputed simulation data, that represents hair motion with a small number of guide hairs plus optimal interpolation relationships
- A runtime scheme simulating only guide hairs and interpolating the rest, reaching real-time simulation of full heads with over 150K strands
- A position-based hair correction pass that resolves collisions and recovers motion detail
- Context
- A data-driven reduction over full hair simulation, building on mass-spring hair dynamics such as Selle et al.'s A Mass Spring Model for Hair Simulation. Builds on: A Mass Spring Model for Hair Simulation
- Correctness
- Assumes guide hairs plus learned interpolation capture general motion and that the correction pass handles collisions; quality depends on the precomputed training set, so the key generalization concern is behavior on unseen motions, which the authors address by comparing against full simulations of motions not in the training data.
- Clarity
- Accessible; a first pass conveys the guide-hair plus interpolation plus correction structure, a second pass clarifies how interpolation relationships are learned.
- How to read it
- Focus on how guide hairs and interpolation weights are chosen and what the correction pass adds; a second pass pays off for the reduced-model construction and out-of-training behavior.
CFX
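To make the "simulate guide hairs, interpolate the rest" idea from the entry above concrete, here is a minimal sketch that rebuilds one non-guide strand as a weighted blend of guide strands. It is a deliberate simplification: the paper's learned interpolation relationships and its collision correction pass are not reproduced, and `interpolate_strand` and the toy data are hypothetical names and values.

```python
def interpolate_strand(guides, weights):
    """Blend guide strands particle by particle.

    guides:  list of strands, each a list of (x, y, z) particles of equal length.
    weights: dict mapping guide index -> blend weight (expected to sum to 1).
    """
    num_particles = len(guides[0])
    strand = []
    for p in range(num_particles):
        pos = [0.0, 0.0, 0.0]
        for g, w in weights.items():
            for k in range(3):
                pos[k] += w * guides[g][p][k]
        strand.append(tuple(pos))
    return strand


# Two simulated guide strands with two particles each (toy values).
guides = [
    [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    [(4.0, 0.0, 0.0), (4.0, 1.0, 2.0)],
]
# A strand rooted closer to guide 1 follows it more strongly.
strand = interpolate_strand(guides, {0: 0.25, 1: 0.75})
```

Only the guides would be stepped by the simulator each frame; every other strand is a cheap weighted sum like this one.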
-
Eulerian-on-Lagrangian framework simulating volumetric muscles in close contact with volume preservation, large deformation, and active contraction.
abstract
We introduce a new framework for simulating the dynamics of musculoskeletal systems, with volumetric muscles in close contact and a novel data-driven muscle activation model. Muscles are simulated using an Eulerian-on-Lagrangian discretization that handles volume preservation, large deformation, and close contact between adjacent tissues. Volume preservation is crucial for accurately capturing the dynamics of muscles and other biological tissues. We show how to couple the dynamics of soft tissues with Lagrangian multi-body dynamics simulators, which are widely available. Our physiologically based muscle activation model utilizes knowledge of the active shapes of muscles, which can be easily obtained from medical imaging data or designed to meet artistic needs. We demonstrate results with models derived from MRI data and models designed for artistic effect.
Related Anatomy-Based Modeling of the Human Musculature · Volume Preserving Simulation of Soft Tissue with Skin · Muscle and Fascia Simulation with Extended Position Based Dynamics · Art-Directed Muscle Simulation for High-End Facial Animation
how to read this
- Category
- Method: volumetric musculoskeletal simulation framework
- Contributions
- An Eulerian-on-Lagrangian discretization for volumetric muscles handling volume preservation, large deformation, and close contact between adjacent tissues
- A data-driven muscle activation model based on active muscle shapes from medical imaging or artistic design
- Coupling of soft-tissue dynamics with widely available Lagrangian multi-body dynamics simulators
- Context
- Advances physically based muscle simulation in the lineage of Teran et al.'s skeletal muscle work from the Visible Human dataset, emphasizing volume preservation and contact. Builds on: Creating and Simulating Skeletal Muscle from the Visible Human Data Set
- Correctness
- Premised on volume preservation being crucial for biological tissue dynamics and on activation shapes obtainable from MRI or art direction; demonstrated on MRI-derived and artist-designed models, so it is a demonstration of plausibility/controllability rather than validated muscle-force accuracy.
- Clarity
- Moderately technical; a first pass conveys the Eulerian-on-Lagrangian and active-shape ideas, a second/third pass is needed for the discretization and coupling.
- How to read it
- Focus on why Eulerian-on-Lagrangian helps with volume and contact, and how the activation model is driven; a second pass pays off for the discretization, a third if implementing the multi-body coupling.
Muscles / CFX
-
Stable simulation of large rod assemblies by adapting nonlinearity in collision response, addressing the severe time-step restrictions caused by transversal impacts.
abstract
We develop an algorithm for the efficient and stable simulation of large-scale elastic rod assemblies. We observe that the time-integration step is severely restricted by a strong nonlinearity in the response of stretching modes to transversal impact, the degree of this nonlinearity varying greatly with the shape of the rod. Building on these observations, we propose a collision response algorithm that adapts its degree of nonlinearity. We illustrate the advantages of the resulting algorithm by analyzing simulations involving elastic rod assemblies of varying density and scale, with up to 1.7 million individual contacts per time step.
Related Incremental Potential Contact: Intersection- and Inversion-free Large-Deformation Dynamics · Efficient and Stable Approach to Elasticity and Collisions for Hair Animation · Super-Helices for Predicting the Dynamics of Natural Hair · A Hybrid Iterative Solver for Robustly Capturing Coulomb Friction in Hair Dynamics
how to read this
- Category
- Method: stable collision response for large elastic rod assemblies
- Contributions
- Identification that the time-step is severely limited by strong nonlinearity in stretching-mode response to transversal impact, varying with rod shape
- A collision response algorithm that adapts its degree of nonlinearity accordingly
- Efficient, stable simulation of large rod assemblies, demonstrated with up to 1.7 million contacts per time step
- Context
- Builds on the discrete elastic rods model (Bergou et al.) to tackle the stability bottleneck in densely colliding rod systems such as hair and fiber assemblies. Builds on: Discrete Elastic Rods
- Correctness
- Grounded in an analysis of the stretching-mode nonlinearity under transversal impact; the adaptivity targets stability/efficiency and is illustrated across rod assemblies of varying density and scale, so the contribution is solver robustness rather than new material behavior.
- Clarity
- Technical; a first pass conveys the diagnosis and the adapt-nonlinearity idea, a second/third pass is needed for the response formulation and stability analysis.
- How to read it
- Focus on the observation about transversal-impact nonlinearity and how the response adapts to it; a second pass pays off for the time-integration and stability argument.
CFX
-
Irrational Games details the cross-disciplinary animation, systemic, and narrative techniques that gave BioShock Infinite's Elizabeth the illusion of humanity through art, design, and systems collaboration.
Facial / Retargeting
-
Rhythm and Hues' detail-preserving smoothing deformer that cleans rough skinning automatically, now a standard node in most major DCC packages.
Skinning
- Displaced Dynamic Expression Regression for Real-Time Facial Tracking and Animation SIGGRAPH Academic 409 cites
Regresses 2D facial landmarks and 3D facial shape from RGB video with a generic, calibration-free regressor, enabling real-time facial tracking and animation.
abstract
We present a fully automatic approach to real-time facial tracking and animation with a single video camera. Our approach does not need any calibration for each individual user. It learns a generic regressor from public image datasets, which can be applied to any user and arbitrary video cameras to infer accurate 2D facial landmarks as well as the 3D facial shape from 2D video frames. The inferred 2D landmarks are then used to adapt the camera matrix and the user identity to better match the facial expressions of the current user. The regression and adaptation are performed in an alternating manner. With more and more facial expressions observed in the video, the whole process converges quickly with accurate facial tracking and animation. In experiments, our approach demonstrates a level of robustness and accuracy on par with state-of-the-art techniques that require a time-consuming calibration step for each individual user, while running at 28 fps on average. We consider our approach to be an attractive solution for wide deployment in consumer-level applications.
Related Face2Face: Real-Time Face Capture and Reenactment of RGB Videos · A Facial Motion Retargeting Pipeline for Appearance Agnostic 3D Characters · Single-Shot High-Quality Facial Geometry and Skin Appearance Capture · Performance-Driven Facial Animation
how to read this
- Category
- Method: regression-based real-time facial tracking from RGB video
- Contributions
- A fully automatic, calibration-free real-time facial tracking and animation system using a single video camera
- A generic regressor learned from public image datasets that infers 2D landmarks and 3D facial shape, applicable to any user and camera
- Alternating regression and adaptation of camera matrix and user identity, running at around 28 fps
- Context
- Uses the FaceWarehouse 3D expression database (Cao et al.) as the basis for learning the generic regressor and shape priors. Builds on: FaceWarehouse: A 3D Facial Expression Database for Visual Computing
- Correctness
- Assumes a generic regressor trained on public datasets transfers to arbitrary users/cameras and that alternating adaptation converges; reported to match calibration-based methods in robustness/accuracy while running in real time, so a reader should note that accuracy depends on RGB-only cues without depth.
- Clarity
- Accessible; a first pass conveys the regress-then-adapt loop, a second pass is needed for the displaced-expression regression formulation.
- How to read it
- Focus on what the regressor predicts and how the alternating identity/camera adaptation refines it; a second pass pays off for the displaced dynamic expression regression details.
Facial
-
Bungie describes how 42 player-selectable heads across different alien races and genders share a single facial animation set via parameterized topology and rig design.
Facial / Retargeting
-
Large-scale 3D facial expression database with bilinear model fitting enabling data-driven face tracking, synthesis, and editing.
abstract
We present FaceWarehouse, a database of 3D facial expressions for visual computing applications. We use Kinect, an off-the-shelf RGBD camera, to capture 150 individuals aged 7-80 from various ethnic backgrounds. For each person, we captured the RGBD data of her different expressions, including the neutral expression and 19 other expressions such as mouth-opening, smile, kiss, etc. For every RGBD raw data record, a set of facial feature points on the color image such as eye corners, mouth contour, and the nose tip are automatically localized, and manually adjusted if better accuracy is required. We then deform a template facial mesh to fit the depth data as closely as possible while matching the feature points on the color image to their corresponding points on the mesh. Starting from these fitted face meshes, we construct a set of individual-specific expression blendshapes for each person. These meshes with consistent topology are assembled as a rank-3 tensor to build a bilinear face model with two attributes: identity and expression. Compared with previous 3D facial databases, for every person in our database, there is a much richer matching collection of expressions, enabling depiction of most human facial actions.
Related Animating Facial Expressions · A Muscle Model for Animating Three-Dimensional Facial Expression · 3D Shape Regression for Real-Time Facial Animation · Facial Performance Enhancement Using Dynamic Shape Space Analysis
how to read this
- Category
- Dataset / 3D facial database with a bilinear model
- Contributions
- A database of 3D facial expressions for 150 individuals (ages 7-80, varied ethnicities) captured with a Kinect RGBD camera
- For each person, a neutral plus 19 expression captures, with facial feature points auto-localized and manually adjusted
- A template mesh fitted to depth and color features, yielding individual-specific blendshapes assembled into a rank-3 tensor (identity x expression) bilinear face model
- Context
- Builds on the morphable-model tradition for faces (Blanz and Vetter's 'A Morphable Model for the Synthesis of 3D Faces'), extending it to a consumer-depth-camera-captured, expression-rich bilinear formulation. Builds on: A Morphable Model for the Synthesis of 3D Faces
- Correctness
- Validated as an asset for data-driven tracking, synthesis, and editing; note the data comes from commodity Kinect RGBD with partly manual feature correction, so reconstruction fidelity is bounded by that capture setup and the chosen 19-expression set.
- Clarity
- Accessible; a first pass conveys what the database offers, a second pass pays off for the mesh-fitting and tensor (bilinear) construction details.
- How to read it
- Skim the abstract and figures first to judge fit as a resource; if you plan to use the model, do a second pass on the fitting pipeline and the rank-3 tensor decomposition, and check the expression list against your needs.
Facial
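The bilinear face model in the entry above can be pictured as a rank-3 tensor contracted with one identity weight vector and one expression weight vector. The sketch below shows that contraction on a toy tensor. It works on the raw data tensor and skips any dimensionality reduction, and `bilinear_face` and the numbers are hypothetical stand-ins rather than the database's actual layout.

```python
def bilinear_face(tensor, w_id, w_exp):
    """Contract a (coordinate x identity x expression) tensor with two weight vectors."""
    face = []
    for coord in tensor:  # coord[i][e]: this coordinate for identity i, expression e
        value = 0.0
        for i, wi in enumerate(w_id):
            for e, we in enumerate(w_exp):
                value += wi * we * coord[i][e]
        face.append(value)
    return face


# Hypothetical toy tensor: 2 mesh coordinates, 2 identities, 2 expressions.
tensor = [
    [[0.0, 1.0], [2.0, 4.0]],
    [[1.0, 1.0], [3.0, 5.0]],
]

# One-hot weights pick out a captured person and expression exactly.
captured = bilinear_face(tensor, [0.0, 1.0], [1.0, 0.0])
# Fractional weights blend across both attributes at once.
blended = bilinear_face(tensor, [0.5, 0.5], [0.5, 0.5])
```

Holding `w_id` fixed and varying `w_exp` animates one person; holding `w_exp` fixed and varying `w_id` changes who is making the expression.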
-
Adds fine-scale spatial and temporal detail to low-resolution, art-directed facial performances by transferring an expressiveness model extracted from a high-resolution capture of an individual.
abstract
The facial performance of an individual is inherently rich in subtle deformation and timing details. Although these subtleties make the performance realistic and compelling, they often elude both motion capture and hand animation. We present a technique for adding fine-scale details and expressiveness to low-resolution art-directed facial performances, such as those created manually using a rig, via marker-based capture, by fitting a morphable model to a video, or through Kinect reconstruction using recent faceshift technology. We employ a high-resolution facial performance capture system to acquire a representative performance of an individual in which he or she explores the full range of facial expressiveness. From the captured data, our system extracts an expressiveness model that encodes subtle spatial and temporal deformation details specific to that particular individual. Once this model has been built, these details can be transferred to low-resolution art-directed performances. We demonstrate results on various forms of input; after our enhancement, the resulting animations exhibit the same nuances and fine spatial details as the captured performance, with optional temporal enhancement to match the dynamics of the actor. Finally, we show that our technique outperforms the current state-of-the-art in example-based facial animation.
Related Realtime Facial Animation with On-the-fly Correctives · The Digital Emily Project: Achieving a Photorealistic Digital Actor · Online Modeling for Realtime Facial Animation · Animatomy: An Animator-Centric, Anatomically Inspired System for 3D Facial Modeling, Animation and Transfer
how to read this
- Category
- Method: facial performance detail enhancement / transfer
- Contributions
- A technique to add fine-scale spatial detail and expressiveness to low-resolution, art-directed facial performances
- An expressiveness model extracted from a high-resolution capture of one individual exploring their full expressive range, encoding subtle spatial and temporal deformation details
- Transfer of those details onto inputs from rigs, marker capture, morphable-model video fits, or Kinect reconstruction
- Context
- Relates to high-quality facial capture (Beeler et al.'s 'High-Quality Passive Facial Performance Capture Using Anchor Frames') and uses it as the source of the per-individual detail model that is then propagated to coarse performances. Builds on: High-Quality Passive Facial Performance Capture Using Anchor Frames
- Correctness
- The enhancement is person-specific: it assumes a representative high-resolution capture of the same individual exists and that the low-res input is expressively compatible; results are demonstrated across several input types, but transfer quality is bounded by how well the captured range covers the target performance.
- Clarity
- Accessible in concept; a first pass conveys the capture-then-transfer idea, a second pass pays off for the shape-space analysis and the spatial/temporal detail formulation.
- How to read it
- Read for the pipeline (build expressiveness model, then transfer); focus on what the dynamic shape space encodes and the assumed match between reference capture and target, then a second pass on the math if you intend to reproduce it.
Facial
-
Learns face-motion retargeting from actor to virtual model using MLP and RBF neural networks trained on small datasets.
abstract
This paper presents a system that maps optical motion capture markers from an actor's face to blendshape weights on a virtual model through supervised learning of a small training dataset. The authors compare Radial Basis Function Networks (RBFNs), previously used for this task, against Multi-Layer Perceptron Artificial Neural Networks (ANNs), which to their knowledge had not been applied to marker-to-blendshape retargeting. The face is broken into upper face, eye, and lower face regions, with PCA reduction applied per region before learning the mapping. Their results found that both systems produced similar output, with the ANN sometimes proving more expressive but harder to train and more sensitive to preprocessing than the RBFN.
Related Facial Retargeting with Automatic Range of Motion Alignment · FDLS: A Deep Learning Approach to Production Quality, Controllable, and Retargetable Facial Performances · High Fidelity Facial Animation Capture and Retargeting with Contours · Neural Face Rigging for Animating and Retargeting Facial Meshes in the Wild
how to read this
- Category
- Method: marker-to-blendshape facial retargeting via neural networks
- Contributions
- A supervised system mapping an actor's optical mocap face markers to blendshape weights on a virtual model, trained on a small dataset
- A comparison of Multi-Layer Perceptron ANNs against the previously used Radial Basis Function Networks for this marker-to-blendshape task
- A regional decomposition (upper face, eye, lower face) with per-region PCA reduction before learning the mapping
- Context
- Builds on artist-friendly performance-driven retargeting (Seol et al.'s 'Artist Friendly Facial Animation Retargeting'), substituting learned MLP/RBF regressors for the marker-to-rig mapping. Builds on: Artist Friendly Facial Animation Retargeting
- Correctness
- An empirical comparison: the authors report both methods produce similar output, with the ANN sometimes more expressive but harder to train and more sensitive to preprocessing; conclusions rest on a small training set, so generalization beyond their data should not be assumed.
- Clarity
- Accessible and short; a single first pass conveys the setup and findings, with a second pass only if you want the regional PCA and network configuration details.
- How to read it
- Read mainly for the practical takeaway (ANN vs RBFN trade-offs); focus on the training-data size, regional split, and preprocessing sensitivity rather than expecting a definitive winner.
Facial / Retargeting
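For readers unfamiliar with the RBF side of the comparison in the entry above, the following sketch fits a small Gaussian radial-basis interpolant from marker positions to blendshape weights. It is only an assumed, minimal form of such a mapping: the regional split and per-region PCA described in the paper are left out, and the kernel width, function names, and training pairs are hypothetical.

```python
import math


def gaussian(r, eps=1.0):
    """Gaussian radial basis function of distance r."""
    return math.exp(-((eps * r) ** 2))


def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [x / pivot for x in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [row[n:] for row in M]


def fit_rbf(markers, targets):
    """One basis function per training sample; returns the output coefficients."""
    phi = [[gaussian(math.dist(a, b)) for b in markers] for a in markers]
    return solve(phi, targets)


def predict(markers, coeffs, query):
    """Evaluate the fitted interpolant at a new marker configuration."""
    phi = [gaussian(math.dist(query, c)) for c in markers]
    return [sum(p * row[k] for p, row in zip(phi, coeffs))
            for k in range(len(coeffs[0]))]


# Toy training pairs: 2D marker position -> two blendshape weights.
markers = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
targets = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
coeffs = fit_rbf(markers, targets)
in_between = predict(markers, coeffs, (0.5, 0.5))
```

With one basis function per sample the fit reproduces the training pairs, which suits the small datasets the entry mentions but says nothing about how well it generalizes.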
- Generalizing Locomotion Style to New Animals with Inverse Optimal Regression SIGGRAPH Academic 48 cites
Inverse optimal control framework for generalizing locomotion style to novel animal morphologies by regressing locomotion objectives.
abstract
We present a technique for analyzing a set of animal gaits to predict the gait of a new animal from its shape alone. This method works on a wide range of bipeds and quadrupeds, and adapts the motion style to the size and shape of the animal. We achieve this by combining inverse optimization with sparse data interpolation. Starting with a set of reference walking gaits extracted from sagittal plane video footage, we first use inverse optimization to learn physically motivated parameters describing the style of each of these gaits. Given a new animal, we estimate the parameters describing its gait with sparse data interpolation, then solve a forward optimization problem to synthesize the final gait. To improve the realism of the results, we introduce a novel algorithm called joint inverse optimization which learns coherent patterns in motion style from a database of example animal-gait pairs. We quantify the predictive performance of our model by comparing its synthesized gaits to ground truth walking motions for a range of different animals. We also apply our method to the prediction of gaits for dinosaurs and other extinct creatures.
Related Physically Based Motion Transformation · Physics-Based Motion Retargeting from Sparse Inputs · Meet MotionMaker: New AI Animation Tool In Maya · Learning Locomotion Skills Using DeepRL: Does the Choice of Action Space Matter?
how to read this
- Category
- Method: inverse-optimal-control synthesis of animal locomotion style
- Contributions
- A framework that predicts a new animal's gait from its shape alone, adapting motion style to size and shape across bipeds and quadrupeds
- Inverse optimization to learn physically motivated style parameters from reference gaits, combined with sparse data interpolation to estimate parameters for a new animal
- A joint inverse optimization algorithm that learns coherent style patterns from a database of animal-gait pairs, with application to extinct creatures
- Context
- Rooted in physics-based optimization of motion (Witkin and Kass's 'Spacetime Constraints'), recast as inverse optimal control to recover the objectives that explain observed gaits and regress them to new morphologies. Builds on: Spacetime Constraints
- Correctness
- Predictive performance is quantified against ground-truth walking motions for a range of animals; extrapolation to dinosaurs and extinct creatures is plausible by construction but inherently unverifiable, and inputs are reference gaits from sagittal-plane video, so out-of-plane motion is not directly modeled.
- Clarity
- Conceptually clear but technically dense; a first pass conveys the inverse-then-forward optimization loop, and a second pass is worth it for the joint inverse optimization formulation.
- How to read it
- Read for the inverse-optimization-plus-interpolation pipeline; focus on what the learned style parameters represent and how the joint formulation pools across examples, with a careful second pass before trusting the extinct-animal extrapolations.
Motion Synthesis / Retargeting
-
Production talk introducing Furball, DNEG's GPU-accelerated procedural node-graph fur grooming tool used for the lion character in Hercules.
CFX
-
Postmortem of Harmonix's object-oriented Maya Python rigging framework, covering the framework's evolution from early failures to a production-proven system for character rig and tool development.
Rigging
-
Stabilizes penalty-based frictional contact over large, time-varying contact areas among rigid and articulated bodies by combining semi-implicit integration, exact analytical contact gradients, symbolic Gaussian elimination, and an SVD solver.
abstract
Penalty-based contact is simple and popular, but it suffers from stability problems caused by highly variable and unpredictable net stiffness, especially with time-varying distributed contact over geometrically complex shapes. This paper combines semi-implicit integration, exact analytical contact gradients, symbolic Gaussian elimination, and an SVD solver to produce stable penalty-based frictional contact across large time-varying contact areas among many rigid and articulated rigid bodies in conforming contact and self-contact. The authors also derive implicit proportional-derivative control forces for real-time control of articulated structures with loops.
Related An Implicit Frictional Contact Solver for Adaptive Cloth Simulation · Frictional Contact on Smooth Elastic Solids · Towards Realtime: A Hybrid Physics-based Method for Hair Animation on GPU · Anisotropic Elastoplasticity for Cloth, Knit and Hair Frictional Contact
how to read this
- Category
- Method: stable penalty-based distributed contact for rigid/articulated bodies
- Contributions
- A penalty-based frictional contact scheme that stays stable across large, time-varying contact areas among many rigid and articulated bodies, including conforming and self-contact
- Combines semi-implicit integration, exact analytical contact gradients, symbolic Gaussian elimination, and an SVD solver to tame variable net stiffness
- Derivation of implicit proportional-derivative control forces for real-time control of articulated structures with loops
- Context
- Addresses long-standing penalty-contact instability and connects to robust simultaneous-collision handling (Harmon et al.'s 'Robust Treatment of Simultaneous Collisions'), here via an implicit, gradient-exact formulation. Builds on: Robust Treatment of Simultaneous Collisions
- Correctness
- The core claim is improved stability of penalty contact under distributed, geometrically complex contact; it targets rigid and articulated rigid bodies, so a reader should not assume the same guarantees for deformable contact, and stability still depends on the integration and solver choices described.
- Clarity
- Technical; a first pass conveys the problem and the ingredient list, but the formulation (analytical gradients, symbolic elimination, implicit PD) genuinely needs a second pass.
- How to read it
- Read for why naive penalty stiffness destabilizes and how each component (semi-implicit, exact gradients, SVD) addresses it; plan a second pass on the contact-gradient derivation and the implicit PD forces if implementing.
CFX
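The stability issue described in the entry above shows up even in a one-dimensional toy: a particle dropped onto a stiff penalty "floor". The sketch below compares forward Euler with a linearly implicit step that uses the exact gradient of the penalty force. It is a much-reduced illustration under assumed parameters (`k`, `m`, `dt`), with no friction, no articulation, and none of the paper's symbolic elimination or SVD machinery.

```python
def step_explicit(x, v, k, m, g, dt):
    """Forward Euler: the penalty force is evaluated at the old position."""
    force = m * g + (-k * x if x < 0.0 else 0.0)
    return x + dt * v, v + dt * force / m


def step_semi_implicit(x, v, k, m, g, dt):
    """Linearly implicit step: in contact, the penalty gradient (-k) enters the solve."""
    if x < 0.0:
        v_new = (v + dt * (g - k * x / m)) / (1.0 + dt * dt * k / m)
    else:
        v_new = v + dt * g
    return x + dt * v_new, v_new


def simulate(step, steps=200, k=1.0e4, m=1.0, g=-9.81, dt=0.1):
    """Drop a particle from the floor plane (x = 0) and record its height per step."""
    x, v, heights = 0.0, 0.0, []
    for _ in range(steps):
        x, v = step(x, v, k, m, g, dt)
        heights.append(x)
    return heights


explicit_heights = simulate(step_explicit)
implicit_heights = simulate(step_semi_implicit)
```

With these assumed values the explicit step injects energy and launches the particle far above the floor, while the implicit step settles near the static penetration depth `m * g / k`.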
-
Custom pipeline processes photogrammetry facial scans and mocap data into real-time blendshapes, delivering high-quality facial animation at moderate cost, drawing on EA and Naughty Dog experience.
Facial / Retargeting
-
State-of-the-art report on blendshape facial models covering theory, practice, workflow, and mathematical foundations used in VFX and games.
abstract
This state of the art report surveys the practice and theory of blendshapes, a simple linear model of facial expression that represents a face as a linear combination of sculpted expression targets and is the prevalent approach to realistic facial animation in film and commercial packages. The report covers blendshape terminology and history, the linear algebra of whole-face and delta formulations, combination and intermediate shapes, methods for constructing models, and animation techniques including keyframing, performance-driven capture, expression cloning, and direct manipulation. It contrasts the interpretable semantic blendshape basis with orthogonal PCA models and frames facial animation as a high-dimensional scattered interpolation problem. The authors show that despite the simplicity of the approach, several open problems remain associated with this fundamental technique.
Related Direct Manipulation Blendshapes · Animating Facial Expressions · Transferring the Rig and Animations from a Character to Different Face Models · Reusable Facial Rigging and Animation: Create Once, Use Many
How to read this
- Category
- Survey / state-of-the-art report on blendshape facial models
- Contributions
- A survey of blendshape terminology, history, and the linear algebra of whole-face and delta formulations, including combination and intermediate shapes
- A review of model-construction methods and animation techniques (keyframing, performance-driven capture, expression cloning, direct manipulation)
- A framing of facial animation as high-dimensional scattered interpolation, contrasting interpretable semantic blendshapes with orthogonal PCA models, and a list of open problems
- Context
- Consolidates the blendshape literature used across VFX and games, building on direct-manipulation work (Lewis and Anjyo's 'Direct Manipulation Blendshapes') and the broader linear-facial-model lineage. Builds on: Direct Manipulation Blendshapes
- Correctness
- As a state-of-the-art report it organizes and frames existing methods rather than presenting new validated results; treat it as an authoritative map and a source of open problems, not as empirical evidence for any single technique.
- Clarity
- Highly accessible and pedagogical; a first pass gives the conceptual landscape, and a second pass pays off for the delta/whole-face linear algebra and the scattered-interpolation framing.
- How to read it
- Use it as your entry point and reference for the whole topic; first pass for vocabulary and the map of methods, then return to specific sections (linear algebra, construction, manipulation) as needed for the technique you care about.
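The whole-face and delta formulations mentioned above can be compared in a few lines. This is a minimal sketch in plain Python, not code from the report: the function names and the three-coordinate toy "face" are assumptions made for illustration.

```python
from typing import List, Sequence

Vec = List[float]  # flattened vertex coordinates of one face shape


def blend_whole_face(targets: Sequence[Vec], weights: Sequence[float]) -> Vec:
    """Whole-face form: f = sum_k w_k * b_k, where the targets include the neutral."""
    n = len(targets[0])
    return [sum(w * t[i] for w, t in zip(weights, targets)) for i in range(n)]


def blend_delta(neutral: Vec, targets: Sequence[Vec], weights: Sequence[float]) -> Vec:
    """Delta form: f = b_0 + sum_k w_k * (b_k - b_0)."""
    return [
        b0 + sum(w * (t[i] - b0) for w, t in zip(weights, targets))
        for i, b0 in enumerate(neutral)
    ]


# Hypothetical toy shapes (three coordinates stand in for a full mesh).
neutral = [0.0, 0.0, 1.0]
smile = [1.0, 0.0, 1.0]
jaw_open = [0.0, -2.0, 1.0]

delta_face = blend_delta(neutral, [smile, jaw_open], [0.5, 0.25])
# The same face in whole-face form: the neutral takes the remaining weight.
whole_face = blend_whole_face([neutral, smile, jaw_open], [0.25, 0.5, 0.25])
```

The two forms agree when the whole-face weights sum to one; the delta form bakes that constraint in, so a zero weight vector always returns the neutral face.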
Facial
-
Fast implicit simulation framework using local-global constraint projection, unifying cloth, soft bodies, and position-based dynamics.
Abstract
We present a new method for implicit time integration of physical systems. Our approach builds a bridge between nodal Finite Element methods and Position Based Dynamics, leading to a simple, efficient, robust, yet accurate solver that supports many different types of constraints. We propose specially designed energy potentials that can be solved efficiently using an alternating optimization approach. Inspired by continuum mechanics, we derive a set of continuum-based potentials that can be efficiently incorporated within our solver. We demonstrate the generality and robustness of our approach in many different applications ranging from the simulation of solids, cloths, and shells, to example-based simulation. Comparisons to Newton-based and Position Based Dynamics solvers highlight the benefits of our formulation.
Related Nonlinear Cloth Simulation with Isogeometric Analysis · Position Based Dynamics · Multi-Resolution Isotropic Strain Limiting · Strain Based Dynamics
How to read this
- Category
- Method: implicit-integration simulation framework (local-global solver)
- Contributions
- A method for implicit time integration that bridges nodal Finite Element methods and Position Based Dynamics into one simple, robust solver
- Specially designed energy potentials solved efficiently via an alternating (local-global) optimization, plus continuum-based potentials
- Demonstrations across solids, cloth, shells, and example-based simulation, with comparisons to Newton-based and Position Based Dynamics solvers
- Context
- Generalizes and grounds Position Based Dynamics (Mueller et al.'s 'Position Based Dynamics') in a continuum-mechanics energy formulation, sitting between PBD and full nodal FEM. Builds on: Position Based Dynamics
- Correctness
- Claims of speed, robustness, and accuracy are supported by comparisons to Newton and PBD solvers; note the efficiency hinges on the special quadratic-friendly potentials and the local-global split, so behavior for energies that do not fit that structure is outside the demonstrated scope.
- Clarity
- Clearly written; a first pass conveys the local-global idea and the PBD/FEM bridge, and a second pass pays off for the constraint-projection energy derivation.
- How to read it
- Read for the alternating local-global formulation and the constraint-projection energies; a second pass on the potential design and the global-step linear solve is worth it if you plan to implement or extend it.
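The alternating local-global idea can be illustrated on a toy problem. The sketch below is an assumption-laden simplification, not the paper's implementation: it uses a 1D chain of equal-mass particles joined by springs, no external forces, and a small dense solver; `pd_step` and `solve_dense` are hypothetical names. The local step projects each spring onto its rest length, and the global step solves one linear system whose matrix does not change between iterations.

```python
from typing import List, Sequence, Tuple


def solve_dense(A: List[List[float]], b: List[float]) -> List[float]:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def pd_step(x: Sequence[float], v: Sequence[float], mass: float,
            springs: Sequence[Tuple[int, int]], rest: float,
            k: float, h: float, iters: int = 10):
    """One implicit time step of a 1D spring chain via local-global iteration."""
    n = len(x)
    y = [xi + h * vi for xi, vi in zip(x, v)]  # inertial prediction
    # Constant global matrix: M / h^2 + k * sum_c S_c^T S_c, with S_c = e_j - e_i.
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = mass / h ** 2
    for i, j in springs:
        A[i][i] += k
        A[j][j] += k
        A[i][j] -= k
        A[j][i] -= k
    x_new = list(y)
    for _ in range(iters):
        # Local step: project each spring onto its rest length, keeping orientation.
        p = [rest if x_new[j] - x_new[i] >= 0.0 else -rest for i, j in springs]
        # Global step: solve the linear system with the projections held fixed.
        b = [mass / h ** 2 * yi for yi in y]
        for (i, j), pc in zip(springs, p):
            b[i] -= k * pc
            b[j] += k * pc
        x_new = solve_dense(A, b)
    v_new = [(a - b0) / h for a, b0 in zip(x_new, x)]
    return x_new, v_new


# A stretched two-particle spring relaxes toward its rest length.
x, v = [0.0, 2.0], [0.0, 0.0]
for _ in range(200):
    x, v = pd_step(x, v, mass=1.0, springs=[(0, 1)], rest=1.0, k=100.0, h=0.1)
```

Because the matrix is constant, a production solver would typically factor it once and reuse the factorization; the dense solve here is only for brevity.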
CFX
-
Robust skeleton extraction and skinning weight computation from mesh animation sequences using SSDR with improved stability.
Abstract
We introduce an example-based rigging approach to automatically generate linear blend skinning models with skeletal structure. Based on a set of example poses, our approach can output its skeleton, joint positions, linear blend skinning weights, and corresponding bone transformations. The output can be directly used to set up skeleton-based animation in various 3D modeling and animation software as well as game engines. Specifically, we formulate the solving of a linear blend skinning model with a skeleton as an optimization with joint constraints and weight smoothness regularization, and solve it using an iterative rigging algorithm that (i) alternately updates skinning weights, joint locations, and bone transformations, and (ii) automatically prunes redundant bones that can be generated by an over-estimated bone initialization. Due to the automatic redundant bone pruning, our approach is more robust than existing example-based rigging approaches. Furthermore, in terms of rigging accuracy, even with a single set of parameters, our approach can soundly outperform state of the art methods on various types of experimental datasets including humans, quadruped animals, and highly deformable models.
Related Smooth Skinning Decomposition with Rigid Bones · A Statistical Model of Human Pose and Body Shape · One Model to Rig Them All: Diverse Skeleton Rigging with UniRig · Automatic Rigging and Animation of 3D Characters
How to read this
- Category
- Method: example-based skeletal rigging and skinning from mesh sequences
- Contributions
- An automatic approach that outputs a skeleton, joint positions, linear blend skinning weights, and bone transformations from a set of example poses
- Formulation as an optimization with joint constraints and weight-smoothness regularization, solved by an iterative algorithm that alternates over weights, joints, and bone transforms
- Automatic pruning of redundant bones from an over-estimated initialization, improving robustness over prior example-based rigging
- Context
- Extends example-based and automatic rigging (related to Baran and Popovic's 'Automatic Rigging and Animation of 3D Characters') and the 'Smooth Skinning Decomposition with Rigid Bones' line of work, adding redundant-bone pruning for stability. Builds on: Automatic Rigging and Animation of 3D Characters
- Correctness
- The output is directly usable in standard DCC tools and game engines; accuracy is reported to outperform state-of-the-art methods, even with a single parameter set, across humans, quadrupeds, and highly deforming meshes, but results are bounded by the representativeness of the supplied example poses.
- Clarity
- Accessible for practitioners; a first pass conveys the LBS-with-skeleton optimization and the pruning idea, with a second pass for the alternating-update math.
- How to read it
- Read for the optimization structure (alternating weights/joints/transforms) and especially the bone-pruning mechanism that drives the robustness; second pass on the constraints and regularization if you intend to rig from your own sequences.
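The linear blend skinning model that the method fits can be evaluated with very little code. The following is a minimal 2D sketch under stated assumptions: each bone transform is a rotation angle plus a translation, and the names `apply_bone` and `linear_blend_skinning` are hypothetical. It shows only the forward model (weights and transforms to deformed vertices), not the paper's optimization or bone pruning.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Bone = Tuple[float, float, float]  # (rotation angle in radians, tx, ty)


def apply_bone(bone: Bone, p: Point) -> Point:
    """Rotate p about the origin by the bone angle, then translate."""
    angle, tx, ty = bone
    c, s = math.cos(angle), math.sin(angle)
    return (c * p[0] - s * p[1] + tx, s * p[0] + c * p[1] + ty)


def linear_blend_skinning(rest: Sequence[Point],
                          weights: Sequence[Sequence[float]],
                          bones: Sequence[Bone]) -> List[Point]:
    """v_i' = sum_j w_ij * (R_j v_i + t_j); each weight row should sum to 1."""
    out = []
    for p, row in zip(rest, weights):
        x = y = 0.0
        for w, bone in zip(row, bones):
            q = apply_bone(bone, p)
            x += w * q[0]
            y += w * q[1]
        out.append((x, y))
    return out


# Two bones: one at rest, one rotated 90 degrees about the origin.
bones = [(0.0, 0.0, 0.0), (math.pi / 2, 0.0, 0.0)]
rest = [(1.0, 0.0), (1.0, 0.0)]
weights = [(1.0, 0.0), (0.5, 0.5)]
posed = linear_blend_skinning(rest, weights, bones)
```

The second vertex lands at (0.5, 0.5), closer to the joint than either bone would place it. That shrinkage under blended rotations is the well-known artifact of averaging transformed positions linearly.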
Rigging / Skinning
-
Data-driven capture framework fitting hair strands to multi-view point clouds via a voting algorithm using a database of physics-simulated example strands.
Abstract
We introduce a data-driven hair capture framework based on example strands generated through hair simulation. Our method can robustly reconstruct faithful 3D hair models from unprocessed input point clouds with large amounts of outliers. Current state-of-the-art techniques use geometrically-inspired heuristics to derive global hair strand structures, which can yield implausible hair strands for hairstyles involving large occlusions, multiple layers, or wisps of varying lengths. We address this problem using a voting-based fitting algorithm to discover structurally plausible configurations among the locally grown hair segments from a database of simulated examples. To generate these examples, we exhaustively sample the simulation configurations within the feasible parameter space constrained by the current input hairstyle. The number of necessary simulations can be further reduced by leveraging symmetry and constrained initial conditions. The final hairstyle can then be structurally represented by a limited number of examples. To handle constrained hairstyles such as a ponytail of which realistic simulations are more difficult, we allow the user to sketch a few strokes to generate strand examples through an intuitive interface. Our approach focuses on robustness and generality.
Related Structure-Aware Hair Capture · Strand-Accurate Multi-View Hair Capture · Simulation-Ready Hair Capture · Single-View Hair Modeling Using a Hairstyle Database
How to read this
- Category
- Capture method: data-driven hair reconstruction from point clouds
- Contributions
- A data-driven hair capture framework that reconstructs faithful 3D hair models from unprocessed multi-view point clouds with many outliers
- A voting-based fitting algorithm that finds structurally plausible strand configurations using a database of physics-simulated example strands
- Exhaustive sampling of simulation configurations (reduced via symmetry and constrained initial conditions), plus user sketch strokes for constrained styles like ponytails
- Context
- Builds on geometry-driven multi-view hair capture (Luo et al.'s 'Structure-Aware Hair Capture'), replacing purely geometric heuristics with priors drawn from physically simulated example strands. Builds on: Structure-Aware Hair Capture
- Correctness
- Robustness comes from constraining strands to physically simulable examples, which helps with occlusion, layering, and varying wisp lengths; the trade-off is that fidelity depends on how well the sampled simulation space covers the real hairstyle, and difficult constrained styles still need user sketches.
- Clarity
- Accessible; a first pass conveys the simulate-database-then-vote idea, and a second pass pays off for the voting/fitting algorithm and the sampling strategy.
- How to read it
- Read for the pipeline (simulate examples, grow local segments, vote for plausible global structure); focus on what the example database covers and where the user sketch input is required, then a second pass on the voting algorithm if reproducing.
CFX
-
A robust scheme for collapsed and inverted elements in interactive corotational FEM, giving faster, smoother recovery from degenerate configurations in deformable-object simulation.
Abstract
Addresses robust, efficient treatment of element collapse and inversion in corotational finite-element simulations of deformable objects in 2D and 3D. The method avoids flaws of prior approaches, yields faster and smoother recovery from degenerate configurations, and widens the range of well-behaved degenerate elements, improving the stability of interactive soft-body simulation.
Related Fast Contact Determination for Intersecting Deformable Solids · Physically Based Rigging for Deformable Characters · A Unified Approach for Subspace Simulation of Deformable Bodies in Multiple Domains · Projective Dynamics: Fusing Constraint Projections for Fast Simulation
How to read this
- Category
- Method: robust degenerate-element handling in corotational FEM
- Contributions
- A robust scheme for treating element collapse and inversion in corotational FEM simulations of deformable objects in 2D and 3D
- Faster and smoother recovery from degenerate configurations than prior approaches, avoiding their flaws
- A widened range of well-behaved degenerate elements, improving stability of interactive soft-body simulation
- Context
- Sits in the deformable-object FEM lineage (Terzopoulos et al.'s 'Elastically Deformable Models') and the corotational-elasticity tradition, targeting the long-known problem of inverted and collapsed elements. Builds on: Elastically Deformable Models
- Correctness
- Demonstrated for interactive corotational FEM in 2D and 3D with improved recovery from degenerate configurations; it addresses element degeneracy specifically rather than overhauling the constitutive model, so view it as a robustness fix within the corotational framework, not a replacement for it.
- Clarity
- Technical but focused; a first pass conveys the degeneracy problem and the claimed improvement, with a second pass needed for the recovery scheme details.
- How to read it
- Read if you hit instability from inverted/collapsed elements; focus on how the method detects and recovers from degeneracy versus prior schemes, and do a second pass on the formulation before integrating it into a corotational solver.
CFX
-
Describes wind field rigs and simulation integration developed for Frozen, capturing cloth and hair interaction with arctic environmental wind conditions.
Abstract
This paper describes the techniques used by the Character Simulation team to capture wind interaction with the cloth and hair simulations in Disney's Frozen. The approach combined an improved lift and drag model, switching from linear Stokes' drag to quadratic forces suitable for high Reynolds number fluids like air, with new SeExpr-based authoring tools for crafting and visualizing wind fields. Windicator and gust rigs gave directors and animators a way to plan wind performance and continuity before cloth and hair were simulated. Together these tools let artists author physically plausible wind behavior while hitting the desired art direction across Layout, Technical Animation and Effects.
Related Scriptable Character FX Solution · Choreography of Hair and Cloth in Disney's Moana 2 · Art-Directed Costumes at Pixar: Design, Tailoring, and Simulation in Production · Anisotropic Elastoplasticity for Cloth, Knit and Hair Frictional Contact
How to read this
- Category
- Production talk / pipeline paper: wind authoring and simulation for cloth and hair
- Contributions
- An improved aerodynamic force model, switching from linear Stokes' drag to quadratic forces suited to high Reynolds number air
- SeExpr-based authoring tools for crafting and visualizing wind fields
- Windicator and gust rigs that let directors and animators plan wind performance and continuity before simulation
- Context
- A production-tech effort on Frozen that follows Disney's earlier character-simulation work, notably 'Simulating Rapunzel's Hair in Disney's Tangled', extending environmental forcing of cloth and hair to arctic wind conditions. Builds on: Simulating Rapunzel's Hair in Disney's Tangled
- Correctness
- Aims for physically plausible behavior that still hits art direction across Layout, Technical Animation and Effects. This is studio practice tuned for a single film rather than a peer-reviewed general method, so treat the quadratic drag model and the rigs as production-proven choices that were not validated against ground-truth aerodynamics.
- Clarity
- Accessible as a workflow description; a first pass conveys the tooling idea, a second pass helps if you want the lift and drag formulation.
- How to read it
- Read for the authoring workflow and the drag model change first; focus on how Windicator and gust rigs slot into the pipeline, and only do a second pass for the force model if you plan to reimplement it.
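The difference between linear Stokes' drag and a quadratic drag force is easy to see numerically. This is a hedged sketch using the textbook forms only: the coefficient values, the reference area, and the function names are assumptions, and the lift term and the studio's actual parameterization are not modeled.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def relative_wind(wind: Vec3, velocity: Vec3) -> Vec3:
    """Air velocity as seen from the moving cloth or hair sample."""
    return tuple(w - v for w, v in zip(wind, velocity))


def stokes_drag(wind: Vec3, velocity: Vec3, c: float) -> Vec3:
    """Linear drag: F = c * u, proportional to relative speed."""
    u = relative_wind(wind, velocity)
    return tuple(c * x for x in u)


def quadratic_drag(wind: Vec3, velocity: Vec3,
                   rho: float, cd: float, area: float) -> Vec3:
    """Quadratic drag: F = 0.5 * rho * cd * area * |u| * u."""
    u = relative_wind(wind, velocity)
    speed = math.sqrt(sum(x * x for x in u))
    k = 0.5 * rho * cd * area * speed
    return tuple(k * x for x in u)


still = (0.0, 0.0, 0.0)
# Hypothetical values: air density 1.2 kg/m^3, drag coefficient 1.0, area 0.5 m^2.
lin_10 = stokes_drag((10.0, 0.0, 0.0), still, c=0.1)
lin_20 = stokes_drag((20.0, 0.0, 0.0), still, c=0.1)
quad_10 = quadratic_drag((10.0, 0.0, 0.0), still, rho=1.2, cd=1.0, area=0.5)
quad_20 = quadratic_drag((20.0, 0.0, 0.0), still, rho=1.2, cd=1.0, area=0.5)
```

Doubling the wind speed doubles the linear force but quadruples the quadratic one, which is why gusts read so differently under the two models.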
CFX
-
Comprehensive SIGGRAPH course covering linear blend skinning (LBS), dual quaternion skinning (DQS), pose-space deformation, and example-based skinning methods for real-time character deformation.
Skinning
-
Physics-based surface-only skinning method that captures full volumetric flesh behavior using surface degrees of freedom with collision support.
Abstract
Presents a surface-based elastic model for physics-based character skinning that uses the Steklov-Poincare operator to map boundary displacements directly to boundary forces while capturing volumetric flesh behavior. The method employs ex-rotated elasticity to achieve affine force models with pose-dependent coefficients, avoiding the nonlinearity of corotational elasticity. Includes full collision handling with preconditioning techniques enabling interactive simulation of high-resolution flesh models under self-collisions and external impacts.
Related Data-Driven Physics for Human Soft Tissue Animation · Computational Bodybuilding: Anatomically-Based Modeling of Human Bodies · Hand Modeling and Simulation Using Stabilized Magnetic Resonance Imaging · NIMBLE: A Non-rigid Hand Model with Bones and Muscles
How to read this
- Category
- Method: a physics-based, surface-only character skinning algorithm
- Contributions
- A surface elastic model using the Steklov-Poincare operator to map boundary displacements to boundary forces while still capturing volumetric flesh behavior
- An ex-rotated elasticity formulation giving affine force models with pose-dependent coefficients, avoiding corotational nonlinearity
- Full collision handling with preconditioning that enables interactive simulation under self-collisions and external impacts
- Context
- Sits in the physics-based skinning lineage and contrasts with geometric weight-based approaches such as 'Bounded Biharmonic Weights for Real-Time Deformation', trading interpolation weights for an elasticity-driven surface formulation. Builds on: Bounded Biharmonic Weights for Real-Time Deformation
- Correctness
- The central assumption is that volumetric flesh response can be represented through boundary (surface) degrees of freedom via the Steklov-Poincare operator; demonstrated on high-resolution flesh models with collisions, but readers should note the ex-rotated affine model is an approximation of true corotational elasticity and applicability outside the shown soft-tissue cases is not established here.
- Clarity
- Technical; a first pass gives the surface-DOF intuition, but the operator and ex-rotated elasticity need a careful second pass.
- How to read it
- First pass for the high-level idea of reducing to surface DOFs; budget a focused second pass on the Steklov-Poincare mapping and ex-rotated elasticity, plus a third on the preconditioning if you intend to implement the collision solver.
Skinning / Muscles
-
PBD extension constraining Green-St Venant strain tensor entries directly, enabling anisotropic cloth behavior without polar decomposition overhead.
Abstract
This paper introduces a new set of constraints within the Position Based Dynamics framework that control strain in directions independent of the simulation mesh edges. Instead of constraining distances between points, the method constrains the entries of the Green St Venant strain tensor, so varying the per-coefficient stiffness produces anisotropic behavior without the polar decomposition required by most strain limiting approaches. The authors propose modified diagonal constraints solvable in a single step and modified off-diagonal constraints that decouple stretch from shear resistance, and they add separate volume and area conservation constraints. Because the constraints are formulated within PBD, they can drive the actual simulation of deformable objects such as cloth and tetrahedral solids rather than only limiting overstretched material.
Related Nonlinear Cloth Simulation with Isogeometric Analysis · Multi-Resolution Isotropic Strain Limiting · Projective Dynamics: Fusing Constraint Projections for Fast Simulation · Codimensional Incremental Potential Contact
How to read this
- Category
- Method: a Position Based Dynamics extension for strain-based deformable simulation
- Contributions
- Constraints on the entries of the Green St Venant strain tensor rather than on mesh edge distances, giving anisotropy without polar decomposition
- Modified diagonal constraints solvable in a single step and off-diagonal constraints that decouple stretch from shear
- Separate volume and area conservation constraints, with the formulation driving the actual simulation rather than only limiting overstretch
- Context
- A direct extension of 'Position Based Dynamics', recasting elastic behavior as strain-tensor constraints within the PBD constraint-projection framework. Builds on: Position Based Dynamics
- Correctness
- Works within PBD, so stiffness behavior is iteration- and timestep-dependent as is typical for the method, and results are demonstrated on cloth and tetrahedral solids; readers should remember PBD strain control is geared to plausible, controllable behavior rather than physically calibrated material response.
- Clarity
- Accessible if you know PBD; a first pass conveys the strain-constraint idea, a second pass gives the per-constraint projection formulas.
- How to read it
- Read with the PBD paper at hand; focus first on what each strain-tensor constraint controls, then do a second pass on the diagonal and off-diagonal constraint derivations to implement the solver step.
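To see what the strain-tensor entries measure, it helps to evaluate the tensor for a single element. The sketch below computes the standard Green-St Venant strain E = (F^T F - I) / 2 for a 2D triangle; it only evaluates the tensor and does not reproduce the paper's constraint projections. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Vec2 = Tuple[float, float]
Mat2 = List[List[float]]


def edge_matrix(a: Vec2, b: Vec2, c: Vec2) -> Mat2:
    """2x2 matrix whose columns are the edge vectors b - a and c - a."""
    return [[b[0] - a[0], c[0] - a[0]],
            [b[1] - a[1], c[1] - a[1]]]


def inverse(m: Mat2) -> Mat2:
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if det == 0.0:
        raise ValueError("degenerate rest triangle")
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def matmul(a: Mat2, b: Mat2) -> Mat2:
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(m: Mat2) -> Mat2:
    return [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]


def green_strain(rest: Sequence[Vec2], deformed: Sequence[Vec2]) -> Mat2:
    """E = 0.5 * (F^T F - I), with deformation gradient F = Ds * Dm^-1."""
    F = matmul(edge_matrix(*deformed), inverse(edge_matrix(*rest)))
    C = matmul(transpose(F), F)
    return [[0.5 * (C[i][j] - (1.0 if i == j else 0.0)) for j in range(2)]
            for i in range(2)]


rest = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
stretched = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0)]   # doubled along x
rotated = [(0.0, 0.0), (0.0, 1.0), (-1.0, 0.0)]    # rigid 90 degree turn
sheared = [(0.0, 0.0), (1.0, 0.0), (0.5, 1.0)]     # x' = x + 0.5 * y

E_stretch = green_strain(rest, stretched)
E_rot = green_strain(rest, rotated)
E_shear = green_strain(rest, sheared)
```

Diagonal entries respond to stretch along the material axes, off-diagonal entries to shear, and a rigid rotation leaves the tensor at zero. That rotation invariance is why constraining the entries directly needs no polar decomposition.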
CFX
-
Subspace cloth simulation using pose-dependent adaptive bases, enabling fast and plausible garment simulation for animated characters.
Abstract
We present a new approach to clothing simulation using low-dimensional linear subspaces with temporally adaptive bases. Our method exploits full-space simulation training data in order to construct a pool of low-dimensional bases distributed across pose space. For this purpose, we interpret the simulation data as offsets from a kinematic deformation model that captures the global shape of clothing due to body pose. During subspace simulation, we select low-dimensional sets of basis vectors according to the current pose of the character and the state of its clothing. Thanks to this adaptive basis selection scheme, our method is able to reproduce diverse and detailed folding patterns with only a few basis vectors. Our experiments demonstrate the feasibility of subspace clothing simulation and indicate its potential in terms of quality and computational efficiency.
Related Efficient Simulation of Example-Based Materials · A Pixel-Based Framework for Data-Driven Clothing · A Unified Approach for Subspace Simulation of Deformable Bodies in Multiple Domains · Progressive Simulation for Cloth Quasistatics
How to read this
- Category
- Method: subspace (reduced-order) clothing simulation with adaptive bases
- Contributions
- A pose-distributed pool of low-dimensional linear bases built from full-space simulation training data
- Interpretation of simulation data as offsets from a kinematic deformation model capturing pose-driven global garment shape
- Temporally adaptive basis selection from current pose and cloth state, reproducing detailed folds with few basis vectors
- Context
- Builds on data-driven reduced clothing work such as 'Stable Spaces for Real-time Clothing', advancing it from a fixed subspace to pose-dependent adaptive bases. Builds on: Stable Spaces for Real-time Clothing
- Correctness
- Quality depends on coverage of the training poses and on the assumption that garment behavior is well represented as offsets from the kinematic model; the paper frames results as demonstrating feasibility and potential, so generalization beyond trained pose space is a limitation to keep in mind.
- Clarity
- Readable; a first pass conveys the adaptive-basis idea, a second pass clarifies the basis construction and selection scheme.
- How to read it
- First pass for the adaptive-basis concept and the offset-from-kinematic-model framing; do a second pass on how bases are distributed across pose space and selected at runtime if reduced-order quality matters to you.
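The offset-from-kinematic-model framing and the idea of picking basis vectors by pose can be sketched as follows. This is an illustration under loud assumptions: the nearest-pose rule is a stand-in for the paper's selection scheme (which also uses the cloth state), the subspace dynamics solve is omitted, and `select_bases` and `reconstruct` are hypothetical names.

```python
from typing import List, Sequence, Tuple

Vec = List[float]
PoolEntry = Tuple[Sequence[float], Vec]  # (pose key, basis vector)


def select_bases(pool: Sequence[PoolEntry], pose: Sequence[float], k: int) -> List[Vec]:
    """Assumed selection rule: take the k basis vectors whose pose keys are nearest."""
    def dist2(entry: PoolEntry) -> float:
        return sum((a - b) ** 2 for a, b in zip(entry[0], pose))
    ranked = sorted(pool, key=dist2)
    return [vec for _, vec in ranked[:k]]


def reconstruct(kinematic: Vec, bases: Sequence[Vec], q: Sequence[float]) -> Vec:
    """Cloth state = kinematic garment shape + sum_i q_i * u_i (the subspace offset)."""
    return [x0 + sum(qi * u[i] for qi, u in zip(q, bases))
            for i, x0 in enumerate(kinematic)]


# Hypothetical pool: one-dimensional pose keys, three-coordinate basis vectors.
pool = [
    ((0.0,), [1.0, 0.0, 0.0]),
    ((1.0,), [0.0, 1.0, 0.0]),
    ((2.0,), [0.0, 0.0, 1.0]),
]
active = select_bases(pool, pose=(0.9,), k=2)
cloth = reconstruct([1.0, 1.0, 1.0], active, q=[0.5, -0.25])
```

Only the reduced coordinates `q` evolve during simulation, so the cost scales with the number of active basis vectors rather than with the mesh resolution.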
CFX
-
Unified particle system handling cloth, fluids, rigid bodies, and soft bodies in a single GPU-accelerated position-based framework.
Abstract
We present a unified dynamics framework for real-time visual effects. Using particles connected by constraints as our fundamental building block allows us to treat contact and collisions in a unified manner, and we show how this representation is flexible enough to model gases, liquids, deformable solids, rigid bodies and cloth with two-way interactions. We address some common problems with traditional particle-based methods and describe a parallel constraint solver based on position-based dynamics that is efficient enough for real-time applications.
Related Position Based Dynamics · Vivace: A Practical Gauss-Seidel Method for Stable Soft Body Dynamics · Small Steps in Physics Simulation · Wrinkle Meshes
How to read this
- Category
- Method / framework: a unified particle solver for real-time visual effects
- Contributions
- Particles connected by constraints as a single building block for gases, liquids, deformable and rigid bodies, and cloth, with two-way interaction
- Unified handling of contact and collisions across all object types in one representation
- A parallel position-based-dynamics constraint solver efficient enough for real-time, addressing common particle-method problems
- Context
- Generalizes 'Position Based Dynamics' into a single particle-and-constraint framework spanning multiple material types, foundational to later unified solvers. Builds on: Position Based Dynamics
- Correctness
- The unifying assumption is that diverse materials can be modeled as constrained particles solved by PBD, which favors real-time stability and breadth over per-material physical accuracy; readers should treat it as a versatile real-time framework rather than a high-fidelity per-domain simulator.
- Clarity
- Clearly written and broad; a first pass conveys the unified approach, with specific constraints worth a targeted second pass.
- How to read it
- Read first for the unifying idea and the catalog of object types it supports; do a second pass on the particular constraint formulations and the parallel solver if you plan to build on the framework.
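The building block behind particles connected by constraints is the position-based projection of a single constraint. Below is a minimal sketch of the standard distance-constraint projection from Position Based Dynamics, with inverse masses as weights; it is not the framework's parallel solver, and the function name and `stiffness` parameter are assumptions.

```python
import math
from typing import List, Sequence, Tuple


def project_distance(p1: Sequence[float], p2: Sequence[float],
                     w1: float, w2: float, rest: float,
                     stiffness: float = 1.0) -> Tuple[List[float], List[float]]:
    """Move two particles so that |p2 - p1| approaches rest.

    w1 and w2 are inverse masses; an inverse mass of 0 pins the particle.
    """
    d = [b - a for a, b in zip(p1, p2)]
    length = math.sqrt(sum(c * c for c in d))
    if length == 0.0 or w1 + w2 == 0.0:
        return list(p1), list(p2)  # direction undefined or both pinned
    n = [c / length for c in d]            # unit vector from p1 to p2
    s = stiffness * (length - rest) / (w1 + w2)
    new_p1 = [a + w1 * s * c for a, c in zip(p1, n)]
    new_p2 = [b - w2 * s * c for b, c in zip(p2, n)]
    return new_p1, new_p2


# Equal masses share the correction; a pinned particle leaves it all to the other.
a, b = project_distance((0.0, 0.0, 0.0), (2.0, 0.0, 0.0), 1.0, 1.0, rest=1.0)
pinned, free = project_distance((0.0, 0.0, 0.0), (2.0, 0.0, 0.0), 0.0, 1.0, rest=1.0)
```

A solver in this style sweeps such projections over all constraints several times per step; contact, cloth, and rigid shapes differ mainly in which constraint function is projected.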
CFX
-
Yarn-level simulation of woven cloth using efficient contact handling between yarns to reproduce fabric appearance and mechanical behavior.
Abstract
The large-scale mechanical behavior of woven cloth is determined by the mechanical properties of the yarns, the weave pattern, and frictional contact between yarns. Using standard simulation methods for elastic rod models and yarn-yarn contact handling, the simulation of woven garments at realistic yarn densities is deemed intractable. This paper introduces an efficient solution for simulating woven cloth at the yarn level. Central to our solution is a novel discretization of interlaced yarns based on yarn crossings and yarn sliding, which allows modeling yarn-yarn contact implicitly, avoiding contact handling at yarn crossings altogether. Combined with models for internal yarn forces and inter-yarn frictional contact, as well as a massively parallel solver, we are able to simulate garments with hundreds of thousands of yarn crossings at practical frame-rates on a desktop machine, showing combinations of large-scale and fine-scale effects induced by yarn-level mechanics.
Related Homogenized Yarn-Level Cloth · PBNS: Physically Based Neural Simulation for Unsupervised Garment Pose Space Deformation · Directing Cloth Draping through Blended UVs · Untangling Cloth
How to read this
- Category
- Method: yarn-level simulation of woven cloth
- Contributions
- A discretization of interlaced yarns based on yarn crossings and yarn sliding that models yarn-yarn contact implicitly, avoiding explicit contact handling at crossings
- Internal yarn force and inter-yarn frictional contact models combined with this discretization
- A massively parallel solver enabling woven garments with hundreds of thousands of yarn crossings at practical frame rates on a desktop
- Context
- Extends yarn-level cloth simulation, following 'Simulating Knitted Cloth at the Yarn Level', from knits toward woven fabrics with a tractable contact treatment. Builds on: Simulating Knitted Cloth at the Yarn Level
- Correctness
- The key assumption is that yarn-yarn contact at crossings can be encoded implicitly by the crossing-and-sliding discretization rather than resolved explicitly, which is what makes realistic yarn densities tractable; demonstrated on woven garments, so applicability is strongest for woven structures and rests on the elastic-rod and friction models chosen.
- Clarity
- Fairly technical; a first pass conveys the crossing-based idea, a second pass is needed for the discretization and contact formulation.
- How to read it
- First pass to grasp why implicit crossing contact makes yarn-level weaving feasible; do a careful second pass on the crossing-and-sliding discretization and the friction model, and a third on the parallel solver for implementation.
CFX
2013
13
-
Real-time 3D face shape regression from RGB video, tracking both rigid head pose and non-rigid expression deformation simultaneously.
Abstract
We present a real-time performance-driven facial animation system based on 3D shape regression. In this system, the 3D positions of facial landmark points are inferred by a regressor from 2D video frames of an ordinary web camera. From these 3D points, the pose and expressions of the face are recovered by fitting a user-specific blendshape model to them. The main technical contribution of this work is the 3D regression algorithm that learns an accurate, user-specific face alignment model from an easily acquired set of training data, generated from images of the user performing a sequence of predefined facial poses and expressions. Experiments show that our system can accurately recover 3D face shapes even for fast motions, non-frontal faces, and exaggerated expressions. In addition, some capacity to handle partial occlusions and changing lighting conditions is demonstrated.
Related FaceWarehouse: A 3D Facial Expression Database for Visual Computing · Production-Level Facial Performance Capture Using Deep Convolutional Neural Networks · Face Transfer with Multilinear Models · 3D Gaussian Blendshapes for Head Avatar Animation
How to read this
- Category
- Method / system: real-time performance-driven facial animation
- Contributions
- Real-time facial animation from an ordinary webcam via 3D shape regression that infers 3D landmark positions from 2D frames
- Recovers head pose and expression by fitting a user-specific blendshape model to the regressed 3D points
- Learns the user-specific alignment model from an easily acquired set of predefined poses and expressions
- Context
- Rests on the morphable / blendshape face-model tradition (Blanz and Vetter's 'A Morphable Model for the Synthesis of 3D Faces'), pairing it with a learned 3D regressor for monocular tracking. Builds on: A Morphable Model for the Synthesis of 3D Faces
- Correctness
- Shown to recover shapes for fast motion, non-frontal faces, and exaggerated expressions with some tolerance to partial occlusion and lighting change; it requires a per-user training/calibration session and accuracy is tied to that user-specific model.
- Clarity
- Accessible system description; a first pass conveys the regress-then-fit pipeline, a second pass details the regressor training.
- How to read it
- First read the pipeline (2D to 3D regression, then blendshape fit) and the calibration step; a second pass on the regression training pays off if you care about tracking robustness or want to reproduce it.
Facial
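The fitting half of the regress-then-fit pipeline above can be illustrated with a small least-squares sketch. This is not the authors' solver: it assumes head pose has already been removed, treats the blendshape model as a neutral shape plus linear per-shape offsets sampled at the landmarks, and clips the weights to [0, 1] instead of solving a properly constrained problem. The function name and array layouts are hypothetical.

```python
import numpy as np

def fit_blendshape_weights(neutral, deltas, landmarks):
    """Least-squares blendshape weights, clipped to [0, 1] (a simplification).

    neutral:   (L, 3) neutral-pose landmark positions
    deltas:    (K, L, 3) per-blendshape landmark offsets from neutral
    landmarks: (L, 3) regressed 3D landmark positions for one frame
    """
    K = deltas.shape[0]
    A = deltas.reshape(K, -1).T            # (3L, K) basis matrix
    b = (landmarks - neutral).reshape(-1)  # (3L,) offset to explain
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.clip(w, 0.0, 1.0)

# Synthetic check: build landmarks from known weights, then recover them.
rng = np.random.default_rng(0)
neutral = rng.normal(size=(10, 3))
deltas = rng.normal(size=(4, 10, 3))
true_w = np.array([0.2, 0.0, 0.7, 0.5])
observed = neutral + np.tensordot(true_w, deltas, axes=1)
weights = fit_blendshape_weights(neutral, deltas, observed)
```

With noise-free synthetic landmarks the recovered weights match the ones used to build them; real regressed landmarks would add noise and a rigid pose term that this sketch leaves out.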
-
Transfers a full anatomical model (bones, muscles, fat) onto arbitrary character shapes, automating anatomy-based rig setup.
abstract
Characters with precise internal anatomy are important in film and visual effects, as well as in medical applications. We propose the first semi-automatic method for creating anatomical structures, such as bones, muscles, viscera and fat tissues. This is done by transferring a reference anatomical model from an input template to an arbitrary target character, only defined by its boundary representation (skin). The fat distribution of the target character needs to be specified. We can either infer this information from MRI data, or allow the users to express their creative intent through a new editing tool. The rest of our method runs automatically: it first transfers the bones to the target character, while maintaining their structure as much as possible. The bone layer, along with the target skin eroded using the fat thickness information, are then used to define a volume where we map the internal anatomy of the source model using harmonic (Laplacian) deformation. This way, we are able to quickly generate anatomical models for a large range of target characters, while maintaining anatomical constraints.
Related Anatomically Based Modeling · Anatomy-Based Modeling of the Human Musculature · OSSO: Obtaining Skeletal Shape from Outside · Differentiable Simulation of Inertial Musculotendons
how to read this
- Category
- Method: anatomy transfer for rig setup
- Contributions
- First semi-automatic method to create internal anatomy (bones, muscles, viscera, fat) by transferring a reference model to a target defined only by its skin
- Transfers bones while preserving their structure, then maps interior anatomy via harmonic (Laplacian) deformation into a volume bounded by skin and bone
- Drives the target fat distribution either from MRI data or from an interactive editing tool for creative control
- Context
- Extends anatomically based character modeling (Teran et al.'s Creating and Simulating Skeletal Muscle from the Visible Human Data Set) by automating anatomy setup onto arbitrary target shapes. Builds on: Creating and Simulating Skeletal Muscle from the Visible Human Data Set
- Correctness
- Demonstrated across a range of target characters while maintaining anatomical constraints; it is semi-automatic (fat distribution must be supplied), and transfer quality depends on the correspondence between template and target, which is harder to establish when the target differs strongly in proportion.
- Clarity
- Accessible pipeline; a first pass conveys the transfer stages, a second pass is needed for the harmonic mapping and constraint handling.
- How to read it
- Focus on the bone-then-interior transfer stages and the fat-distribution input; a second pass on the Laplacian volume mapping pays off if you intend to set up anatomy-based rigs.
Muscles
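The harmonic (Laplacian) mapping mentioned in this entry can be illustrated on a toy graph. The sketch below is an assumption-laden stand-in, not the paper's volumetric solver: it uses a unit-weight graph Laplacian, treats some nodes as boundary (standing in for the bone layer and the eroded skin) with known displacements, and solves for the interior displacements that make the field harmonic. `harmonic_extend` and the chain example are hypothetical.

```python
import numpy as np

def harmonic_extend(n_nodes, edges, boundary):
    """Solve the graph Laplace equation for the interior nodes.

    edges:    iterable of (i, j) undirected edges with unit weight
    boundary: dict {node index: known displacement vector}
    Returns an (n_nodes, dim) array of displacements.
    """
    L = np.zeros((n_nodes, n_nodes))
    for i, j in edges:
        L[i, i] += 1.0
        L[j, j] += 1.0
        L[i, j] -= 1.0
        L[j, i] -= 1.0
    b_idx = sorted(boundary)
    i_idx = [k for k in range(n_nodes) if k not in boundary]
    xb = np.array([boundary[k] for k in b_idx], dtype=float)
    # Interior rows of L x = 0 give: L_ii x_i = -L_ib x_b
    xi = np.linalg.solve(L[np.ix_(i_idx, i_idx)],
                         -L[np.ix_(i_idx, b_idx)] @ xb)
    out = np.zeros((n_nodes, xb.shape[1]))
    out[b_idx] = xb
    out[i_idx] = xi
    return out

# Chain of 5 nodes: node 0 plays the bone layer, node 4 the eroded skin.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
disp = harmonic_extend(5, edges, {0: [0.0, 0.0, 0.0], 4: [4.0, 0.0, 2.0]})
```

On a chain the harmonic solution is a linear ramp between the two boundary displacements, which is the smooth, boundary-driven behaviour that makes harmonic maps attractive for carrying interior anatomy along with the skin and bones.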
-
Pixar production hair simulation for curly hair using a novel curl model that preserves artistic intent throughout the simulation.
CFX
-
Multi-camera capture and semi-parametric statistical model of shoulder-arm skin and muscle deformations across subjects and poses.
abstract
We present a comprehensive data‐driven statistical model for skin and muscle deformation of the human shoulder‐arm complex. Skin deformations arise from complex bio‐physical effects such as non‐linear elasticity of muscles, fat, and connective tissue; and vary with physiological constitution of the subjects and external forces applied during motion. Thus, they are hard to model by direct physical simulation. Our alternative approach is based on learning deformations from multiple subjects performing different exercises under varying external forces. We capture the training data through a novel multi‐camera approach that is able to reconstruct fine‐scale muscle detail in motion. The resulting reconstructions from several people are aligned into one common shape parametrization, and learned using a semi‐parametric non‐linear method. Our learned data‐driven model is fast, compact and controllable with a small set of intuitive parameters, pose, body shape and external forces, through which a novice artist can interactively produce complex muscle deformations. Our method is able to capture and synthesize fine‐scale muscle bulge effects to a greater level of realism than achieved previously. We provide quantitative and qualitative validation of our method.
Related Data-driven Modeling of Skin and Muscle Deformation · Anatomically Based Modeling · How to Build a Human: Practical Physics-Based Character Animation · A Neural Network Model for Efficient Musculoskeletal-Driven Skin Deformation
how to read this
- Category
- Capture system + data-driven statistical model
- Contributions
- Multi-camera capture approach that reconstructs fine-scale muscle detail of the shoulder-arm complex in motion
- Aligns reconstructions from several subjects into a common shape parametrization and learns a semi-parametric non-linear deformation model
- Yields a fast, compact, controllable model driven by intuitive parameters (pose, body shape, external forces) for interactive muscle bulging
- Context
- An alternative to direct physical muscle simulation (Teran et al.'s Creating and Simulating Skeletal Muscle from the Visible Human Data Set), learning skin and muscle deformation from captured multi-subject data instead. Builds on: Creating and Simulating Skeletal Muscle from the Visible Human Data Set
- Correctness
- Validated as capturing fine-scale bulge effects across subjects and external forces; as a learned model its fidelity is bounded by the captured subjects, poses, and force conditions, so extrapolation beyond the training distribution is uncertain.
- Clarity
- Accessible; a first pass conveys the capture-then-learn pipeline and the control parameters, a second pass clarifies the alignment and the semi-parametric fitting.
- How to read it
- Read for the capture rig and the parametrization/learning choices; a second pass on the statistical model pays off if you want to drive or extend muscle deformation from data.
Muscles / Skinning
-
Voxelization-based geodesic-distance approach that automatically computes LBS influence weights and tolerates non-manifold, multi-component production meshes.
abstract
Proposes a fully automatic method for computing skinning weights for character deformation using voxelization and geodesic distances from bone geometry. The method handles production meshes with non-manifold geometry and intersecting triangles by using a hardware-accelerated voxelization algorithm based on z-buffer slicing. Geodesic distances between skeleton bones and mesh boundary voxels are computed using Dijkstra's algorithm to generate smooth weights for linear blend skinning without requiring manual weight painting.
Related NeuroSkinning: Automatic Skin Binding for Production Characters with Deep Graph Networks · SkinMixer: Blending 3D Animated Models · Automatic Rigging and Animation of 3D Characters · Segmentation-Based Skinning
how to read this
- Category
- Method: automatic skinning-weight computation
- Contributions
- Fully automatic linear-blend-skinning weights from voxelized geodesic distances between skeleton bones and the mesh surface
- Hardware-accelerated voxelization via z-buffer slicing that tolerates non-manifold, intersecting, multi-component production meshes
- Computes bone-to-voxel geodesic distances with Dijkstra to produce smooth weights without manual weight painting
- Context
- Addresses the practical weight-binding problem for messy production geometry, an alternative to surface-based or harmonic weight schemes that assume clean manifold meshes.
- Correctness
- Aimed at robustness on real production meshes where clean-mesh methods fail. Geodesics are computed on a voxel grid, so weight smoothness and locality depend on voxel resolution; the resulting weights are an automatic starting point that may still need artist touch-up.
- Clarity
- Accessible and pragmatic; a first pass conveys the voxelize-then-geodesic idea, a second pass covers the slicing voxelization and distance computation.
- How to read it
- Focus on why voxelization buys robustness and how geodesic distance maps to weights; a second pass on the voxelization details is worth it if you bind weights on imperfect meshes.
Skinning / Rigging
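The voxelize-then-geodesic idea in this entry can be sketched on a tiny 2D occupancy grid. This is a minimal illustration under stated assumptions, not the paper's method: the grid is 2D and 4-connected with unit step cost, each bone is given as a list of source cells, and the inverse-distance falloff that turns distances into weights is a hypothetical choice made here for brevity.

```python
import heapq
import numpy as np

def geodesic_distances(solid, sources):
    """Multi-source Dijkstra over occupied grid cells (4-connected, unit steps)."""
    dist = np.full(solid.shape, np.inf)
    heap = []
    for s in sources:
        dist[s] = 0.0
        heapq.heappush(heap, (0.0, s))
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if d > dist[r, c]:
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < solid.shape[0] and 0 <= nc < solid.shape[1]
            if inside and solid[nr, nc] and d + 1.0 < dist[nr, nc]:
                dist[nr, nc] = d + 1.0
                heapq.heappush(heap, (d + 1.0, (nr, nc)))
    return dist

def skin_weights(solid, bones, falloff=2.0):
    """Normalized weights from inverse geodesic distance (falloff is hypothetical)."""
    d = np.stack([geodesic_distances(solid, b) for b in bones])  # (B, H, W)
    raw = 1.0 / (1.0 + d) ** falloff          # unreachable cells get 0
    total = raw.sum(axis=0)
    return np.where(total > 0, raw / np.where(total > 0, total, 1.0), 0.0)

# U-shaped "character": the two arms are close in space but far apart geodesically.
solid = np.array([[1, 0, 1],
                  [1, 0, 1],
                  [1, 1, 1]], dtype=bool)
bones = [[(0, 0)], [(0, 2)]]                  # one source cell per bone
w = skin_weights(solid, bones)
```

Because distance is measured through occupied cells only, the cell at row 1 of the left arm is dominated by the left bone even though the right arm is only two cells away in space. That locality is what a Euclidean distance would get wrong on this shape.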
-
Contour-based facial animation capture system enabling high-fidelity retargeting of subtle expressions to CG character rigs.
abstract
Presents a performance capture method that focuses on contour features of the face, particularly the eyelids and inner mouth, to capture subtle facial expressions missed by marker-only approaches. Uses a two-step optimization combining blendshape fitting with per-frame corrective shapes to match tracked contours and markers accurately.
Related Realtime Facial Animation with On-the-fly Correctives · Example-Based Facial Rigging · Direct Manipulation Blendshapes · FaceLab: Scalable Facial Performance Capture for Visual Effects
how to read this
- Category
- Capture system: contour-based facial performance capture and retargeting
- Contributions
- Performance capture that exploits contour features (eyelids and inner mouth) to recover subtle expressions missed by marker-only approaches
- Two-step optimization combining blendshape fitting with per-frame corrective shapes to match tracked contours and markers
- Enables high-fidelity retargeting of subtle expressions to CG character rigs
- Context
- Builds on high-resolution facial performance capture (Bradley et al.'s High Resolution Passive Facial Performance Capture), adding contour cues to improve fidelity in regions markers cannot track. Builds on: High Resolution Passive Facial Performance Capture
- Correctness
- Demonstrated on subtle expressions around eyes and mouth that markers miss; relies on robust contour tracking and on the corrective-shape step to absorb residual error, so capture quality depends on contour-detection reliability.
- Clarity
- Accessible; a first pass conveys the contours-plus-markers idea, a second pass clarifies the two-step blendshape-then-correctives optimization.
- How to read it
- Focus on which contours are tracked and how the corrective step complements blendshape fitting; a second pass pays off if facial-capture fidelity or retargeting is your goal.
Facial / Retargeting
-
Implicit surface composition on top of geometric skinning that produces contact and bulge effects at the joints in real time.
abstract
Geometric skinning techniques, such as smooth blending or dual-quaternions, are very popular in the industry for their high performances, but fail to mimic realistic deformations. Other methods make use of physical simulation or control volume to better capture the skin behavior, yet they cannot deliver real-time feedback. In this paper, we present the first purely geometric method handling skin contact effects and muscular bulges in real-time. The insight is to exploit the advanced composition mechanism of volumetric, implicit representations for correcting the results of geometric skinning techniques. The mesh is first approximated by a set of implicit surfaces. At each animation step, these surfaces are combined in real-time and used to adjust the position of mesh vertices, starting from their smooth skinning position. This deformation step is done without any loss of detail and seamlessly handles contacts between skin parts. As it acts as a post-process, our method fits well into the standard animation pipeline. Moreover, it requires no intensive computation step such as collision detection, and therefore provides real-time performances.
Related Patch-based Surface Relaxation · Delta Mush: Smoothing Deformations While Preserving Detail · Skinning: Real-time Shape Deformation · A Neural Network Model for Efficient Musculoskeletal-Driven Skin Deformation
how to read this
- Category
- Method: real-time skin deformation with contact
- Contributions
- First purely geometric real-time method to produce skin contact effects and muscular bulges at joints
- Approximates the mesh with a set of implicit surfaces and uses their volumetric composition operators to correct geometric-skinning positions
- Acts as a post-process on smooth-skinning output, with no loss of detail and no collision-detection step, so it fits the standard animation pipeline
- Context
- Corrects classic geometric skinning (smooth blending and Kavan et al.'s Skinning with Dual Quaternions) using implicit-surface composition to recover the contact and bulge effects those methods miss. Builds on: Skinning with Dual Quaternions
- Correctness
- Achieves real-time contact and bulge without physical simulation or explicit collision detection; results are an implicit-composition approximation of skin behavior rather than a physically validated one, and quality depends on how well the implicit surfaces approximate the mesh.
- Clarity
- Accessible motivation; a first pass conveys the post-process correction idea, a second pass is needed for the implicit-surface fitting and composition operators.
- How to read it
- First read why geometric skinning fails at joints and how implicit composition fixes it as a post-process; a second pass on the surface approximation and blending operators pays off for implementers.
Skinning
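The post-process correction described in this entry can be sketched as "remember each vertex's field value at bind time, then march the skinned vertex along the composed field's gradient until it sits on that value again". The code below is only an illustration of that idea: Gaussian blobs stand in for the per-bone implicit surfaces, a plain max stands in for the paper's composition operators, and all function names are hypothetical.

```python
import numpy as np

def blob(p, center, radius=1.0):
    """Smooth scalar field (Gaussian stand-in) and its gradient at point p."""
    diff = p - center
    value = np.exp(-np.dot(diff, diff) / radius**2)
    grad = value * (-2.0 * diff / radius**2)
    return value, grad

def union_field(p, centers):
    """Compose per-bone fields with a max (union) operator."""
    return max((blob(p, c) for c in centers), key=lambda vg: vg[0])

def project_to_iso(p, iso, centers, steps=20):
    """Newton-style march along the gradient until field(p) equals iso."""
    p = np.array(p, dtype=float)
    for _ in range(steps):
        value, grad = union_field(p, centers)
        g2 = np.dot(grad, grad)
        if abs(value - iso) < 1e-10 or g2 < 1e-20:
            break
        p += (iso - value) * grad / g2
    return p

# Bind time: record the field value at the vertex's rest position.
rest_centers = [np.array([-1.0, 0.0]), np.array([1.0, 0.0])]
vertex = np.array([0.0, 1.0])
iso, _ = union_field(vertex, rest_centers)

# Posed: the second bone's field has moved toward the vertex, so the
# smooth-skinned position now lies inside the composed surface.
posed_centers = [np.array([-1.0, 0.0]), np.array([0.5, 0.5])]
corrected = project_to_iso(vertex, iso, posed_centers)
```

The vertex is pushed out until the composed field returns to its stored value, which is how contact is resolved without an explicit collision test. A production version would add the tangential relaxation and gradient-aware blending the paper discusses, which this sketch omits.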
-
Precomputation approach for secondary cloth motion on articulated characters, enabling real-time playback of complex garment dynamics.
abstract
The central argument against data-driven methods in computer graphics rests on the curse of dimensionality: it is intractable to precompute "everything" about a complex space. In this paper, we challenge that assumption by using several thousand CPU-hours to perform a massive exploration of the space of secondary clothing effects on a character animated through a large motion graph. Our system continually explores the phase space of cloth dynamics, incrementally constructing a secondary cloth motion graph that captures the dynamics of the system. We find that it is possible to sample the dynamical space to a low visual error tolerance and that secondary motion graphs containing tens of gigabytes of raw mesh data can be compressed down to only tens of megabytes. These results allow us to capture the effect of high-resolution, off-line cloth simulation for a rich space of character motion and deliver it efficiently as part of an interactive application.
Related Homogenized Yarn-Level Cloth · MeshGraphNetRP: Improving Generalization of GNN-based Cloth Simulation · Directing Cloth Draping through Blended UVs · Untangling Cloth
how to read this
- Category
- Method: data-driven precomputation of secondary cloth motion
- Contributions
- Massive offline exploration of the secondary cloth dynamics space over a large motion graph, using thousands of CPU-hours
- An incrementally constructed secondary cloth motion graph that samples the phase space to a low visual error tolerance
- Compression of tens of gigabytes of mesh data down to tens of megabytes for real-time interactive playback
- Context
- Challenges the curse-of-dimensionality argument against data-driven cloth and builds on reduced/learned clothing models such as de Aguiar et al.'s Stable Spaces for Real-time Clothing. Builds on: Stable Spaces for Real-time Clothing
- Correctness
- The approach assumes the character motion lives within a (large but bounded) motion graph and that the dynamical space can be sampled to a low visual-error tolerance; results are demonstrated as interactive playback of precomputed cloth motion rather than novel free-form motion, so generalization beyond the explored space is the limitation to keep in mind.
- Clarity
- Accessible framing; a first pass conveys the precompute-then-compress idea, a second pass pays off for the sampling and compression scheme.
- How to read it
- Focus on how the phase space is sampled and how the motion graph is compressed; a second pass is worth it to understand the error tolerance and storage tradeoff that make playback interactive.
CFX
-
Online 3D morphable model fitting for real-time facial animation from depth sensor input with continuous model adaptation.
abstract
We present a new algorithm for realtime face tracking on commodity RGB-D sensing devices. Our method requires no user-specific training or calibration, or any other form of manual assistance, thus enabling a range of new applications in performance-based facial animation and virtual interaction at the consumer level. The key novelty of our approach is an optimization algorithm that jointly solves for a detailed 3D expression model of the user and the corresponding dynamic tracking parameters. Realtime performance and robust computations are facilitated by a novel subspace parameterization of the dynamic facial expression space. We provide a detailed evaluation that shows that our approach significantly simplifies the performance capture workflow, while achieving accurate facial tracking for realtime applications.
Related Avengers: Capturing Thanos's Complex Face · Realtime Facial Animation with On-the-fly Correctives · Realtime Performance-Based Facial Animation · Facial Performance Enhancement Using Dynamic Shape Space Analysis
how to read this
- Category
- Method: real-time facial tracking via online model fitting
- Contributions
- A calibration-free, training-free real-time face tracking algorithm for commodity RGB-D sensors
- Joint optimization that simultaneously solves for a detailed 3D expression model of the user and the dynamic tracking parameters
- A subspace parameterization of the dynamic facial expression space for robust, real-time computation
- Context
- Continues performance-based facial animation from depth input, building on Weise et al.'s Realtime Performance-Based Facial Animation by removing the user-specific training/calibration step. Builds on: Realtime Performance-Based Facial Animation
- Correctness
- Relies on commodity RGB-D input and on the assumption that a per-user expression model can be recovered online jointly with tracking; the authors provide an evaluation showing accurate tracking, but readers should remember that robustness depends on depth-sensor quality and on the expressiveness of the chosen subspace.
- Clarity
- Clearly motivated; a first pass conveys the online-modeling idea, a second pass is needed for the joint optimization and subspace formulation.
- How to read it
- Focus on what the joint solve estimates (model vs. tracking parameters) and the subspace parameterization; do a second pass on the optimization to see how calibration is avoided.
Facial
-
Real-time facial animation system that learns corrective shapes on-the-fly from video and depth input to capture actor-specific expressions without a separate training session.
abstract
We introduce a real-time and calibration-free facial performance capture framework based on a sensor with video and depth input. In this framework, we develop an adaptive PCA model using shape correctives that adjust on-the-fly to the actor's expressions through incremental PCA-based learning. Since the fitting of the adaptive model progressively improves during the performance, we do not require an extra capture or training session to build this model. As a result, the system is highly deployable and easy to use: it can faithfully track any individual, starting from just a single face scan of the subject in a neutral pose. Like many real-time methods, we use a linear subspace to cope with incomplete input data and fast motion. To boost the training of our tracking model with reliable samples, we use a well-trained 2D facial feature tracker on the input video and an efficient mesh deformation algorithm to snap the result of the previous step to high frequency details in visible depth map regions. We show that the combination of dense depth maps and texture features around eyes and lips is essential in capturing natural dialogues and nuanced actor-specific emotions. We demonstrate that using an adaptive PCA model not only improves the fitting accuracy for tracking but also increases the expressiveness of the retargeted character.
Related High Fidelity Facial Animation Capture and Retargeting with Contours · Facial Retargeting with Automatic Range of Motion Alignment · Avengers: Capturing Thanos's Complex Face · Online Modeling for Realtime Facial Animation
how to read this
- Category
- Method: real-time facial capture with on-the-fly corrective shapes
- Contributions
- A calibration-free real-time facial performance capture framework using video plus depth input
- An adaptive PCA model whose shape correctives adjust on-the-fly via incremental PCA-based learning, removing the need for a separate training session
- Combining dense depth maps with 2D texture features around eyes and lips, plus mesh deformation, to capture fine details and actor-specific expressions
- Context
- Shares the depth-sensor performance-capture lineage of Weise et al.'s Realtime Performance-Based Facial Animation, extending it with progressively learned corrective blendshapes from a single neutral scan. Builds on: Realtime Performance-Based Facial Animation
- Correctness
- Assumes a single neutral face scan as starting point and that a linear subspace suffices to cope with incomplete/fast input; fine detail depends on visible depth regions and the eye/lip texture features, so accuracy degrades where depth or those cues are weak.
- Clarity
- Readable and well motivated; a first pass conveys the adaptive-corrective idea, a second pass clarifies the incremental PCA and detail-snapping steps.
- How to read it
- Focus on the incremental PCA correctives and how the 2D feature tracker plus depth snapping add detail; a second pass pays off for the fitting pipeline order.
Facial
-
Facial blendshape transfer method that enforces smooth contact constraints during deformation to prevent interpenetration artifacts.
abstract
Improves facial blendshape synthesis via deformation transfer by addressing penetrations and surface crumpling. Introduces virtual triangles to preserve spatial relationships between face parts and adds Laplacian constraints to maintain smoothness. Applied successfully in film production to reduce facial rigging labor.
Related Transferring the Rig and Animations from a Character to Different Face Models · Reusable Facial Rigging and Animation: Create Once, Use Many · Direct Manipulation Blendshapes · Example-Based Facial Rigging
how to read this
- Category
- Method: contact-aware facial blendshape transfer
- Contributions
- Improved blendshape synthesis via deformation transfer that addresses penetrations and surface crumpling
- Virtual triangles to preserve spatial relationships between separate face parts
- Laplacian smoothness constraints, with reported use in film production to reduce facial rigging labor
- Context
- Extends Sumner and Popovic's Deformation Transfer for Triangle Meshes by adding contact and smoothness handling specific to faces. Builds on: Deformation Transfer for Triangle Meshes
- Correctness
- Targets a known failure mode (interpenetration and crumpling) of plain deformation transfer; validated through reported film-production use rather than a formal benchmark, so the evidence is qualitative and practitioner-oriented.
- Clarity
- Concise and applied (DigiPro venue); a first pass conveys the fix, a second pass clarifies the virtual-triangle construction.
- How to read it
- Focus on how virtual triangles and Laplacian constraints are added on top of deformation transfer; a single careful pass suffices for most readers, and a second is needed only if implementing.
Facial / Skinning
-
Multi-view hair capture system that reconstructs coherent, wisp-aware strand geometry from still photographs without special lighting setups.
abstract
Existing hair capture systems fail to produce strands that reflect the structures of real-world hairstyles. We introduce a system that reconstructs coherent and plausible wisps aware of the underlying hair structures from a set of still images without any special lighting. Our system first discovers locally coherent wisp structures in the reconstructed point cloud and the 3D orientation field, and then uses a novel graph data structure to reason about both the connectivity and directions of the local wisp structures in a global optimization. The wisps are then completed and used to synthesize hair strands which are robust against occlusion and missing data and plausible for animation and simulation. We show reconstruction results for a variety of complex hairstyles including curly, wispy, and messy hair.
Related Robust Hair Capture Using Simulated Examples · Single-View Hair Modeling Using a Hairstyle Database · Strand-Accurate Multi-View Hair Capture · Simulation-Ready Hair Capture
how to read this
- Category
- Capture system: structure-aware multi-view hair reconstruction
- Contributions
- A system that reconstructs coherent, plausible wisps from still images without special lighting
- Discovery of locally coherent wisp structures from the point cloud and 3D orientation field
- A graph data structure for globally optimizing wisp connectivity and direction, yielding strands robust to occlusion and missing data and suitable for animation/simulation
- Context
- Addresses the gap in prior hair capture systems that fail to reflect real hairstyle structure, drawing on multi-view stereo and orientation-field reconstruction for hair.
- Correctness
- Assumes enough multi-view coverage to recover a point cloud and orientation field; the global optimization is what fills occluded/missing regions, so plausibility (not measured ground-truth strand accuracy) is the claim, and results span curly, wispy, and messy styles.
- Clarity
- Accessible at a high level; a first pass conveys the wisp-then-global-graph pipeline, a second pass is needed for the graph optimization details.
- How to read it
- Focus on the two stages (local wisp discovery, then global graph optimization) and what makes output simulation-ready; a second pass pays off for the connectivity/direction reasoning.
CFX
-
Compact two-layer representation for dense skinning weights enabling efficient compression while preserving deformation quality.
abstract
Weighted linear interpolation has been widely used in many skinning techniques including linear blend skinning, dual quaternion blend skinning, and cage based deformation. To speed up performance, these skinning models typically employ a sparseness constraint, in which each 3D model vertex has a small fixed number of non-zero weights. However, the sparseness constraint also imposes certain limitations to skinning models and their various applications. This paper introduces an efficient two-layer sparse compression technique to substantially reduce the computational cost of a dense-weight skinning model, with insignificant loss of its visual quality. It can directly work on dense skinning weights or use example-based skinning decomposition to further improve its accuracy. Experiments and comparisons demonstrate that the introduced sparse compression model can significantly outperform state of the art weight reduction algorithms, as well as skinning decomposition algorithms with a sparseness constraint.
Related Smooth Skinning Decomposition with Rigid Bones · Direct Delta Mush Skinning Compression with Continuous Examples · Robust and Accurate Skeletal Rigging from Mesh Sequences · Skinning with Dual Quaternions
how to read this
- Category
- Method: compression of dense blend-skinning weights
- Contributions
- A two-layer sparse compression technique that substantially reduces the cost of dense-weight skinning with little visual-quality loss
- Works directly on dense weights or via example-based skinning decomposition for improved accuracy
- Reported to outperform state-of-the-art weight-reduction and sparse skinning-decomposition methods
- Context
- Applies broadly to weighted-interpolation skinning (linear blend, dual quaternion, cage-based) and builds on the authors' Smooth Skinning Decomposition with Rigid Bones. Builds on: Smooth Skinning Decomposition with Rigid Bones
- Correctness
- Premised on the idea that dense weights can be factored into two sparse layers with insignificant visual error; comparisons are against existing weight-reduction and decomposition baselines, so gains are relative to those and the 'insignificant loss' claim is a visual-quality judgment.
- Clarity
- Clear problem statement; a first pass conveys why sparseness constraints limit skinning, a second pass is needed for the two-layer factorization math.
- How to read it
- Focus on what the two layers represent and how they replace a single sparse weight set; a second pass pays off for the decomposition formulation and the comparison setup.
Skinning
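One way to picture what the two layers represent is a factorization of the dense weight matrix through an intermediate set of "virtual" bones. The sketch below assumes that reading: a sparse matrix `D` blends master bone transforms into virtual bones, and a sparse matrix `A` blends virtual bones per vertex. The names `D`, `A`, and `two_layer_blend`, and the specific sparsity pattern, are illustrative assumptions rather than the paper's notation.

```python
import numpy as np

def two_layer_blend(bone_mats, D, A):
    """Blend per-vertex transforms through a layer of virtual bones.

    bone_mats: (B, 3, 4) master bone transforms
    D:         (V, B) sparse weights, virtual bone <- master bones
    A:         (N, V) sparse weights, vertex <- virtual bones
    Returns (N, 3, 4) blended transforms, one per vertex.
    """
    virtual = np.tensordot(D, bone_mats, axes=1)   # (V, 3, 4)
    return np.tensordot(A, virtual, axes=1)        # (N, 3, 4)

rng = np.random.default_rng(1)
bone_mats = rng.normal(size=(4, 3, 4))             # 4 master bones

D = np.array([[0.5, 0.5, 0.0, 0.0],                # 3 virtual bones,
              [0.0, 0.3, 0.7, 0.0],                # 2 non-zeros each
              [0.0, 0.0, 0.4, 0.6]])
A = np.array([[1.0, 0.00, 0.00],                   # 5 vertices,
              [0.5, 0.50, 0.00],                   # at most 2 non-zeros each
              [0.0, 1.00, 0.00],
              [0.0, 0.25, 0.75],
              [0.0, 0.00, 1.00]])

two_layer = two_layer_blend(bone_mats, D, A)
W = A @ D                                          # equivalent dense weights
dense = np.tensordot(W, bone_mats, axes=1)
```

Here the second vertex ends up influenced by three master bones although it stores only two coefficients, which is the sense in which two sparse layers can stand in for a denser single-layer weight set. The per-vertex cost stays fixed, and the virtual bones are computed once per frame.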
2012
20-
Extends blendshape facial models with physics-based collision response so blendshape targets automatically account for contact deformation.
abstract
This paper presents a blendshape technique that augments linear shape interpolation with a physically based mass-spring system to overcome two limitations of linear blendshapes: degradation under large rotations and the inability to handle self collision or scene interaction. A mass-spring system is built for each blendshape target and initialized to its steady state, after which the spring rest lengths are linearly interpolated and the interpolated shape is computed as the equilibrium of the resulting system. Because the equilibrium minimizes local area and volume distortion, the method yields physically plausible deformations even across large rotations between targets. The mass-spring formulation also enables physical interaction with other scene elements through collision detection and handling, without requiring any precomputation, skeleton, or manual intervention.
Related Smooth Contact-Aware Facial Blendshapes Transfer · Pose Space Deformation: A Unified Approach to Shape Interpolation and Skeleton-Driven Deformation · Advances for Digital Humans in VFX Production at Goodbye Kansas Studios · Practice and Theory of Blendshape Facial Models
how to read this
- Category
- Method: a physics-augmented blendshape facial model
- Contributions
- Augments linear blendshape interpolation with a physically based mass-spring system to fix degradation under large rotations and the inability to handle collisions
- Builds a mass-spring system per blendshape target, interpolates spring rest lengths, and solves for the equilibrium shape (minimizing local area and volume distortion) for plausible deformation across large rotations
- Enables physical interaction with other scene elements via collision detection and handling, with no precomputation, skeleton, or manual intervention
- Context
- Builds on blendshape facial animation and direct-manipulation blendshapes (Lewis and Anjyo 2010), adding a physical equilibrium layer to address linear-blending artifacts. Builds on: Direct Manipulation Blendshapes
- Correctness
- Plausible deformation across large rotations and contact is the claim. Results are physically based but use a mass-spring (not full FEM) model, so realism is approximate, and quality depends on the per-target spring system and on the robustness of collision detection.
- Clarity
- Clearly motivated by two known blendshape failure modes; a first pass conveys the rest-length-interpolation-plus-equilibrium idea, with a second pass for the mass-spring setup and solve.
- How to read it
- First pass for why linear blends fail and how the equilibrium reformulation helps; second pass on the mass-spring construction and collision handling if extending or implementing it.
Facial / Skinning
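The rest-length interpolation idea in the entry above can be sketched on a toy example. This is a minimal illustration, not the paper's solver: it assumes a single 2D triangle with three unit-stiffness springs, and it finds the equilibrium by plain gradient descent. The names `EDGES`, `rest_lengths`, and `equilibrium` are hypothetical.

```python
import numpy as np

# Springs of a single triangle, given as pairs of vertex indices (toy assumption).
EDGES = [(0, 1), (1, 2), (2, 0)]

def rest_lengths(points):
    """Spring rest lengths measured on one blendshape target."""
    return np.array([np.linalg.norm(points[i] - points[j]) for i, j in EDGES])

def equilibrium(start, rest, steps=2000, rate=0.1):
    """Relax positions by gradient descent on 0.5 * sum((|xi - xj| - L)^2)."""
    x = start.copy()
    for _ in range(steps):
        grad = np.zeros_like(x)
        for (i, j), L in zip(EDGES, rest):
            d = x[i] - x[j]
            n = np.linalg.norm(d)
            g = (n - L) * d / n
            grad[i] += g
            grad[j] -= g
        x -= rate * grad
    return x

target_a = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
target_b = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 1.0]])
w = 0.5

# Interpolate rest lengths (not positions), then solve for the equilibrium shape.
blended_rest = (1 - w) * rest_lengths(target_a) + w * rest_lengths(target_b)
shape = equilibrium((1 - w) * target_a + w * target_b, blended_rest)
```

The linear blend of positions is used only as the starting guess; the relaxed `shape` has edge lengths that match the interpolated rest lengths, which the linear blend itself does not.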
-
Editing tool for blendshape facial rigs enabling composite operations, mixing and retargeting facial expressions for production characters.
Abstract
Proposes an interactive editing system for rapidly creating new blendshape face models by assembling facial features from multiple source characters. Uses gradient-domain blending and discrete Poisson equation solving to create smooth composite faces that preserve position relationships and regional detail variation from source characters.
Related Smooth Contact-Aware Facial Blendshapes Transfer · High-Quality Face Capture Using Anatomical Muscles · Accurate Face Rig Approximation with Deep Differential Subspace Reconstruction · Direct Manipulation Blendshapes
How to read this
- Category
- Method / tool: an interactive blendshape face composite editor
- Contributions
- An interactive editing system for rapidly creating new blendshape face models by assembling facial features from multiple source characters
- Uses gradient-domain blending and discrete Poisson equation solving to create smooth composite faces
- Preserves position relationships and regional detail variation from the source characters
- Context
- Relates to blendshape facial rigs and direct-manipulation blendshapes (Lewis and Anjyo 2010), applying gradient-domain (Poisson) image/mesh-editing techniques to facial composition. Builds on: Direct Manipulation Blendshapes
- Correctness
- Presented as a production-oriented editing tool (DigiPro); it assumes compatible blendshape source characters and relies on Poisson blending to merge features smoothly, so composite quality depends on source compatibility and is judged by editor usability rather than a quantitative benchmark.
- Clarity
- Accessible and tool-focused; a first pass conveys what the editor does and the gradient-domain approach, with a second pass only for the Poisson formulation.
- How to read it
- First pass for the composite-editing workflow and the gradient-domain blending idea; a brief second pass on the discrete Poisson solve if you need the blending details.
Facial
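The gradient-domain blending mentioned in the entry above can be illustrated in one dimension. This is a sketch under simplifying assumptions (a 1D signal instead of a face mesh, a dense solve, Dirichlet boundary values taken from the target); `poisson_blend_1d` is a hypothetical name and not the editor's API.

```python
import numpy as np

def poisson_blend_1d(target, source, lo, hi):
    """Replace target[lo:hi] so its second differences match the source's,
    while target[lo - 1] and target[hi] stay fixed as boundary values.
    Requires 1 <= lo < hi <= len(target) - 1."""
    n = hi - lo
    A = np.zeros((n, n))
    b = np.zeros(n)
    for k in range(n):
        i = lo + k
        A[k, k] = -2.0
        # Right-hand side: discrete Laplacian of the source signal.
        b[k] = source[i - 1] - 2.0 * source[i] + source[i + 1]
        if k > 0:
            A[k, k - 1] = 1.0
        else:
            b[k] -= target[i - 1]   # known boundary value on the left
        if k < n - 1:
            A[k, k + 1] = 1.0
        else:
            b[k] -= target[i + 1]   # known boundary value on the right
    out = np.asarray(target, dtype=float).copy()
    out[lo:hi] = np.linalg.solve(A, b)
    return out

target = np.zeros(8)
source = np.array([0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0]) + 5.0
blended = poisson_blend_1d(target, source, 1, 7)
```

The source's constant offset of 5 disappears in the result because only its differences are transferred, which is why this kind of blend joins regions without a visible seam.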
-
Adaptive cloth remeshing system adding resolution where needed for fine wrinkles while coarsening in flat regions for efficiency.
Abstract
We present a technique for cloth simulation that dynamically refines and coarsens triangle meshes so that they automatically conform to the geometric and dynamic detail of the simulated cloth. Our technique produces anisotropic meshes that adapt to surface curvature and velocity gradients, allowing efficient modeling of wrinkles and waves. By anticipating buckling and wrinkle formation, our technique preserves fine-scale dynamic behavior. Our algorithm for adaptive anisotropic remeshing is simple to implement, takes up only a small fraction of the total simulation time, and provides substantial computational speedup without compromising the fidelity of the simulation. We also introduce a novel technique for strain limiting by posing it as a nonlinear optimization problem. This formulation works for arbitrary non-uniform and anisotropic meshes, and converges more rapidly than existing solvers based on Jacobi or Gauss-Seidel iterations.
Related Continuum-based Strain Limiting · Efficient Simulation of Inextensible Cloth · Multi-Resolution Isotropic Strain Limiting · Efficient Simulation of Example-Based Materials
How to read this
- Category
- Method: an adaptive cloth simulation / remeshing algorithm
- Contributions
- Dynamic anisotropic remeshing that refines and coarsens triangle meshes to match surface curvature and velocity gradients
- Anticipation of buckling and wrinkle formation to preserve fine-scale dynamic behavior at low cost
- A strain-limiting technique posed as a nonlinear optimization that handles non-uniform anisotropic meshes and converges faster than Jacobi or Gauss-Seidel solvers
- Context
- Sits in the implicit cloth-simulation lineage of Baraff and Witkin's Large Steps in Cloth Simulation, adding adaptive spatial resolution and a reformulated strain limiter on top of that style of solver. Builds on: Large Steps in Cloth Simulation
- Correctness
- The claimed speedup rests on the assumption that detail can be localized (resolution added only where curvature or velocity gradients warrant), so a reader should keep in mind that fidelity depends on the remeshing criteria anticipating wrinkles correctly rather than reacting after the fact.
- Clarity
- Accessible at a high level; a first pass conveys the adapt-where-needed idea, but a second pass is needed for the remeshing criteria and the strain-limiting optimization.
- How to read it
- First pass for the core idea (anisotropic adapt plus optimization-based strain limiting); do a focused second pass on the sizing/metric criteria and the nonlinear strain-limiting formulation if you intend to implement it.
CFX
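For orientation, the kind of iterative strain limiter that the abstract above compares against can be sketched as sequential edge projection. This is an assumed Gauss-Seidel-style baseline, not the paper's nonlinear optimization; `limit_strain` and its parameters are hypothetical.

```python
import numpy as np

def limit_strain(x, edges, rest, max_strain=0.1, sweeps=50):
    """Sweep over edges in order, projecting each one back into the allowed
    length range [(1 - s) * L, (1 + s) * L] by moving both endpoints equally."""
    x = x.copy()
    for _ in range(sweeps):
        for (i, j), L in zip(edges, rest):
            d = x[j] - x[i]
            n = np.linalg.norm(d)
            clamped = np.clip(n, (1 - max_strain) * L, (1 + max_strain) * L)
            corr = 0.5 * (n - clamped) * d / n
            x[i] += corr   # fixing one edge disturbs its neighbours,
            x[j] -= corr   # hence the repeated sweeps
    return x

points = np.array([[0.0, 0.0], [2.0, 0.0], [4.0, 0.0]])
edges = [(0, 1), (1, 2)]
rest = [1.0, 1.0]
limited = limit_strain(points, edges, rest)
```

Each projection is local, so corrections propagate only one edge per step; that slow propagation is the behavior a global optimization formulation aims to improve on.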
-
DICE covers animation philosophy and production challenges for Battlefield 3, addressing single and multiplayer pipeline issues and solutions for their largest and most complex title.
Retargeting
-
Automatically creates skeletons, rigid skin weights, and joint mappings for multi-component 3D characters from shape databases.
Abstract
Rigging an arbitrary 3D character by creating an animation skeleton is a time‐consuming process even for experienced animators. In this paper, we present an algorithm that automatically creates animation rigs for multi‐component 3D models, as they are typically found in online shape databases. Our algorithm takes as input a multi‐component model and an input animation skeleton with associated motion data. It then creates a target skeleton for the input model, calculates the rigid skinning weights, and a mapping between the joints of the target skeleton and the input animation skeleton. The automatic approach does not need additional semantic information, such as component labels or user‐provided correspondences, and succeeds on a wide range of models where the number of components is significantly different. It implicitly handles large scale and proportional differences between input and target skeletons and can deal with certain morphological differences, e.g., if input and target have different numbers of limbs. The output of our algorithm can be directly used in a retargeting system to create a plausible animated character.
Related Geodesic Voxel Binding for Production Character Meshes · Sparse Rig Parameter Optimization for Character Animation · Automatic Rigging and Animation of 3D Characters · MoRig: Motion-Aware Rigging of Character Meshes from Point Clouds
How to read this
- Category
- Method: an automatic rigging algorithm for multi-component characters
- Contributions
- Automatically builds a target animation skeleton for a multi-component model from an input skeleton plus motion data
- Computes rigid skinning weights and a joint mapping between target and input skeletons without component labels or user correspondences
- Handles large scale, proportional, and certain morphological differences (for example differing numbers of limbs) so output feeds directly into retargeting
- Context
- Builds on the auto-rigging line started by Baran and Popovic's Automatic Rigging and Animation of 3D Characters, extending it to disconnected multi-component models typical of online shape databases. Builds on: Automatic Rigging and Animation of 3D Characters
- Correctness
- Demonstrated on a range of database models with differing component counts; note that it produces rigid skinning weights (not smooth deformation) and assumes an input skeleton with motion is available, so quality of the final animation leans on the supplied skeleton and the retargeting stage.
- Clarity
- Reasonably accessible; a first pass conveys the pipeline (skeleton, weights, mapping), with a second pass needed for the correspondence and mapping details.
- How to read it
- First pass to grasp the three-stage pipeline; second pass on how correspondences and the joint mapping are inferred without semantic labels, the load-bearing part for handling morphological differences.
Rigging
-
Generalizes harmonic coordinates by interpolating boundary derivatives, yielding smoother 2D cage deformations with fewer global distortions.
Abstract
Barycentric coordinates are an established mathematical tool in computer graphics and geometry processing, providing a convenient way of interpolating scalar or vector data from the boundary of a planar domain to its interior. Many different recipes for barycentric coordinates exist, some offering the convenience of a closed‐form expression, some providing other desirable properties at the expense of longer computation times. For example, harmonic coordinates, which are solutions to the Laplace equation, provide a long list of desirable properties (making them suitable for a wide range of applications), but lack a closed‐form expression. We derive a new type of barycentric coordinates based on solutions to the biharmonic equation. These coordinates can be considered a natural generalization of harmonic coordinates, with the additional ability to interpolate boundary derivative data. We provide an efficient and accurate way to numerically compute the biharmonic coordinates and demonstrate their advantages over existing schemes. We show that biharmonic coordinates are especially appealing for (but not limited to) 2D shape and image deformation and have clear advantages over existing deformation methods.
Related Somigliana Coordinates: an Elasticity-Derived Approach for Cage Deformation · Mean Value Coordinates for Closed Triangular Meshes · Green Coordinates · Green Coordinates for Triquad Cages in 3D
How to read this
- Category
- Method: a new family of barycentric (cage) coordinates
- Contributions
- Derives biharmonic coordinates as solutions to the biharmonic equation, a natural generalization of harmonic coordinates
- Adds the ability to interpolate boundary derivative data, not just boundary values
- Provides an efficient, accurate numerical scheme to compute them and shows advantages for 2D shape and image deformation
- Context
- Generalizes harmonic coordinates and relates to the bounded-biharmonic-weights work of Jacobson et al., extending the barycentric-coordinate toolkit used for cage-based deformation. Builds on: Bounded Biharmonic Weights for Real-Time Deformation
- Correctness
- Demonstrated primarily on 2D shape and image deformation; like harmonic coordinates the scheme lacks a closed-form expression and requires numerical solution, and the abstract scopes the benefits to (though not strictly limited to) the planar case, so 3D generality should not be assumed from this read.
- Clarity
- Mathematically dense but well-motivated; a first pass conveys what the coordinates buy you, a second pass is needed to follow the biharmonic derivation.
- How to read it
- First pass to understand why interpolating boundary derivatives matters; budget a careful second/third pass on the derivation and numerical computation if you need to implement or compare against harmonic coordinates.
Skinning
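A one-dimensional analogue may help with the harmonic versus biharmonic distinction in the entry above. On the interval [0, 1], solutions of u'' = 0 are linear interpolants of the two boundary values, while solutions of u'''' = 0 are cubics that can also match the two boundary derivatives (the cubic Hermite interpolant). This is only an analogy to the planar coordinates in the paper; the function names are hypothetical.

```python
import numpy as np

def harmonic_1d(t, u0, u1):
    """Solution of u'' = 0 on [0, 1]: linear interpolation of boundary values."""
    return (1 - t) * u0 + t * u1

def biharmonic_1d(t, u0, u1, du0, du1):
    """Solution of u'''' = 0 on [0, 1]: the cubic Hermite interpolant, matching
    boundary values (u0, u1) and boundary derivatives (du0, du1)."""
    h00 = 2 * t**3 - 3 * t**2 + 1
    h10 = t**3 - 2 * t**2 + t
    h01 = -2 * t**3 + 3 * t**2
    h11 = t**3 - t**2
    return h00 * u0 + h10 * du0 + h01 * u1 + h11 * du1

t = np.linspace(0.0, 1.0, 11)
# Same boundary values as the linear case, but with zero slope at both ends.
flat_ends = biharmonic_1d(t, 0.0, 1.0, 0.0, 0.0)
```

When the prescribed derivatives equal the slope of the straight line, the biharmonic solution reduces to the harmonic one, which matches the description of biharmonic coordinates as a generalization.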
-
Measures complex 3D cloth deformation from physical samples and fits nonlinear simulation model parameters to the observations.
Abstract
Progress in cloth simulation for computer animation and apparel design has led to a multitude of deformation models, each with its own way of relating geometry, deformation, and forces. As simulators improve, differences between these models become more important, but it is difficult to choose a model and a set of parameters to match a given real material simply by looking at simulation results. This paper provides measurement and fitting methods that allow nonlinear models to be fit to the observed deformation of a particular cloth sample. Unlike standard textile testing, our system measures complex 3D deformations of a sheet of cloth, not just one‐dimensional force‐displacement curves, so it works under a wider range of deformation conditions. The fitted models are then evaluated by comparison to measured deformations with motions very different from those used for fitting.
Related Directing Cloth Draping through Blended UVs · Untangling Cloth · Art-Directed Costumes at Pixar: Design, Tailoring, and Simulation in Production · Fast Cloth Simulation on Moving Humanoids
How to read this
- Category
- Method: data-driven measurement and parameter fitting for cloth models
- Contributions
- A measurement system that captures complex 3D deformations of a real cloth sheet rather than only 1D force-displacement curves
- Fitting methods that estimate nonlinear cloth-model parameters from those observations
- Validation by comparing fitted models against measured deformations under motions different from the fitting data
- Context
- Addresses parameterization of the deformation models used by implicit cloth simulators in the Baraff and Witkin Large Steps lineage, connecting measured real cloth to simulation parameters. Builds on: Large Steps in Cloth Simulation
- Correctness
- Because it measures full 3D deformation over a wider range of conditions than standard textile testing, it is better posed than 1D fitting; the key caveat is that fidelity depends on the chosen nonlinear model's expressiveness and on how representative the captured deformations are of target motions.
- Clarity
- Accessible in motivation; a first pass conveys the measure-then-fit idea, with a second pass needed for the optimization and the specific deformation model fitted.
- How to read it
- First pass for the capture-and-fit concept and validation protocol; second pass on the parameterization and the fitting objective if you plan to calibrate a simulator to real material.
CFX
-
Example-based material simulation system enabling artists to specify target deformation examples that guide realistic elastic behavior.
Abstract
We present a method for efficiently simulating art-directable example-based deformable materials, where an artist supplies a set of example poses that define a subspace of desirable deformations via linear interpolation. The central idea is to represent both input examples and interpolated poses using an incompatible representation of disconnected elements, expressed in a rotation-free reference frame, which lets us interpolate elements individually and bypass the costly geometry reconstruction required by prior example-based elastic materials. We extend the approach to thin shells using edge lengths and dihedral angles, and introduce a formulation of example-based plasticity for both solids and shells. We further explore control mechanisms such as explicit weight control and velocity-dependent activation of example poses.
Related Dynamic Deformables: Implementation and Production Practicalities · Subspace Clothing Simulation Using Adaptive Bases · Directing Cloth Draping through Blended UVs · Multi-Resolution Isotropic Strain Limiting
How to read this
- Category
- Method: example-based (art-directable) deformable-material simulation
- Contributions
- Represents input examples and interpolated poses with disconnected elements in a rotation-free frame, avoiding costly geometry reconstruction
- Extends the approach to thin shells via edge lengths and dihedral angles, plus an example-based plasticity formulation for solids and shells
- Adds control mechanisms such as explicit weight control and velocity-dependent activation of example poses
- Context
- Builds on example-based elastic material methods, recasting the representation to make interpolation cheaper; relates generally to art-directed deformation and elastic-energy simulation.
- Correctness
- The efficiency gain hinges on the incompatible disconnected-element representation that bypasses reconstruction, and behavior is defined by a subspace of artist examples via linear interpolation, so plausibility outside the supplied example span is not guaranteed and remains an art-direction choice.
- Clarity
- Conceptually clear with a strong central idea; a first pass conveys the disconnected-element trick, a second pass is needed for the shell and plasticity formulations.
- How to read it
- First pass for the representation idea and why it avoids reconstruction; second pass on the shell extension and plasticity if you work on directable elastic materials.
CFX / Skinning
-
Optimizes LBS and DQS weights to minimize elastic energy and adds a joint-based deformer so that closed-form skinning approaches the look of nonlinear elastic deformation.
Abstract
Current approaches to skeletally-controlled character articulation range from real-time, closed-form skinning methods to offline, physically-based simulation. In this paper, we seek a closed-form skinning method that approximates nonlinear elastic deformations well while remaining very fast. Our contribution is two-fold: (1) we optimize skinning weights for the standard linear and dual quaternion skinning techniques so that the resulting deformations minimize an elastic energy function. We observe that this is not sufficient to match the visual quality of the original elastic deformations and therefore, we develop (2) a new skinning method based on the concept of joint-based deformers. We propose a specific deformer which is visually similar to nonlinear variational deformation methods. Our final algorithm is fully automatic and requires little or no input from the user other than a rest-pose mesh and a skeleton. The runtime complexity requires minimal memory and computational overheads compared to linear blend skinning, while producing higher quality deformations than both linear and dual quaternion skinning.
Related Skinning: Real-time Shape Deformation · Stretchable and Twistable Bones for Skeletal Shape Deformation · Direct Delta Mush Skinning and Variants · Real-Time Skeletal Skinning with Optimized Centers of Rotation
How to read this
- Category
- Method: a skinning algorithm approximating elastic deformation
- Contributions
- Optimizes linear-blend and dual-quaternion skinning weights so deformations minimize an elastic energy
- Introduces a joint-based deformer that visually approximates nonlinear variational deformation methods
- Fully automatic from a rest-pose mesh and skeleton, with memory and runtime overhead close to linear blend skinning
- Context
- Builds on dual-quaternion skinning (Kavan et al.) and closed-form skinning broadly, seeking to close the quality gap toward offline physically-based elastic simulation. Builds on: Skinning with Dual Quaternions
- Correctness
- The authors themselves note that merely optimizing LBS/DQS weights does not match elastic quality, which motivates the new deformer; the trade-off is an approximation of nonlinear elasticity, so it should be read as a fast closed-form stand-in, not a substitute for full simulation.
- Clarity
- Accessible and well-scoped; a first pass conveys both contributions, a second pass is needed for the energy optimization and deformer construction.
- How to read it
- First pass to understand the two contributions and why weight optimization alone falls short; second pass on the joint-based deformer formulation if you need real-time quality near simulation.
Skinning
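As background for the entry above, standard linear blend skinning, the baseline whose weights the paper optimizes, can be written in a few lines. This sketch assumes 3x4 affine bone transforms and per-vertex weights that sum to one; the function name and array layout are illustrative choices, not taken from the paper.

```python
import numpy as np

def linear_blend_skinning(rest, weights, transforms):
    """rest: (V, 3) rest positions, weights: (V, B) with rows summing to 1,
    transforms: (B, 3, 4) affine bone transforms. Returns (V, 3) positions."""
    homo = np.hstack([rest, np.ones((len(rest), 1))])        # (V, 4)
    per_bone = np.einsum('bij,vj->vbi', transforms, homo)    # (V, B, 3)
    return np.einsum('vb,vbi->vi', weights, per_bone)        # weighted blend

identity = np.hstack([np.eye(3), np.zeros((3, 1))])
# Second bone twisted 180 degrees about the x axis.
twist = np.hstack([np.diag([1.0, -1.0, -1.0]), np.zeros((3, 1))])

rest = np.array([[1.0, 1.0, 0.0]])
weights = np.array([[0.5, 0.5]])
skinned = linear_blend_skinning(rest, weights, np.stack([identity, twist]))
```

With equal weights on the two bones, the vertex at distance 1 from the twist axis lands on the axis itself. This collapse under large twists is a well-known linear blending artifact and part of the motivation for better closed-form deformers.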
-
Fast as-rigid-as-possible optimization over a skinning weight subspace that infers linear blend skinning transformations from sparse user-provided handles.
Abstract
This paper presents a method to automatically determine 2D and 3D skinning transformations from a sparse set of user-specified controls, inferring the remaining degrees of freedom by minimizing nonlinear as-rigid-as-possible elastic energies. By expressing the energy purely in terms of handle transformations and clustering vertices in weight space into representative rotation groups, the algorithm runs orders of magnitude faster than previous methods while preserving deformation quality. The approach enables new modes of control such as shape-aware inverse kinematics and disconnected skeletons, and can enrich the deformation subspace with automatically generated abstract handle weights. Skinning transformations for one hundred armadillos with 86k triangles each are computed at 30fps on a single CPU core.
Related Direct Delta Mush Skinning Compression with Continuous Examples · Bounded Biharmonic Weights for Real-Time Deformation · Smooth Skinning Decomposition with Rigid Bones · Efficient Dynamic Skinning with Low-Rank Helper Bone Controllers
How to read this
- Category
- Method: automatic skinning-transformation solver from sparse handles
- Contributions
- Determines 2D and 3D skinning transformations from sparse user controls by minimizing as-rigid-as-possible elastic energies
- Expresses the energy purely in handle transformations and clusters vertices in weight space, running orders of magnitude faster than prior methods
- Enables new control modes such as shape-aware inverse kinematics and disconnected skeletons
- Context
- Builds directly on Jacobson et al.'s Bounded Biharmonic Weights, solving for the handle transformations over that weight space instead of supplying them manually. Builds on: Bounded Biharmonic Weights for Real-Time Deformation
- Correctness
- Speed comes from reducing the problem to handle transformations and clustering vertices into representative rotation groups, an approximation; quality therefore depends on the clustering and on the bounded-biharmonic weight basis it assumes, and the reported real-time rates are for the demonstrated meshes.
- Clarity
- Clearly written with concrete results; a first pass conveys the reformulation and its payoff, a second pass is needed for the rotation-clustering and optimization details.
- How to read it
- First pass for the reduce-to-handles idea and the new control modes; second pass on the weight-space clustering and ARAP minimization if you want the performance, since that is where the speedup lives.
Skinning
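As-rigid-as-possible energies like the one in the entry above are commonly minimized by alternating between fitting rotations and solving for positions. The rotation-fitting step can be sketched with the standard SVD-based (Kabsch) solution; treating one point set as one rotation group is an assumption made here for illustration, and `fit_rotation` is a hypothetical name.

```python
import numpy as np

def fit_rotation(rest, deformed):
    """Rotation R minimizing sum |R p_i - q_i|^2 over centered point sets.
    rest, deformed: (N, 3) arrays of corresponding points."""
    p = rest - rest.mean(axis=0)
    q = deformed - deformed.mean(axis=0)
    u, _, vt = np.linalg.svd(p.T @ q)            # covariance sum p_i q_i^T
    d = np.sign(np.linalg.det(vt.T @ u.T))       # guard against reflections
    return vt.T @ np.diag([1.0, 1.0, d]) @ u.T

theta = 0.5
rot_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0,            0.0,           1.0]])
rest_pts = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0], [1.0, 1.0, 0.0]])
moved_pts = rest_pts @ rot_z.T + np.array([0.3, -0.2, 1.0])
recovered = fit_rotation(rest_pts, moved_pts)
```

Fitting one rotation per group of vertices rather than per vertex means fewer of these small SVDs per iteration, which is consistent with the clustering described in the abstract.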
- FEM Simulation of 3D Deformable Solids: A Practitioner's Guide to Theory, Discretization and Model Reduction Course 391 cites
SIGGRAPH course notes giving a practical, self-contained introduction to finite element simulation of 3D deformable solids, from the deformation gradient and constitutive models through numerical solvers and model reduction.
Abstract
These SIGGRAPH course notes give a practical, self-contained introduction to finite element method simulation of 3D deformable solids, covering the deformation gradient, strain and stress measures, elastic energy, and corotational and isotropic hyperelastic constitutive models. They then survey the discretization and numerical solvers used in practice, including conjugate gradients and multigrid, along with invertible element treatment. A second part presents model reduction techniques for real-time simulation such as linear modal analysis, modal warping, subspace integration, and domain decomposition.
Related A Unified Approach for Subspace Simulation of Deformable Bodies in Multiple Domains · Rig-Space Physics · Dynamic Deformables: Implementation and Production Practicalities · Nonlinear Cloth Simulation with Isogeometric Analysis
How to read this
- Category
- Course notes / tutorial on FEM for deformable solids
- Contributions
- A self-contained introduction to FEM for 3D deformable solids: deformation gradient, strain and stress measures, and elastic energy
- Coverage of corotational and isotropic hyperelastic constitutive models, discretization, and solvers including conjugate gradients, multigrid, and invertible-element handling
- A second part on model reduction for real-time simulation (linear modal analysis, modal warping, subspace integration, domain decomposition)
- Context
- Synthesizes the deformable-solids tradition rooted in Terzopoulos et al.'s Elastically Deformable Models into a practitioner-oriented reference rather than presenting a new result. Builds on: Elastically Deformable Models
- Correctness
- As tutorial notes the content is expository and consolidated from established theory, so there are no new claims to validate; the main caveat for a reader is that it presents standard models and solvers and is a starting point, not the latest state of the art.
- Clarity
- Designed for accessibility and self-contained study; a first pass orients you, and the real value comes from working through it section by section.
- How to read it
- Treat as a reference, not a one-pass read: skim the structure first, then study the theory part for fundamentals and the model-reduction part separately when you need real-time techniques.
CFX
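The deformation gradient and strain measure that the course notes above start from can be computed for a single linear tetrahedron as follows. This is a minimal sketch using the usual edge-matrix construction; the names `Dm`, `Ds`, and both function names are illustrative.

```python
import numpy as np

def deformation_gradient(rest, deformed):
    """F = Ds @ inv(Dm) for one tetrahedron; inputs are (4, 3) vertex arrays."""
    Dm = (rest[1:] - rest[0]).T          # rest-shape edge vectors as columns
    Ds = (deformed[1:] - deformed[0]).T  # deformed edge vectors as columns
    return Ds @ np.linalg.inv(Dm)

def green_strain(F):
    """Green strain E = 0.5 * (F^T F - I), which vanishes for pure rotations."""
    return 0.5 * (F.T @ F - np.eye(3))

tet = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
stretched = tet * np.array([2.0, 1.0, 1.0])   # double the length along x
F = deformation_gradient(tet, stretched)
E = green_strain(F)
```

For the stretch along x, `F` is diag(2, 1, 1) and the only nonzero strain entry is E[0, 0] = 1.5; a rigidly rotated copy of the tetrahedron gives zero Green strain.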
-
Multithreaded dependency graph architecture for character rig evaluation, enabling scalable parallel computation of complex DreamWorks rigs.
Abstract
Presents LibEE, a multithreaded dependency graph evaluation engine for character animation built from the ground up for multicore CPU architectures. The system enables concurrent evaluation of multiple graph nodes while supporting internal node parallelism, achieving over one order of magnitude speedup. Addresses production challenges including thread-safety classification, visualization tools for graph optimization, and restructuring of character systems to exploit graph-level parallelism.
Related Premo: Powerful Character Rigging, Fast Animation · ChopRig System · Achieving and Maintaining Real-Time Rigs · LibEE 2: Enabling Fast Edits and Evaluation
How to read this
- Category
- Systems / production paper: a multithreaded rig-evaluation engine
- Contributions
- LibEE, a dependency-graph evaluation engine for character animation built for multicore CPUs
- Concurrent evaluation of multiple graph nodes plus internal node parallelism, reported as over an order of magnitude speedup
- Production practices for thread-safety classification, graph-optimization visualization, and restructuring rigs to expose graph-level parallelism
- Context
- Relates to the dependency-graph architectures underlying character rigs in animation packages, reworked for parallel evaluation in a feature-film (DreamWorks) pipeline.
- Correctness
- This is a production-engineering paper, so results are pipeline-proven on real DreamWorks rigs rather than a controlled benchmark; the reported speedup depends on rigs being restructured to expose parallelism and on correct thread-safety classification, which a reader should treat as prerequisites, not free wins.
- Clarity
- Accessible and practically framed; a first pass conveys the architecture and the engineering lessons, with deeper passes for the parallelism and thread-safety specifics.
- How to read it
- First pass for the architecture and the production lessons (thread-safety, restructuring rigs); a second pass pays off mainly if you build or parallelize a rig-evaluation system yourself.
Rigging
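The graph-level parallelism described in the entry above can be sketched as follows: nodes whose inputs are all available are evaluated concurrently, wave by wave. This is a toy scheduler built on the Python standard library, not LibEE's design; the `evaluate` function and the dictionary-based graph format are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def evaluate(graph, funcs, workers=4):
    """graph: node -> list of input nodes; funcs: node -> callable taking the
    input values in order. Returns a dict of node -> computed value."""
    values = {}
    pending = dict(graph)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while pending:
            # Every node whose inputs are all computed can run in this wave.
            ready = [n for n, deps in pending.items()
                     if all(d in values for d in deps)]
            if not ready:
                raise ValueError("cycle or missing input in graph")
            futures = {n: pool.submit(funcs[n], *[values[d] for d in pending[n]])
                       for n in ready}
            for n, fut in futures.items():
                values[n] = fut.result()
                del pending[n]
    return values

graph = {"a": [], "b": [], "sum": ["a", "b"], "double": ["sum"]}
funcs = {"a": lambda: 2, "b": lambda: 3,
         "sum": lambda x, y: x + y, "double": lambda s: 2 * s}
result = evaluate(graph, funcs)
```

Here `a` and `b` run in the same wave while `sum` and `double` must wait, which shows why rigs need to be restructured to expose independent branches. In CPython the threads mainly illustrate the scheduling; pure-Python node functions would not see a real speedup because of the global interpreter lock.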
-
Multi-camera performance capture system producing high-quality facial geometry and appearance for direct use in film production.
Facial
-
EA DICE built a modular deformation rig combined with the ANT animation system and motion capture to raise character quality across Battlefield 3's large asset volume.
Rigging / Skinning
-
Fabrication-oriented method for cloning the physical mechanics of a human face into a robotic or animatronic face system.
Abstract
This paper presents a complete process for designing, simulating, and fabricating synthetic skin for an animatronics character that mimics a given subject's face and expressions. The process begins by measuring the elastic properties of a silicone material used to manufacture synthetic soft tissue, then captures 3D facial expressions of a target subject through performance capture. A physics-based optimization scheme, built on a neo-Hookean finite element model, jointly determines the thickness of the synthetic skin and the actuation parameters of the robotic base that best match the target expressions. The method is validated by physically cloning a real human face onto an animatronic figure via injection molding.
Related Fully Automatic Generation of Anatomical Face Simulation Models · Art-Directed Muscle Simulation for High-End Facial Animation · Building Accurate Physics-based Face Models from Data · Automatic Determination of Facial Muscle Activations from Sparse Motion Capture Marker Data
How to read this
- Category
- Method / fabrication pipeline: cloning a face into animatronic skin
- Contributions
- An end-to-end process to design, simulate, and fabricate synthetic silicone skin for an animatronic character matching a subject's face
- Measurement of the silicone's elastic properties plus performance capture of the target subject's facial expressions
- A neo-Hookean FEM physics-based optimization that jointly solves for skin thickness and robotic actuation parameters, validated by physically cloning a real face
- Context
- Combines facial performance capture with physically-based FEM simulation of soft tissue, applied to the fabrication/animatronics setting rather than pure on-screen rendering.
- Correctness
- Validated by actually fabricating an animatronic face via injection molding, which is strong evidence; the results are bounded by the neo-Hookean material model, the captured expression set, and the actuation hardware, so matching is approximate and constrained by what the physical base can actuate.
- Clarity
- Accessible as a pipeline narrative; a first pass conveys the design-simulate-fabricate loop, a second pass is needed for the FEM optimization formulation.
- How to read it
- First pass for the full pipeline and the joint thickness/actuation optimization idea; second pass on the neo-Hookean FEM and the optimization setup if you care about the physics or fabrication side.
Facial / Muscles
-
Physics-based facial rig coupling blendshape controls with a finite-element tissue model for artist-directed, anatomically plausible animation.
Facial / Muscles
-
Runs physical simulation directly in the space of an artist rig's parameters, so secondary motion lands on animator-friendly controls.
Abstract
We present a method that brings the benefits of physics-based simulations to traditional animation pipelines. We formulate the equations of motion in the subspace of deformations defined by an animator's rig. Our framework fits seamlessly into the workflow typically employed by artists, as our output consists of animation curves that are identical in nature to the result of manual keyframing. Artists can therefore explore the full spectrum between handcrafted animation and unrestricted physical simulation. To enhance the artist's control, we provide a method that transforms stiffness values defined on rig parameters to a non-homogeneous distribution of material parameters for the underlying FEM model. In addition, we use automatically extracted high-level rig parameters to intuitively edit the results of our simulations, and also to speed up computation. To demonstrate the effectiveness of our method, we create compelling results by adding rich physical motions to coarse input animations. In the absence of artist input, we create realistic passive motion directly in rig space.
Related A Unified Approach for Subspace Simulation of Deformable Bodies in Multiple Domains · Complementary Dynamics · FEM Simulation of 3D Deformable Solids: A Practitioner's Guide to Theory, Discretization and Model Reduction · How the Rig Design Impacts the Animation Process
how to read this ▾ how to read this ▴
- Category
- Method: physics simulation in rig subspace
- Contributions
- Formulates the equations of motion directly in the subspace of deformations spanned by an animator's rig
- Outputs standard animation curves, letting artists blend freely between keyframed and fully simulated motion
- Maps rig-parameter stiffness values onto a non-homogeneous FEM material distribution, and uses high-level rig parameters for intuitive editing and speedup
- Context
- Bridges physics-based FEM simulation and traditional keyframe rigging, projecting dynamics into the animator's control space rather than the full mesh.
- Correctness
- Demonstrated by adding rich secondary motion to coarse input animations and producing passive motion in the absence of artist input; results live entirely within the rig's expressive range, so motion the rig cannot represent stays out of reach.
- Clarity
- Accessible in intent; a first pass conveys the rig-space idea, a second pass is needed for the subspace equations of motion and the stiffness-to-material mapping.
- How to read it
- Focus first on the rig-space formulation and the artist-control story; do a second pass on the subspace EOM and stiffness transform if you intend to implement or evaluate the simulation.
Rigging / CFX
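The rig-space idea in the entry above can be made concrete with a very small sketch. It assumes a linear toy rig with a single control, where each vertex moves by a fixed `jacobian` vector per unit of that control; the paper's nonlinear rigs and FEM elastic energy are not reproduced. The names `simulate_rig_space`, `p_key`, `stiffness` and `damping` are hypothetical. Mass and gravity are projected onto the control, a rig-space stiffness pulls the control toward the animator's key, and the integrator returns one value per step, which reads as an animation curve for that control.

```python
def simulate_rig_space(jacobian, masses, gravity, p_key, stiffness, damping,
                       dt=0.01, steps=2000):
    """Integrate one rig parameter p under physics projected into rig space.

    jacobian: per-vertex (dx, dy) displacement per unit of p (linear toy rig).
    Returns the list of p values over time, i.e. a curve for that control.
    """
    # Reduced mass: J^T M J collapses to a scalar for a single parameter.
    m_r = sum(m * (jx * jx + jy * jy) for m, (jx, jy) in zip(masses, jacobian))
    # Reduced gravity force: J^T f with f_i = m_i * g.
    f_gravity = sum(m * (jx * gravity[0] + jy * gravity[1])
                    for m, (jx, jy) in zip(masses, jacobian))
    p, v = p_key, 0.0
    curve = [p]
    for _ in range(steps):
        # Rig-space stiffness pulls p back toward the animator's key value.
        force = f_gravity - stiffness * (p - p_key) - damping * v
        v += dt * force / m_r   # semi-implicit Euler: velocity first
        p += dt * v             # then position with the updated velocity
        curve.append(p)
    return curve
```

With a very large `stiffness` the curve stays on the key (handcrafted animation); with a small one the control sags under gravity (closer to free simulation), which is the spectrum the abstract describes.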
-
SSDR: automatically extracting bone transforms and weights from example animations, widely used for rig conversion and crowd pipelines.
abstract
This paper introduces the Smooth Skinning Decomposition with Rigid Bones (SSDR), an automated algorithm to extract the linear blend skinning (LBS) from a set of example poses. The SSDR model can effectively approximate the skin deformation of nearly articulated models as well as highly deformable models by a low number of rigid bones and a sparse, convex bone-vertex weight map. Formulated as a constrained optimization problem where the least squared error of the reconstructed vertices by LBS is minimized, the SSDR model can be solved by a block coordinate descent-based algorithm to iteratively update the weight map and the bone transformations. By employing the sparseness and convex constraints on the weight map, the SSDR model can be used for traditional skinning decomposition tasks such as animation compression and hardware-accelerated rendering. Moreover, by imposing the orthogonal constraints on the bone rotation matrices (rigid bones), the SSDR model can also be applied in motion editing, skeleton extraction, and collision detection tasks. Through qualitative and quantitative evaluations, we show the SSDR model can measurably outperform the state-of-the-art skinning decomposition schemes in terms of accuracy and applicability.
Related Two-Layer Sparse Compression of Dense-Weight Blend Skinning · Robust and Accurate Skeletal Rigging from Mesh Sequences · Fast Automatic Skinning Transformations · Direct Delta Mush Skinning Compression with Continuous Examples
how to read this
- Category
- Method: skinning decomposition algorithm
- Contributions
- SSDR: automatically extracts linear blend skinning (bone transforms and a weight map) from a set of example poses
- Solves a constrained least-squares reconstruction via block coordinate descent that alternately updates weights and bone transformations
- Enforces sparse, convex weights and orthogonal (rigid) bone rotations, enabling compression, hardware skinning, skeleton extraction, motion editing, and collision tasks
- Context
- Builds on pose-space and example-driven skinning lineage (Lewis et al. Pose Space Deformation, James and Twigg Skinning Mesh Animations), targeting an LBS-compatible decomposition. Builds on: Pose Space Deformation: A Unified Approach to Shape Interpolation and Skeleton-Driven Deformation · Skinning Mesh Animations
- Correctness
- Validated qualitatively and quantitatively against prior skinning-decomposition methods on articulated and highly deformable models; quality hinges on the number of bones chosen and how representative the example poses are, and block coordinate descent gives a local optimum.
- Clarity
- Accessible; a first pass conveys the LBS-fitting goal and the alternating solve, a second pass clarifies the constraint handling and convergence.
- How to read it
- Read for the optimization setup (objective, sparsity and rigidity constraints, alternating update); a second pass pays off if you plan to reimplement or tune bone count for a rig-conversion or crowd pipeline.
Skinning / ML Deformation
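One half of the alternating solve described in the entry above, the weight update under the convexity constraint, can be sketched for the smallest possible case. This is a toy under stated assumptions: a single 2D vertex, exactly two bones, and translation-only bone transforms, whereas SSDR fits rotations and translations for many bones and also enforces sparsity. `lbs_2bones` and `update_weight` are hypothetical names. With two bones the convex weights are `w` and `1 - w`, so the constrained least-squares problem is a 1D quadratic and clamping its minimizer to [0, 1] is exact.

```python
def lbs_2bones(rest, w, t_a, t_b):
    """Linear blend skinning of one 2D vertex by two translation-only bones."""
    ax, ay = rest[0] + t_a[0], rest[1] + t_a[1]
    bx, by = rest[0] + t_b[0], rest[1] + t_b[1]
    return (w * ax + (1.0 - w) * bx, w * ay + (1.0 - w) * by)


def update_weight(rest, targets, bones_a, bones_b):
    """Least-squares weight of bone A over all example poses, kept convex.

    Minimizes sum_p |w * a_p + (1 - w) * b_p - target_p|^2 subject to
    0 <= w <= 1, where a_p and b_p are the vertex moved rigidly by each bone.
    """
    num = den = 0.0
    for target, t_a, t_b in zip(targets, bones_a, bones_b):
        a = lbs_2bones(rest, 1.0, t_a, t_b)   # vertex fully bound to bone A
        b = lbs_2bones(rest, 0.0, t_a, t_b)   # vertex fully bound to bone B
        dx, dy = a[0] - b[0], a[1] - b[1]
        num += (target[0] - b[0]) * dx + (target[1] - b[1]) * dy
        den += dx * dx + dy * dy
    if den == 0.0:
        return 0.5  # both bones agree in every pose, so any convex weight fits
    return min(1.0, max(0.0, num / den))
```

In a full block coordinate descent this step would alternate with a rigid fit of each bone's transform while the weights are held fixed; that half is not shown here.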
-
Spacetime optimization framework for cloning facial expressions across blendshape rigs while preserving timing and dynamic feel.
abstract
The goal of a practical facial animation retargeting system is to reproduce the character of a source animation on a target face while providing room for additional creative control by the animator. This article presents a novel spacetime facial animation retargeting method for blendshape face models. Our approach starts from the basic principle that the source and target movements should be similar. By interpreting movement as the derivative of position with time, and adding suitable boundary conditions, we formulate the retargeting problem as a Poisson equation. Specified (e.g., neutral) expressions at the beginning and end of the animation as well as any user-specified constraints in the middle of the animation serve as boundary conditions. In addition, a model-specific prior is constructed to represent the plausible expression space of the target face during retargeting. A Bayesian formulation is then employed to produce target animation that is consistent with the source movements while satisfying the prior constraints. Since the preservation of temporal derivatives is the primary goal of the optimization, the retargeted motion preserves the rhythm and character of the source movement and is free of temporal jitter.
Related Artist Friendly Facial Animation Retargeting · A Facial Composite Editor for Blendshape Characters · Adult2Child: Motion Style Transfer Using CycleGANs · Fully Automatic Generation of Anatomical Face Simulation Models
how to read this
- Category
- Method: facial retargeting (spacetime optimization)
- Contributions
- Spacetime retargeting of facial animation onto blendshape rigs by matching movement (the time-derivative of position) rather than absolute pose
- Casts the problem as a Poisson equation with neutral end-poses and user constraints as boundary conditions
- Adds a model-specific plausible-expression prior in a Bayesian formulation, preserving the source's rhythm and character while leaving room for animator control
- Context
- Extends artist-friendly facial retargeting (Seol et al. 2011) by reformulating it as a temporal-derivative-preserving spacetime optimization. Builds on: Artist Friendly Facial Animation Retargeting
- Correctness
- Built on the principle that source and target movements should be similar; preserving temporal derivatives keeps timing and dynamic feel, but output realism depends on the target prior and on how well the source rig maps to the target blendshape basis.
- Clarity
- Conceptually clear (match motion, not pose); a first pass gives the idea, a second pass is needed for the Poisson and Bayesian formulation.
- How to read it
- First grasp the movement-preservation principle and the boundary-condition setup; a second pass on the Poisson equation and the prior pays off if facial retargeting fidelity is your concern.
Facial / Retargeting
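The "match movement, not pose" principle in the entry above reduces, for a single weight curve, to a 1D discrete Poisson problem. The sketch below is an illustration under stated assumptions, not the paper's system: one blendshape channel, source and target sharing the same weight range, no expression prior and no Bayesian term. `retarget_channel` is a hypothetical name. It minimizes the squared difference between target and source frame-to-frame changes with both end values pinned, which yields a tridiagonal (-1, 2, -1) system solved here with the Thomas algorithm.

```python
def retarget_channel(source, start, end):
    """Match the frame-to-frame movement of `source` while pinning both ends.

    Minimizes sum_t ((w[t+1] - w[t]) - (s[t+1] - s[t]))^2
    subject to w[0] = start and w[-1] = end.
    """
    n = len(source)
    if n < 3:
        raise ValueError("need at least one interior frame")
    m = n - 2  # interior unknowns w[1] .. w[n-2]
    # Right-hand side: negative discrete Laplacian of the source curve.
    rhs = [2.0 * source[t] - source[t - 1] - source[t + 1]
           for t in range(1, n - 1)]
    rhs[0] += start    # boundary conditions move to the right-hand side
    rhs[-1] += end
    # Thomas algorithm for the tridiagonal system with stencil (-1, 2, -1).
    c_prime = [0.0] * m
    d_prime = [0.0] * m
    c_prime[0] = -0.5
    d_prime[0] = rhs[0] / 2.0
    for i in range(1, m):
        denom = 2.0 + c_prime[i - 1]
        c_prime[i] = -1.0 / denom
        d_prime[i] = (rhs[i] + d_prime[i - 1]) / denom
    interior = [0.0] * m
    interior[-1] = d_prime[-1]
    for i in range(m - 2, -1, -1):
        interior[i] = d_prime[i] - c_prime[i] * interior[i + 1]
    return [start] + interior + [end]
```

Because only derivatives are matched, any mismatch at the boundaries is spread evenly over the clip as a slow drift rather than appearing as a pop on one frame, which is consistent with the entry's claim that timing and rhythm survive.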
-
Tiling algorithm that assembles deformable motion patches in space and time to synthesize dense crowds of closely interacting characters.
abstract
Proposes a tiling algorithm for creating dense crowds of interacting virtual characters using deformable motion patches. The method collects episodes of multiple characters and tiles them spatially and temporally to generate seamless multi-character animation with complex interactions like hand shaking and object carrying, using a combination of stochastic sampling and deterministic search.
Related Near-Optimal Character Animation with Continuous Control · DeepPhase: Periodic Autoencoders for Learning Motion Phase Manifolds · Physics-Based Character Controllers Using Conditional VAEs · Automated Extraction and Parameterization of Motions in Large Data Sets
how to read this
- Category
- Method: multi-character motion synthesis
- Contributions
- Tiles deformable motion patches spatially and temporally to synthesize dense crowds of interacting characters
- Collects episodes of multiple characters and assembles them into seamless multi-character animation
- Combines stochastic sampling with deterministic search to generate complex interactions such as hand shaking and object carrying
- Context
- Builds on the motion-graph lineage (Kovar et al. Motion Graphs), extending data-driven reassembly from single-character paths to tiled multi-character interaction patches. Builds on: Motion Graphs
- Correctness
- Demonstrated on crowds with close interactions; the result space is bounded by the captured episodes and the tileability of patches, so interactions outside the recorded repertoire will not appear and seam quality depends on patch compatibility.
- Clarity
- Accessible; a first pass conveys the tiling metaphor, a second pass clarifies the sampling-plus-search synthesis loop.
- How to read it
- Focus on what a motion patch is and how tiles compose without seams; a second pass on the sampling/search algorithm is worth it if you are building interactive crowds.
Motion Synthesis
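The sampling-plus-search loop in the entry above can be illustrated with a deliberately reduced analogue. The assumptions are significant: patches are abstract records with one entry label and one exit label, tiling happens along a single row rather than in space and time, and patches do not deform. `tile_row`, `is_seamless` and the patch names are hypothetical. Random sampling is tried first for variety; an exhaustive depth-first search is the deterministic fallback when sampling keeps hitting dead ends.

```python
import random


def is_seamless(patches, row):
    """True if every patch's exit label matches the next patch's entry label."""
    return all(patches[a][1] == patches[b][0] for a, b in zip(row, row[1:]))


def tile_row(patches, length, rng, tries=20):
    """Build a seamless row of `length` patches (length >= 1), or return None.

    patches: dict mapping a patch name to (entry_label, exit_label).
    """
    names = sorted(patches)
    # Stage 1: stochastic sampling. Abandon an attempt at the first dead end.
    for _ in range(tries):
        row = [rng.choice(names)]
        while len(row) < length:
            fits = [n for n in names if patches[n][0] == patches[row[-1]][1]]
            if not fits:
                break
            row.append(rng.choice(fits))
        if len(row) == length:
            return row

    # Stage 2: deterministic depth-first search with backtracking.
    def extend(row):
        if len(row) == length:
            return row
        for n in names:
            if not row or patches[n][0] == patches[row[-1]][1]:
                found = extend(row + [n])
                if found is not None:
                    return found
        return None

    return extend([])
```

A small usage example: with patches such as `{"walk": ("free", "free"), "reach": ("free", "hands"), "shake": ("hands", "free")}`, `tile_row(patches, 6, random.Random(0))` returns a row in which every handshake is preceded by a reach, the kind of constraint that patch connectivity encodes.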
2011
15
- A Hybrid Iterative Solver for Robustly Capturing Coulomb Friction in Hair Dynamics · SIGGRAPH Asia · Academic
Hybrid iterative solver robustly handling Coulomb friction in hair dynamics using a Signorini-Coulomb contact model.
abstract
Dry friction between hair fibers plays a major role in the collective hair dynamic behavior as it accounts for typical nonsmooth features such as stick-slip instabilities. However, due to the challenges posed by the modeling of nonsmooth friction, previous mechanical models for hair either neglect friction or use an approximate smooth friction model, thus losing important visual features. In this paper we present a new generic robust solver for capturing Coulomb friction in large assemblies of tightly packed fibers such as hair. Our method is based on an iterative algorithm where each single contact problem is efficiently and robustly solved by introducing a hybrid strategy that combines a new zero-finding formulation of (exact) Coulomb friction together with an analytical solver as a fail-safe. Our global solver turns out to be very robust and highly scalable as it can handle up to a few thousand densely packed fibers subject to tens of thousands of frictional contacts at a reasonable computational cost. It can be conveniently combined with any fiber model with various rest shapes, from smooth to curly. Our results, visually validated against real hair motions, depict typical hair collective effects and greatly enhance the realism of standard hair simulators.
Related Super-Helices for Predicting the Dynamics of Natural Hair · Artistic Simulation of Curly Hair · Anisotropic Elastoplasticity for Cloth, Knit and Hair Frictional Contact · Adaptive Nonlinearity for Collisions in Complex Rod Assemblies
how to read this
- Category
- Method: a friction solver for hair dynamics
- Contributions
- A robust hybrid iterative solver capturing exact Coulomb (Signorini-Coulomb) friction in large assemblies of tightly packed fibers
- Per-contact solving via a new zero-finding formulation of exact Coulomb friction with an analytical solver as a fail-safe
- Scaling to a few thousand densely packed fibers and tens of thousands of frictional contacts at reasonable cost, combinable with various fiber models and rest shapes
- Context
- Targets nonsmooth dry friction (stick-slip) in hair, applicable on top of fiber models such as Discrete Elastic Rods, replacing earlier neglected or smoothed-friction treatments. Builds on: Discrete Elastic Rods
- Correctness
- Results are visually validated against real hair motions and emphasize robustness and scalability; the validation is qualitative/visual rather than quantitative, and cost grows with the very large contact counts the method targets.
- Clarity
- Technically dense; a first pass conveys why nonsmooth Coulomb friction matters, but the contact formulation and hybrid solver need a careful second and likely third pass.
- How to read it
- Read first for the motivation (stick-slip from exact friction) and the hybrid zero-finding plus analytical fail-safe strategy; reserve a deeper pass for the per-contact Signorini-Coulomb math if you implement it.
CFX
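To see what "exact" Coulomb friction means at a single contact, the three cases of the Signorini-Coulomb law (take-off, stick, slide) can be written out directly. This sketch is not the paper's zero-finding formulation. It assumes a 2D contact with one normal and one tangential component, and a scalar `w` standing in for the contact's inverse effective mass, so that velocity and impulse are related by `u = u_free + w * r` with no coupling between components or between contacts. `solve_contact` is a hypothetical name.

```python
def solve_contact(u_free_n, u_free_t, w, mu):
    """Solve one frictional contact; returns (r_n, r_t, u_n, u_t).

    u_free_*: relative velocity the contact would have with no contact force.
    w: scalar inverse effective mass (assumed isotropic, w > 0).
    mu: Coulomb friction coefficient.
    """
    if u_free_n >= 0.0:
        # Take-off: the bodies are separating, so no force is applied.
        return (0.0, 0.0, u_free_n, u_free_t)
    # Normal impulse that exactly cancels the approach velocity.
    r_n = -u_free_n / w
    # Tangential impulse that would be needed to stop all sliding.
    r_t_stick = -u_free_t / w
    if abs(r_t_stick) <= mu * r_n:
        # Stick: the required impulse lies inside the friction cone.
        return (r_n, r_t_stick, 0.0, 0.0)
    # Slide: impulse sits on the cone boundary and opposes the sliding motion.
    direction = 1.0 if u_free_t > 0.0 else -1.0
    r_t = -mu * r_n * direction
    return (r_n, r_t, 0.0, u_free_t + w * r_t)
```

The stick branch is what smoothed friction models lose: below the cone limit the tangential velocity is exactly zero rather than merely small, and the switch between the stick and slide branches is the nonsmooth stick-slip behavior the entry refers to.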
-
Describes a scripted pipeline that automatically generates character runtime rigs, reducing manual technical director work and improving rig consistency across all character assets in production.
Rigging
-
Facial retargeting system that produces keyframe-style blendshape weights through anatomically ordered sequential solving and automatic curve simplification.
abstract
Presents a facial animation retargeting system designed for animator efficiency that generates blendshape weights matching manual keyframing style. Proposes sequential retargeting that avoids large canceling weights by processing blendshapes in anatomical order, coupled with automatic graph simplification that reduces dense curves to editable keyframes while preserving visual characteristics and timing details.
Related Spacetime Expression Cloning for Blendshapes · An Implicit Physical Face Model Driven by Expression and Style · FaceWarehouse: A 3D Facial Expression Database for Visual Computing · Groom Styles Interpolation with Features Preservation for Digital Creatures Effects
how to read this
- Category
- Method: facial-animation retargeting for artist workflows
- Contributions
- A facial retargeting system that generates blendshape weights matching a manual keyframing style for animator efficiency
- Sequential retargeting that processes blendshapes in anatomical order to avoid large canceling weights
- Automatic graph simplification that reduces dense curves to editable keyframes while preserving visual character and timing
- Context
- Sits in the facial motion-capture retargeting lineage, with the distinguishing goal of producing animator-editable, keyframe-like blendshape output rather than dense per-frame curves.
- Correctness
- The key assumption is that anatomically ordered, sequential solving yields cleaner non-canceling weights; the emphasis is artist efficiency and editability, so a reader should treat 'artist-friendly' as the primary design criterion and watch how much fidelity the curve simplification trades away.
- Clarity
- Accessible; a first pass conveys the sequential-order and curve-simplification ideas, with a second pass for how the anatomical ordering and simplification are computed.
- How to read it
- Read for the two core ideas (anatomical sequential retargeting and curve-to-keyframe simplification); a second pass helps if you care about the simplification criterion and fidelity trade-off.
Facial / Retargeting
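The curve-to-keyframe step in the entry above can be approximated with a standard technique. The entry does not state the paper's simplification criterion, so the sketch below swaps in a Douglas-Peucker-style recursive split on vertical error: keep the end keys, find the frame where linear interpolation is worst, and split there if the error exceeds a tolerance. `simplify_curve` and `tolerance` are hypothetical names, and linear interpolation between keys is an assumption (production curves usually carry tangents).

```python
def simplify_curve(values, tolerance):
    """Pick key frames so linear interpolation stays within `tolerance`.

    values: one sample per frame of a dense animation curve.
    Returns the sorted list of frame indices to keep as keys.
    """
    if len(values) < 2:
        return list(range(len(values)))

    def recurse(lo, hi):
        # Returns the keys in [lo, hi): always lo, plus any splits inside.
        worst, worst_err = None, tolerance
        for f in range(lo + 1, hi):
            alpha = (f - lo) / (hi - lo)
            interp = values[lo] + alpha * (values[hi] - values[lo])
            err = abs(values[f] - interp)
            if err > worst_err:
                worst, worst_err = f, err
        if worst is None:
            return [lo]  # the straight segment is already close enough
        return recurse(lo, worst) + recurse(worst, hi)

    return recurse(0, len(values) - 1) + [len(values) - 1]
```

Raising `tolerance` yields fewer keys and a more editable curve at the cost of fidelity, which is the trade-off the entry suggests watching.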
-
Bounded biharmonic weight computation for cage and skeleton deformation providing smooth, localized, non-negative influence functions.
abstract
Object deformation with linear blending dominates practical use as the fastest approach for transforming raster images, vector graphics, geometric models and animated characters. Unfortunately, linear blending schemes for skeletons or cages are not always easy to use because they may require manual weight painting or modeling closed polyhedral envelopes around objects. Our goal is to make the design and control of deformations simpler by allowing the user to work freely with the most convenient combination of handle types. We develop linear blending weights that produce smooth and intuitive deformations for points, bones and cages of arbitrary topology. Our weights, called bounded biharmonic weights, minimize the Laplacian energy subject to bound constraints. Doing so spreads the influences of the controls in a shape-aware and localized manner, even for objects with complex and concave boundaries. The variational weight optimization also makes it possible to customize the weights so that they preserve the shape of specified essential object features. We demonstrate successful use of our blending weights for real-time deformation of 2D and 3D shapes.
Related Robust Skin Weights Transfer via Weight Inpainting · Real-Time Deformation with Coupled Cages and Skeletons · Stretchable and Twistable Bones for Skeletal Shape Deformation · Fast Automatic Skinning Transformations
how to read this
- Category
- Method: a skinning-weight computation (bounded biharmonic weights)
- Contributions
- Bounded biharmonic weights that minimize Laplacian energy subject to bound constraints for smooth, intuitive linear-blend deformation
- Unified support for mixed handle types (points, bones, cages of arbitrary topology) with shape-aware, localized, non-negative influence
- Variational weight customization to preserve specified essential features, enabling real-time deformation of 2D and 3D shapes
- Context
- Advances generalized-coordinate/weight schemes for linear blend skinning, building on the Harmonic Coordinates for character articulation line of work while adding bound constraints and mixed handle support. Builds on: Harmonic Coordinates for Character Articulation
- Correctness
- The core guarantees (smoothness, locality, non-negativity, partition-of-unity behavior) follow from the constrained Laplacian-energy minimization; the weights are precomputed via a variational optimization, so a reader should keep the precompute cost and reliance on a meaningful interior discretization in mind, with deformation itself remaining real-time.
- Clarity
- Clearly written but mathematically grounded; a first pass conveys what the weights deliver, and a second pass is needed for the variational optimization and bound constraints.
- How to read it
- Focus first on the four desirable weight properties and the mixed-handle motivation; a second pass on the constrained Laplacian optimization pays off if you implement or extend it.
Rigging / Skinning
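Written out, the optimization described in the entry above takes roughly the following form. The symbols are assumptions introduced here for illustration: $\Omega$ is the interior of the shape, $H_k$ is the $k$-th handle, $w_j$ is the weight field of handle $j$, and $T_j$ is the affine transform attached to handle $j$. The handle-interpolation constraint (each weight is 1 on its own handle and 0 on the others) is not spelled out in the entry and is included as the usual way such weights are pinned; cage-specific constraints are omitted.

```latex
\begin{aligned}
\min_{w_1,\dots,w_m} \quad
  & \sum_{j=1}^{m} \frac{1}{2} \int_{\Omega} \lVert \Delta w_j \rVert^{2} \, dV \\
\text{subject to} \quad
  & w_j \big|_{H_k} = \delta_{jk}, \\
  & \sum_{j=1}^{m} w_j(\mathbf{p}) = 1 \quad \forall\, \mathbf{p} \in \Omega, \\
  & 0 \le w_j(\mathbf{p}) \le 1 \quad \forall\, \mathbf{p} \in \Omega .
\end{aligned}

% Linear blending with the resulting weights:
\mathbf{p}' = \sum_{j=1}^{m} w_j(\mathbf{p}) \, T_j \, \mathbf{p}
```

The bound constraints are what separate this from a plain biharmonic solve, whose solutions can go negative; they also turn the precompute into an inequality-constrained quadratic program, which is where the precompute cost mentioned under Correctness comes from.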
-
Compression scheme for large blendshape models that preserves direct manipulation capabilities, reducing memory without sacrificing editability.
abstract
Introduces a method to compress large blendshape facial models using hierarchically semi-separable (HSS) matrix representation with reordering, achieving under 10% storage while enabling real-time GPU-accelerated playback. Extends compression to interactive direct manipulation, allowing artists to control facial expressions by dragging vertices while automatically determining active blendshapes.
Related Direct Manipulation Blendshapes · DreamWorks Animation Facial Motion and Deformation System · Practice and Theory of Blendshape Facial Models · An Empirical Rig for Jaw Animation
how to read this
- Category
- Method: blendshape-model compression with direct manipulation
- Contributions
- Compressing large blendshape facial models with a hierarchically semi-separable (HSS) matrix representation plus reordering, to under 10% storage
- Real-time GPU-accelerated playback of the compressed model
- Extending compression to interactive direct manipulation, letting artists drag vertices while active blendshapes are determined automatically
- Context
- Combines a structured-matrix (HSS) compression scheme with the Direct Manipulation Blendshapes interaction paradigm it cites, so editability survives compression. Builds on: Direct Manipulation Blendshapes
- Correctness
- The headline is roughly an order-of-magnitude storage reduction with preserved direct manipulation; the result depends on the blendshape matrix being well-approximated by an HSS structure (after reordering), so compressibility and any approximation error will vary with the model.
- Clarity
- Moderately technical; a first pass conveys the compress-and-still-manipulate goal, but the HSS representation and reordering need a focused second pass.
- How to read it
- Read first for the dual goal (storage reduction without losing direct manipulation); do a second pass on the HSS structure and reordering if the numerical-linear-algebra side is your interest.
Facial
-
Hybrid simulation and keyframe approach for art-directing Tangled's hair to match story-driven 2D draw-over targets during complex character interactions.
abstract
We describe the hybrid approach used to direct hair motion on Disney's feature film Tangled, where physically plausible simulation alone was insufficient given the high degree of art-direction and hair-character interaction conveyed through detailed 2D drawovers. The system interleaves custom hair dynamics with rig-based keyframed animation on a shot-by-shot basis, classifying shots as passive, animation-driven, or simulation-driven. A two-tiered hair rig provides global and per-subgroup IK, FK, twist, and pinch controls for posing and keyframing, while simulation properties are controlled locally through override sets and curve maps. Rig-animated curves are enhanced with a simulation layer via targeting forces, and simulation-driven shots can be edited post-simulation through deformers that ride along with the simulated hair.
Related Simulating Rapunzel's Hair in Disney's Tangled · Simulating Wind Effects on Cloth and Hair in Disney's Frozen · Gravity Preloading for Maintaining Hair Shape Using the Simulator as a Closed-Box Function · Scriptable Character FX Solution
how to read this
- Category
- Production talk / system breakdown: art-directed hair
- Contributions
- A hybrid simulation-plus-keyframe workflow that classifies shots as passive, animation-driven, or simulation-driven to hit 2D draw-over targets
- A two-tiered hair rig with global and per-subgroup IK, FK, twist and pinch controls, with simulation properties controlled locally via override sets and curve maps
- Layering simulation onto rig-animated curves through targeting forces, plus post-simulation editing via deformers that ride along with the simulated hair
- Context
- A companion production account to the earlier 'Simulating Rapunzel's Hair in Disney's Tangled', focused on directability rather than the underlying solver. Builds on: Simulating Rapunzel's Hair in Disney's Tangled
- Correctness
- Studio practice, not peer-reviewed; results are production-proven on Tangled, and the workflow exists precisely because physically plausible simulation alone could not meet the heavy art direction, so it is a control framework rather than a validated dynamics model.
- Clarity
- Accessible; a single pass conveys the shot-classification scheme and the rig/override/targeting toolbox, with no heavy math.
- How to read it
- Read once for the directability workflow (shot classes, two-tier rig, targeting forces, ride-along deformers); pair it with the 2010 Rapunzel hair talk for the simulation side underneath.
CFX
- Efficient Elasticity for Character Skinning with Contact and Collisions · SIGGRAPH · Industrial · 293 cites
Near-interactive corotational elasticity on hexahedral lattices for character soft-tissue deformation, handling contact and collisions in production at Disney.
abstract
This paper presents a near-interactive algorithm for simulating skeleton-driven, high-resolution corotational elasticity for soft-tissue character skinning. The method introduces a novel one-point quadrature discretization of corotational elasticity over a uniform hexahedral lattice, with a stabilization scheme that suppresses hourglassing nullspace modes while requiring only a single polar decomposition per cell. The authors enforce positive definiteness of the element stiffness matrix through an inexpensive matrix-free indefiniteness correction, enabling efficient conjugate gradient solves for both quasistatics and implicit dynamics, and develop a multigrid solver (including a Full Approximation Scheme nonlinear variant) that scales to hundreds of thousands of degrees of freedom. The system targets parallelism with a branch-free vectorized SVD and handles body collisions, self-collisions, and soft constraints via embedded proxy points with penalty-based response, demonstrated in a production character skinning pipeline.
Related Data-Driven Physics for Human Soft Tissue Animation · Steklov-Poincare Skinning · Flesh, Flab, and Fascia Simulation on Zootopia · Efficient Dynamic Skinning with Low-Rank Helper Bone Controllers
how to read this
- Category
- Method: a near-interactive physics-based skinning algorithm (corotational elasticity)
- Contributions
- A one-point quadrature discretization of corotational elasticity over a uniform hexahedral lattice with a stabilization scheme suppressing hourglass modes, needing only one polar decomposition per cell
- A matrix-free indefiniteness correction enforcing element stiffness positive-definiteness, enabling efficient CG solves for quasistatics and implicit dynamics
- A multigrid solver (with a Full Approximation Scheme nonlinear variant) and a branch-free vectorized SVD scaling to hundreds of thousands of DOFs, handling body/self-collisions and soft constraints
- Context
- Production soft-tissue character skinning at Disney, building on the corotational-elasticity and FEM simulation lineage and pushing it toward interactive rates via lattice discretization and multigrid.
- Correctness
- Demonstrated in a production character-skinning pipeline with collision and constraint handling; a reader should note the method assumes a uniform hexahedral embedding lattice and relies on the stabilization to control nullspace modes, so behavior depends on lattice resolution and the penalty-based collision response.
- Clarity
- Dense numerical-methods paper; a first pass conveys the system design and goals, but a second pass is needed to follow the quadrature, stabilization and multigrid formulation.
- How to read it
- First pass for the overall pipeline and why one-point quadrature plus stabilization matters; do a careful second pass on the discretization, the indefiniteness correction, and the multigrid/FAS solver if you intend to implement or evaluate it.
Skinning / CFX / Muscles
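The "one polar decomposition per cell" point in the entry above is easier to follow with the corotational energy written out for a single cell. The sketch below is a 2D illustration under stated assumptions: it evaluates the standard corotated linear elasticity density, mu * |F - R|^2 + (lambda / 2) * tr(R^T F - I)^2, for one deformation gradient `F` with positive determinant, using the closed-form 2x2 polar rotation. The one-point quadrature, hourglass stabilization, indefiniteness correction and multigrid solver from the paper are not reproduced, and `polar_rotation` and `corotational_energy` are hypothetical names.

```python
import math


def polar_rotation(F):
    """Rotation R of the 2x2 polar decomposition F = R S, as (cos, sin).

    Assumes det(F) > 0. R is the rotation that maximizes tr(R^T F).
    """
    (a, b), (c, d) = F
    theta = math.atan2(c - b, a + d)
    return math.cos(theta), math.sin(theta)


def corotational_energy(F, mu, lam):
    """Corotated linear elasticity energy density for one 2D cell."""
    (a, b), (c, d) = F
    cs, sn = polar_rotation(F)
    # With R = [[cs, -sn], [sn, cs]], this is the squared Frobenius norm of F - R.
    diff = (a - cs) ** 2 + (b + sn) ** 2 + (c - sn) ** 2 + (d - cs) ** 2
    # tr(R^T F) expanded for the same R.
    trace = cs * (a + d) + sn * (c - b)
    return mu * diff + 0.5 * lam * (trace - 2.0) ** 2
```

Factoring out `R` is what makes the energy blind to rigid rotation: a purely rotated cell stores no energy, and a stretched cell stores the same energy whichever way it is turned, which matters for skeleton-driven flesh that rotates through large angles.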
-
Epic Games technical animator presented a universal facial rig pipeline for Gears of War 3 emphasizing speed and efficiency in character facial setup.
Facial / Rigging
-
Anatomy-informed Maya rig generation script (Wing Creator) that builds a group-based feathered wing rig with biologically accurate kinematic motion.
abstract
Digital birds are used in computer graphics to replace live animals for safety and to allow more control over performance, but the current treatment of avian wings is often over-simplified, causing a loss of realism due to incorrect form and motion of the feathers. This research uses the structure and motion of real bird anatomy to inform the creation of biologically accurate kinematic motion for wings, testing the hypothesis that a wing rig following biological accuracy will appear realistic in motion and facilitate efficient animation. The work produced the Wing Creator tool, a Maya rig generation script that builds a group-based feathered wing rig from real bird anatomy.
Related Mesh-Driven Generation and Animation of Groomed Feathers · Apteryx: Procedural Generation, Sculpting and Grooming of Feathers · Feathers for Mystical Creatures: Pegasus · A.C.M.E. Multilimb System
how to read this
- Category
- Method / tool: an anatomy-informed rigging system (thesis)
- Contributions
- Uses real bird anatomy (structure and motion) to inform biologically accurate kinematic motion for feathered wings
- Tests the hypothesis that a biologically accurate wing rig appears realistic in motion and supports efficient animation
- Delivers the Wing Creator tool, a Maya script that builds a group-based feathered wing rig from real bird anatomy
- Context
- Relates to character rigging and procedural feather/appendage setup, motivated by the over-simplified treatment of avian wings in CG and grounded in observed bird anatomy rather than prior cited methods.
- Correctness
- A thesis presenting a tool and a hypothesis-driven approach; validation rests on anatomical fidelity and animator usability rather than a quantitative benchmark, so a reader should treat realism claims as demonstrated within the author's test cases and Maya-specific implementation.
- Clarity
- Accessible and applied; a first pass conveys the motivation and what the tool does, with a deeper read needed only if you want the anatomical mapping and rig construction details.
- How to read it
- First pass for the anatomy-to-rig idea and the Wing Creator workflow; revisit the anatomy and group-structure sections only if building or evaluating a feathered-wing rig yourself.
Rigging
- High-Quality Passive Facial Performance Capture Using Anchor Frames · SIGGRAPH · Disney Research · 367 cites
Passive multi-view facial performance capture using anchor frames for temporal coherence, achieving high-quality reconstruction without active illumination.
abstract
We present a new technique for passive and markerless facial performance capture based on anchor frames. Our method starts with high resolution per-frame geometry acquisition using state-of-the-art stereo reconstruction, and proceeds to establish a single triangle mesh that is propagated through the entire performance. Leveraging the fact that facial performances often contain repetitive subsequences, we identify anchor frames as those which contain similar facial expressions to a manually chosen reference expression. Anchor frames are automatically computed over one or even multiple performances. We introduce a robust image-space tracking method that computes pixel matches directly from the reference frame to all anchor frames, and thereby to the remaining frames in the sequence via sequential matching. This allows us to propagate one reconstructed frame to an entire sequence in parallel, in contrast to previous sequential methods. Our anchored reconstruction approach also limits tracker drift and robustly handles occlusions and motion blur. The parallel tracking and mesh propagation offer low computation times. Our technique will even automatically match anchor frames across different sequences captured on different occasions, propagating a single mesh to all performances.
Related High Resolution Passive Facial Performance Capture · Driving High-Resolution Facial Scans with Video Performance Capture · FaceLab: Scalable Facial Performance Capture for Visual Effects · Performance-Driven Facial Animation
how to read this
- Category
- Capture system / method: passive markerless facial performance capture
- Contributions
- Passive, markerless facial performance capture based on anchor frames, propagating a single triangle mesh through an entire performance
- A robust image-space tracking method computing pixel matches directly from a reference frame to anchor frames, then to remaining frames via sequential matching, enabling parallel propagation
- Anchored reconstruction that limits tracker drift, handles occlusions and motion blur, and can match anchor frames across multiple performances at low computation time
- Context
- Builds on the authors' single-shot facial geometry capture (Beeler et al. 2010) and state-of-the-art stereo reconstruction, extending per-frame geometry into temporally coherent performance capture. Builds on: High-Quality Single-Shot Capture of Facial Geometry
- Correctness
- Demonstrated on passive multi-view facial performances; the method assumes performances contain repetitive subsequences (similar expressions) so suitable anchor frames exist, and relies on a manually chosen reference expression, which a reader should keep in mind for sparse or highly varied performances.
- Clarity
- Clearly motivated and well-structured; a first pass conveys the anchor-frame idea and pipeline, with a second pass for the image-space tracking and matching details.
- How to read it
- First pass to grasp anchor frames and why parallel propagation beats sequential tracking; second pass on the tracking/matching formulation if you care about drift handling or reimplementation.
Facial
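To make the anchor-frame idea in the entry above concrete, here is a minimal sketch, not the authors' implementation. It assumes each frame has already been reduced to a hypothetical numeric expression descriptor, and that "similar to the reference" means a Euclidean distance under a hand-picked threshold; the paper itself works with image-space matching. The names `find_anchor_frames` and `split_into_clips` are illustrative only.

```python
from typing import List, Sequence, Tuple


def find_anchor_frames(
    descriptors: Sequence[Sequence[float]],
    reference_index: int,
    threshold: float,
) -> List[int]:
    """Return indices of frames whose descriptor is close to the reference frame's."""
    ref = descriptors[reference_index]
    anchors = []
    for i, d in enumerate(descriptors):
        # Euclidean distance between hypothetical per-frame expression descriptors.
        dist = sum((a - b) ** 2 for a, b in zip(d, ref)) ** 0.5
        if dist <= threshold:
            anchors.append(i)
    return anchors


def split_into_clips(anchors: List[int], num_frames: int) -> List[Tuple[int, int]]:
    """Split the sequence into clips bounded by anchors (and by the sequence ends)."""
    bounds = sorted(set(anchors) | {0, num_frames - 1})
    return [(bounds[k], bounds[k + 1]) for k in range(len(bounds) - 1)]


# Toy 1-D "expression" signal: a near-neutral expression recurs at frames 0, 4 and 8.
frames = [[0.0], [0.5], [1.0], [0.4], [0.05], [0.7], [0.9], [0.3], [0.0]]
anchors = find_anchor_frames(frames, reference_index=0, threshold=0.1)
clips = split_into_clips(anchors, len(frames))
print(anchors)  # indices of anchor frames
print(clips)    # (start, end) frame ranges between consecutive anchors
```

Because every clip starts and ends at a frame already matched to the reference, the clips can be tracked independently of each other, which is the property that allows parallel propagation and bounds drift to the length of one clip.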
-
Multi-domain subspace deformation method for real-time physics-based character skinning using model reduction per body segment.
abstract
We propose a domain-decomposition method to simulate articulated deformable characters entirely within a subspace framework, supporting quasistatic and dynamic deformations, nonlinear kinematics and materials at interactive rates. The simulation mesh is partitioned into bone-associated domains, each given a local-frame deformation subspace estimated from quasistatic poses and modal analysis, with reduced-order forces computed via cubature. To avoid locking and seam artifacts when coupling low-rank domains, the method uses penalty-based spring coupling forces rather than hard constraints, and evaluates inter-domain coupling forces between rotated domains efficiently using a novel Fast Sandwich Transform that removes vertex-dependent runtime cost. The authors report speedups of three to four orders of magnitude over full-rank unreduced simulation on quarter-million-element character models.
Related Steklov-Poincare Skinning · Hand Modeling and Simulation Using Stabilized Magnetic Resonance Imaging · Capture and Statistical Modeling of Arm-Muscle Deformations · Data-Driven Physics for Human Soft Tissue Animation
how to read this
- Category
- Method: physics-based character skinning via multi-domain subspace (model reduction)
- Contributions
- A domain-decomposition method simulating articulated deformable characters entirely in a subspace, supporting quasistatic and dynamic deformation with nonlinear kinematics and materials at interactive rates
- Bone-associated domains each with a local-frame deformation subspace from quasistatic poses and modal analysis, with reduced forces via cubature and penalty-based spring coupling to avoid locking and seams
- A Fast Sandwich Transform that evaluates inter-domain coupling between rotated domains without vertex-dependent runtime cost, reporting three-to-four orders of magnitude speedup over full-rank simulation
- Context
- Builds on physics-based anatomical character simulation (e.g. Teran et al.'s skeletal muscle work) and subspace/model-reduction deformation, partitioning the body per bone to keep reduced bases tractable. Builds on: Creating and Simulating Skeletal Muscle from the Visible Human Data Set
- Correctness
- Speedups are reported against full-rank unreduced simulation on quarter-million-element models; as a reduced-order method its accuracy depends on the quality of the per-domain subspaces and cubature, and penalty-based coupling trades exactness for stability, so a reader should weigh fidelity against the reported speed.
- Clarity
- Technical model-reduction paper; a first pass conveys the domain-decomposition strategy and payoff, but a second pass is needed for the subspace construction, cubature, and the Fast Sandwich Transform.
- How to read it
- First pass for the per-bone subspace decomposition and coupling idea; second/third pass on the Fast Sandwich Transform and cubature math if implementing or benchmarking reduced character simulation.
Skinning / Muscles
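The entry above credits the Fast Sandwich Transform with removing vertex-dependent runtime cost when coupling rotated domains. The sketch below illustrates only the general precomputation idea behind that kind of result, under stated assumptions: two hypothetical reduced bases `Ua` and `Ub` (rows grouped in xyz triples per vertex) and a single 3x3 rotation `R` applied to every vertex. It is not the paper's formulation.

```python
from typing import List

Matrix = List[List[float]]


def precompute_blocks(Ua: Matrix, Ub: Matrix) -> List[List[Matrix]]:
    """P[i][j][p][q] = sum over vertices v of Ua[3v+i][p] * Ub[3v+j][q]."""
    n = len(Ua) // 3
    ra, rb = len(Ua[0]), len(Ub[0])
    return [[[[sum(Ua[3 * v + i][p] * Ub[3 * v + j][q] for v in range(n))
               for q in range(rb)] for p in range(ra)]
             for j in range(3)] for i in range(3)]


def reduced_product(P: List[List[Matrix]], R: Matrix) -> Matrix:
    """Evaluate Ua^T (I kron R) Ub from the nine precomputed blocks only."""
    ra, rb = len(P[0][0]), len(P[0][0][0])
    return [[sum(R[i][j] * P[i][j][p][q] for i in range(3) for j in range(3))
             for q in range(rb)] for p in range(ra)]


def brute_force(Ua: Matrix, Ub: Matrix, R: Matrix) -> Matrix:
    """Reference result: rotate every vertex block of Ub, then project with Ua^T."""
    n = len(Ua) // 3
    ra, rb = len(Ua[0]), len(Ub[0])
    return [[sum(Ua[3 * v + i][p] * R[i][j] * Ub[3 * v + j][q]
                 for v in range(n) for i in range(3) for j in range(3))
             for q in range(rb)] for p in range(ra)]


# Two vertices, two reduced coordinates per domain, 90-degree rotation about z.
Ua = [[1, 0], [0, 1], [2, 1], [1, 1], [0, 2], [3, 0]]
Ub = [[0, 1], [1, 0], [1, 1], [2, 0], [1, 3], [0, 1]]
R = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]

P = precompute_blocks(Ua, Ub)      # done once, cost depends on vertex count
fast = reduced_product(P, R)       # done per frame, cost independent of vertex count
slow = brute_force(Ua, Ub, R)
print(fast == slow)
```

The per-frame work in `reduced_product` touches only nine small matrices, so it scales with the subspace sizes rather than with the mesh resolution.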
-
Real-time performance-based facial animation driven by depth camera input, enabling live facial retargeting at interactive rates.
abstract
This paper presents a system for performance-based character animation that enables any user to control the facial expressions of a digital avatar in realtime. The user is recorded in a natural environment using a non-intrusive, commercially available 3D sensor. The simplicity of this acquisition device comes at the cost of high noise levels in the acquired data. To effectively map low-quality 2D images and 3D depth maps to realistic facial expressions, we introduce a novel face tracking algorithm that combines geometry and texture registration with pre-recorded animation priors in a single optimization. Formulated as a maximum a posteriori estimation in a reduced parameter space, our method implicitly exploits temporal coherence to stabilize the tracking. We demonstrate that compelling 3D facial dynamics can be reconstructed in realtime without the use of face markers, intrusive lighting, or complex scanning hardware. This makes our system easy to deploy and facilitates a range of new applications, e.g. in digital gameplay or social interactions.
Related Online Modeling for Realtime Facial Animation · Direct Manipulation Blendshapes · Expression Packing: As-Few-As-Possible Training Expressions for Blendshape Transfer · Performance-Driven Facial Animation
how to read this
- Category
- System / method: real-time performance-based facial animation from a depth sensor
- Contributions
- A system letting any user drive a digital avatar's facial expressions in real time, recorded with a non-intrusive commercial 3D sensor in a natural environment
- A face tracking algorithm combining geometry and texture registration with pre-recorded animation priors in a single optimization, formulated as MAP estimation in a reduced parameter space
- Real-time reconstruction of compelling 3D facial dynamics without face markers, intrusive lighting, or complex scanning hardware, enabling gameplay and social-interaction applications
- Context
- Builds on example-based facial rigging (Li et al. 2010) and blendshape/animation-prior models, adapting them to noisy consumer depth-camera input for live retargeting. Builds on: Example-Based Facial Rigging
- Correctness
- Demonstrated with a commodity 3D sensor; the method assumes pre-recorded animation priors and a reduced expression space to compensate for high sensor noise, so output quality is bounded by the prior and the per-user model rather than capturing arbitrary unseen expressions.
- Clarity
- Accessible and application-driven; a first pass conveys the live-avatar idea and pipeline, with a second pass for the MAP optimization and the registration/prior terms.
- How to read it
- First pass for the real-time markerless concept and where the animation priors fit; second pass on the single-optimization formulation if you want the tracking math or to reproduce the stability behavior.
Facial
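For readers unfamiliar with the term used in the entry above, a maximum a posteriori (MAP) estimate has the generic form below. This is the textbook definition, not the paper's specific energy: here x stands for the reduced expression parameters and D for the observed depth map and image, both symbols chosen for illustration.

```latex
x^{*} = \arg\max_{x} \; p(x \mid D)
      = \arg\max_{x} \; p(D \mid x)\, p(x)
      = \arg\min_{x} \; \bigl[ -\log p(D \mid x) \; - \; \log p(x) \bigr]
```

Read this way, the likelihood term plays the role of the geometry and texture registration, and the prior term is where pre-recorded animation data can pull a noisy observation back toward plausible expressions.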
-
Rigid motion stabilization for facial performance capture, separating head pose from expression motion for improved downstream processing.
abstract
Facial scanning has become the industry-standard approach for creating digital doubles in movies and video games. This involves capturing an actor while they perform different expressions that span their range of facial motion. Unfortunately, the scans typically contain a superposition of the desired expression on top of unwanted rigid head movement. In order to extract true expression deformations, it is essential to factor out the rigid head movement for each expression, a process referred to as rigid stabilization. In order to achieve production quality in industry, face stabilization is usually performed through a tedious and error-prone manual process. In this paper we present the first automatic face stabilization method that achieves professional-quality results on large sets of facial expressions. Since human faces can undergo a wide range of deformation, there is not a single point on the skin surface that moves rigidly with the underlying skull. Consequently, computing the rigid transformation from direct observation, a common approach in previous methods, is error prone and leads to inaccurate results. Instead, we propose to indirectly stabilize the expressions by explicitly aligning them to an estimate of the underlying skull using anatomically-motivated constraints.
Related FaceLab: Scalable Facial Performance Capture for Visual Effects · Reconstruction of Personalized 3D Face Rigs from Monocular Video · Direct Manipulation Blendshapes · High Fidelity Facial Animation Capture and Retargeting with Contours
how to read this
- Category
- Method: automatic rigid stabilization of facial expression scans
- Contributions
- The first automatic face stabilization method reaching professional-quality results on large sets of facial expressions
- Factors unwanted rigid head movement out of expression scans to recover true expression deformation, replacing a tedious manual production process
- An indirect stabilization approach that avoids the error-prone assumption that any skin point moves rigidly with the skull
- Context
- Builds on the authors' single-shot facial geometry capture (Beeler et al. 2010) and supports the digital-double scanning pipeline used in film and games. Builds on: High-Quality Single-Shot Capture of Facial Geometry
- Correctness
- Targets production-quality stabilization across wide facial deformation; its key premise is that no single skin point is truly rigid with the skull, so it stabilizes indirectly, and a reader should note results are validated on facial expression scan sets rather than against ground-truth skull motion.
- Clarity
- Clearly motivated by a concrete production pain point; a first pass conveys the problem and the indirect-stabilization insight, with a second pass for the optimization details.
- How to read it
- First pass for why direct rigid estimation fails and how indirect stabilization fixes it; second pass on the formulation only if integrating stabilization into a capture pipeline.
Facial
- Scan-based Volume Animation Driven by Locally Adaptive Articulated Registrations TVCG Weta FX 20 cites
Volume animation technique using MRI scan data with locally adaptive articulated registration, enabling realistic volumetric character deformation.
abstract
This paper presents a complete system for creating anatomically accurate, example-based volume deformation and animation of articulated body regions from multiple in vivo MRI volume scans of a specific individual. To solve the correspondence problem across scans, a template volume is registered to each sample: pose variation is first approximated by volume blend deformation for initialization, then a locally adaptive non-rigid registration based on the biharmonic clamped plate spline highly constrains the degrees of freedom and search space to avoid the strong local minima inherent in articulated registration. The established correspondences enable a data-driven example-based volume deformation that interpolates voxel displacements in pose space, driven by joint control estimated from the actual skeleton. The robustness of the algorithms is demonstrated on human hand and knee volumes, producing occlusion-free person-specific models with realistic inner tissue deformations.
Related Real-Time Weighted Pose-Space Deformation on the GPU · Delta Mush: Smoothing Deformations While Preserving Detail · EigenSkin: Real Time Large Deformation Character Skinning in Hardware · Animation Setup Transfer for 3D Characters
how to read this
- Category
- Method / system: example-based volumetric character deformation from MRI scans
- Contributions
- A complete system for anatomically accurate, example-based volume deformation and animation of articulated body regions from multiple in vivo MRI scans of a specific individual
- A locally adaptive non-rigid registration based on the biharmonic clamped plate spline, initialized by volume blend deformation, to solve cross-scan correspondence and avoid articulated-registration local minima
- A data-driven pose-space volume deformation interpolating voxel displacements, driven by skeleton-estimated joint control, demonstrated on hand and knee volumes
- Context
- Builds on weighted pose-space deformation (Rhee et al. 2006) and pose-space/example-based deformation, extending it from surfaces to person-specific volumetric (MRI) data. Builds on: Real-Time Weighted Pose-Space Deformation on the GPU
- Correctness
- Robustness is demonstrated on human hand and knee volumes producing occlusion-free person-specific models; the approach is example-based and individual-specific, so results depend on the captured pose samples and MRI quality, and generalization beyond the scanned subject/poses is not the claim.
- Clarity
- Methodical systems paper spanning registration and deformation; a first pass conveys the scan-to-animation pipeline, with a second pass for the registration spline and pose-space interpolation.
- How to read it
- First pass for the MRI-driven volumetric pipeline and the correspondence strategy; second pass on the biharmonic clamped plate spline registration if you work with volumetric anatomical data.
Skinning / Retargeting
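The pose-space interpolation of voxel displacements described in the entry above can be sketched as follows. This is an illustration under assumptions: it uses normalized inverse-distance weights as a stand-in interpolant (the entry does not specify the paper's weighting), one scalar displacement per voxel for brevity, and hypothetical function names.

```python
from typing import List, Sequence


def pose_space_weights(pose: Sequence[float],
                       example_poses: Sequence[Sequence[float]],
                       eps: float = 1e-9) -> List[float]:
    """Normalized inverse-distance weights of the example poses for a query pose."""
    dists = [sum((a - b) ** 2 for a, b in zip(pose, ex)) ** 0.5 for ex in example_poses]
    for k, d in enumerate(dists):
        if d < eps:
            # The query coincides with an example pose: reproduce that example exactly.
            return [1.0 if i == k else 0.0 for i in range(len(dists))]
    inv = [1.0 / d for d in dists]
    total = sum(inv)
    return [w / total for w in inv]


def interpolate_displacements(pose: Sequence[float],
                              example_poses: Sequence[Sequence[float]],
                              example_displacements: Sequence[Sequence[float]]) -> List[float]:
    """Blend the per-voxel displacements of the examples with pose-space weights."""
    w = pose_space_weights(pose, example_poses)
    n = len(example_displacements[0])
    return [sum(w[k] * example_displacements[k][i] for k in range(len(w)))
            for i in range(n)]


# Two scanned poses of one joint (angle in degrees) and two voxels.
poses = [[0.0], [90.0]]
displacements = [[0.0, 0.0], [1.0, 2.0]]
halfway = interpolate_displacements([45.0], poses, displacements)
at_example = interpolate_displacements([90.0], poses, displacements)
print(halfway, at_example)
```

The key property to preserve in any real interpolant is the one shown for `at_example`: at a scanned pose the output must match the registered scan, so the captured anatomy is reproduced rather than approximated.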
-
Extends LBS bones to support stretch and twist DOF, eliminating candy-wrapper artifacts without extra helper joints.
abstract
Skeleton-based linear blend skinning (LBS) remains the most popular method for real-time character deformation and animation. The key to its success is its simple implementation and fast execution. However, in addition to the well-studied elbow-collapse and candy-wrapper artifacts, the space of deformations possible with LBS is inherently limited. In particular, blending with only a scalar weight function per bone prohibits properly handling stretching, where bones change length, and twisting, where the shape rotates along the length of the bone. We present a simple modification of the LBS formulation that enables stretching and twisting without changing the existing skeleton rig or bone weights. Our method needs only an extra scalar weight function per bone, which can be painted manually or computed automatically. The resulting formulation significantly enriches the space of possible deformations while only increasing storage and computation costs by constant factors.
Related Elasticity-Inspired Deformers for Character Articulation · Skinning: Real-time Shape Deformation · Geodesic Voxel Binding for Production Character Meshes · Automatic Rigging and Animation of 3D Characters
how to read this
- Category
- Method: an extension to linear blend skinning (stretch and twist)
- Contributions
- A simple modification of the LBS formulation that enables stretching (bones changing length) and twisting (shape rotating along the bone) without changing the existing skeleton rig or bone weights
- Requires only one extra scalar weight function per bone, paintable manually or computed automatically
- Significantly enriches the LBS deformation space while increasing storage and computation by only constant factors
- Context
- Builds directly on linear blend skinning and the authors' Bounded Biharmonic Weights (Jacobson and Sorkine 2011) for the weight functions, addressing LBS limits beyond the classic elbow-collapse and candy-wrapper artifacts. Builds on: Bounded Biharmonic Weights for Real-Time Deformation
- Correctness
- The appeal is real-time efficiency and rig compatibility; as an LBS variant it inherits LBS's geometric (non-physical) nature, and the added expressiveness depends on a sensible extra weight function, so it enriches rather than fully resolves all blend-skinning artifacts.
- Clarity
- Notably accessible; a first pass conveys the idea and its practicality, and a short second pass suffices for the (compact) formulation.
- How to read it
- First pass gives the core trick and why it is cheap and rig-preserving; a brief second pass on the formula and the extra weight function is enough to apply it.
Skinning / Rigging
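As a rough illustration of why one extra scalar weight per bone is enough for stretching (the entry above), consider a single 2D bone. The sketch assumes a hypothetical endpoint weight equal to the clamped projection of a point onto the bone; the paper allows that weight to be painted or computed automatically, and its full formulation also covers twist and multi-bone blending, which are omitted here.

```python
from typing import Tuple

Vec = Tuple[float, float]


def endpoint_weight(p: Vec, a: Vec, b: Vec) -> float:
    """Hypothetical extra weight: 0 at the bone's start a, 1 at its end b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / (dx * dx + dy * dy)
    return max(0.0, min(1.0, t))


def stretch_point(p: Vec, a: Vec, b: Vec, stretch: float) -> Vec:
    """Offset p along the bone by endpoint_weight * (stretch - 1) * (b - a)."""
    e = endpoint_weight(p, a, b)
    sx = (stretch - 1.0) * (b[0] - a[0])
    sy = (stretch - 1.0) * (b[1] - a[1])
    return (p[0] + e * sx, p[1] + e * sy)


a, b = (0.0, 0.0), (1.0, 0.0)
near_start = stretch_point((0.0, 0.1), a, b, stretch=2.0)
mid_bone = stretch_point((0.5, 0.1), a, b, stretch=2.0)
near_end = stretch_point((1.0, 0.1), a, b, stretch=2.0)
print(near_start, mid_bone, near_end)
```

With a plain scalar bone weight, a translation of the bone tip would move all of these points by the same amount; the endpoint weight is what lets the skin near the start stay put while the skin near the tip follows it.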
2010
21-
Examines partial and additive animation layering, remapping, and flipping techniques that gave Naughty Dog Uncharted characters responsive, feel-good player-controlled movement on PS3.
Retargeting
-
Covers God of War III character animation from concept through in-game implementation, including contact-sensitive moves and maintaining Kratos visual consistency across gameplay demands.
Retargeting
-
BioWare described an automated system that selects and blends animator-authored facial performance libraries to handle high-volume dialog in Star Wars: The Old Republic.
Facial / Retargeting
-
Direct manipulation interface for blendshape rigs allowing artists to sculpt corrections on the face mesh with automatic weight solving.
abstract
Although direct manipulation for figure animation has long been possible using inverse kinematics, there has been no similar "inverse kinematics" approach for blendshape models. In both problems the system must infer unknown degrees of freedom during each edit. We solve this problem by minimizing changes in facial expression.
Related Example-Based Facial Rigging · Smooth Contact-Aware Facial Blendshapes Transfer · DreamWorks Animation Facial Motion and Deformation System · Animating Facial Expressions
how to read this
- Category
- Method: a direct-manipulation interface for blendshape rigs
- Contributions
- An 'inverse kinematics' style direct-manipulation approach for blendshape face models, letting artists sculpt corrections directly on the mesh
- Automatic inference of the unknown blendshape weights during each edit by minimizing changes in facial expression
- Context
- Frames blendshape editing as the facial analogue of inverse kinematics for figure animation, building on parametric face modeling lineage (Parke, A Parametric Model for Human Faces). Builds on: A Parametric Model for Human Faces
- Correctness
- The method rests on the assumption that minimizing expression change yields the artist-intended edit; readers should keep in mind this is a weight-solving interaction technique whose quality depends on the underlying blendshape basis rather than a new deformation model.
- Clarity
- Accessible in concept; a first pass conveys the IK-for-faces analogy, and a second pass covers the minimization formulation and how degrees of freedom are inferred.
- How to read it
- Read the problem framing (IK analogy) first, then focus on the objective being minimized; a second pass is worth it if you want to reimplement the weight solver.
Facial
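One common way to phrase "infer the weights while minimizing changes in facial expression" (the entry above) is a regularized least-squares problem: fit the pinned vertex coordinates while penalizing deviation from the current weights. The sketch below solves min ||B w - m||^2 + alpha ||w - w0||^2 through its normal equations. The objective, the symbol names, and the value of `alpha` are assumptions for illustration, not a transcription of the paper.

```python
from typing import List

Matrix = List[List[float]]


def solve(A: Matrix, b: List[float]) -> List[float]:
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def solve_weights(B: Matrix, m: List[float], w0: List[float], alpha: float) -> List[float]:
    """Solve (B^T B + alpha I) w = B^T m + alpha w0 for the blendshape weights w."""
    rows, cols = len(B), len(w0)
    A = [[sum(B[r][i] * B[r][j] for r in range(rows)) + (alpha if i == j else 0.0)
          for j in range(cols)] for i in range(cols)]
    rhs = [sum(B[r][i] * m[r] for r in range(rows)) + alpha * w0[i] for i in range(cols)]
    return solve(A, rhs)


# One pinned coordinate that two blendshapes move equally; the artist drags it to 1.0.
B = [[1.0, 1.0]]
m = [1.0]
w0 = [0.2, 0.0]  # current weights before the edit
w = solve_weights(B, m, w0, alpha=0.01)
print(w)
```

The pin alone is ambiguous here, since any pair of weights summing to 1 satisfies it. The change-minimizing term resolves the ambiguity by moving both weights by the same amount from where they were.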
-
Adaptive contact linearization for yarn-yarn penalty forces in knitted cloth, exploiting temporal coherence to reduce contact solve cost substantially.
abstract
Yarn-based cloth simulation can improve visual quality but at high computational costs due to the reliance on numerous persistent yarn-yarn contacts to generate material behavior. Finding so many contacts in densely interlinked geometry is a pathological case for traditional collision detection, and the sheer number of contact interactions makes contact processing the simulation bottleneck. In this paper, we propose a method for approximating penalty-based contact forces in yarn-yarn collisions by computing the exact contact response at one time step, then using a rotated linear force model to approximate forces in nearby deformed configurations. Because contacts internal to the cloth exhibit good temporal coherence, sufficient accuracy can be obtained with infrequent updates to the approximation, which are done adaptively in space and time. Furthermore, by tracking contact models we reduce the time to detect new contacts. The end result is a 7- to 9-fold speedup in contact processing and a 4- to 5-fold overall speedup, enabling simulation of character-scale garments.
Related Discrete Elastic Rods · Simulation-Ready Hair Capture · A Safe and Fast Repulsion Method for GPU-based Cloth Self Collisions · Towards Realtime: A Hybrid Physics-based Method for Hair Animation on GPU
how to read this
- Category
- Method: an acceleration technique for yarn-level cloth simulation
- Contributions
- Adaptive contact linearization that computes exact contact response at one step then approximates forces in nearby configurations with a rotated linear force model
- Exploitation of temporal coherence of internal contacts so the approximation is updated only infrequently, adaptively in space and time
- Contact-model tracking that reduces detection time, yielding a 7-9x speedup in contact processing and 4-5x overall, enabling character-scale garments
- Context
- Directly extends yarn-level cloth simulation (Kaldor et al., Simulating Knitted Cloth at the Yarn Level) by attacking the contact-processing bottleneck in densely interlinked geometry. Builds on: Simulating Knitted Cloth at the Yarn Level
- Correctness
- The key assumption is that internal cloth contacts deform with enough temporal coherence that a linearized force model stays accurate between infrequent updates; readers should note the approximation is validated on knitted character garments and accuracy depends on how adaptively updates are scheduled.
- Clarity
- Moderately technical; a first pass conveys the linearize-and-reuse idea, and a second pass covers the rotated linear force model and the adaptive update criteria.
- How to read it
- Focus on why yarn-yarn contact is the bottleneck and on the rotated linear force approximation; a second pass on the adaptive update scheduling pays off if you care about the accuracy/speed trade-off.
CFX
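The "rotated linear force model" mentioned in the entry above can be pictured with a small 2D sketch. It assumes the exact contact force `f0` and a stiffness matrix `K` were recorded at a reference relative position `d0`, and that the local rotation since then is known as an angle; the approximation is then f ~ R (f0 + K (R^T d - d0)). How the paper estimates the rotation and decides when to refresh the model is not covered here, and all names are hypothetical.

```python
import math
from typing import List, Tuple

Vec = Tuple[float, float]


def rotate(v: Vec, angle: float) -> Vec:
    """Rotate a 2D vector counter-clockwise by angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])


def approx_contact_force(d: Vec, angle: float, d0: Vec, f0: Vec,
                         K: List[List[float]]) -> Vec:
    """Approximate the contact force as R (f0 + K (R^T d - d0))."""
    local = rotate(d, -angle)  # R^T d: undo the rotation before linearizing
    dx, dy = local[0] - d0[0], local[1] - d0[1]
    fx = f0[0] + K[0][0] * dx + K[0][1] * dy
    fy = f0[1] + K[1][0] * dx + K[1][1] * dy
    return rotate((fx, fy), angle)  # rotate the local force back to world space


d0 = (0.1, 0.0)                     # reference separation between two yarn points
f0 = (1.0, 0.0)                     # exact penalty force computed at d0
K = [[-10.0, 0.0], [0.0, -10.0]]    # force gets stronger as the points approach

# Pure rigid rotation of the contact: the force should simply rotate with it.
rotated = approx_contact_force(rotate(d0, math.pi / 2), math.pi / 2, d0, f0, K)
# No rotation, points pushed closer together: the linear term raises the force.
squeezed = approx_contact_force((0.05, 0.0), 0.0, d0, f0, K)
print(rotated, squeezed)
```

Factoring out the rotation is what lets a linearization taken at one time step remain usable while the garment moves rigidly, so only genuine local deformation consumes the approximation's accuracy budget.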
-
Optimization method for constructing blendshape facial rigs from example poses, transferring a generic template's controller semantics to the target and solving in gradient space.
abstract
Introduces a method for generating facial blendshape rigs from a set of example poses of a CG character. The system transfers controller semantics and expression dynamics from a generic template to the target blendshape model while solving for an optimal reproduction of the training poses, enabling a scalable design process where additional poses iteratively refine the expression space. Plausible animations are obtained even from a single training pose, and formulating the optimization in gradient space yields superior results to a direct vertex optimization.
Related Reusable Facial Rigging and Animation: Create Once, Use Many · Direct Manipulation Blendshapes · Smooth Contact-Aware Facial Blendshapes Transfer · Transferring the Rig and Animations from a Character to Different Face Models
how to read this
- Category
- Method: optimization for building blendshape facial rigs from example scans
- Contributions
- An optimization that constructs a blendshape facial rig from a set of example poses, transferring controller semantics and expression dynamics from a generic template to the target
- A scalable, iterative workflow where additional poses refine the expression space, with plausible animation obtainable even from a single training pose
- A gradient-space formulation of the optimization that yields better results than direct vertex optimization
- Context
- Builds on deformation transfer for triangle meshes (Sumner et al.) to carry template rig semantics onto a new character's example poses. Builds on: Deformation Transfer for Triangle Meshes
- Correctness
- Assumes a generic template's controller semantics and dynamics transfer meaningfully to the target; readers should keep in mind that single-pose results are 'plausible' rather than exact and that quality scales with the number and coverage of example poses provided.
- Clarity
- Accessible; a first pass conveys the example-driven rig construction idea, and a second pass covers the optimization objective and the gradient-space formulation.
- How to read it
- Read the pipeline and the role of deformation transfer first; a second pass on the gradient-space optimization is worthwhile if you plan to build rigs from scans.
Facial / Rigging
-
Data-driven wrinkle synthesis adding fine detail to coarse cloth animation using a database of example wrinkle patterns.
abstract
This paper presents an example-based wrinkle synthesis technique for animating close-fitting clothing such as shirts and pants, which involve nearly continuous body contact and small kinematic wrinkles. Fine-scale wrinkles are driven from the pose of the figure's kinematic skeleton using a precomputed database built by simulating high-resolution cloth as each joint is moved over its range of motion, processing each joint independently to avoid sampling the exponential pose space. During synthesis, mesh interpolation produces a wrinkle mesh per joint, the per-joint meshes are blended into a single mesh for the full pose, and the result is combined with a coarse low-resolution cloth simulation that captures global and dynamic motion. Implemented on the GPU with CUDA, the system runs at interactive rates while capturing many characteristic fine-scale features missing from the coarse simulation.
Related Untangling Cloth · Simulation of Clothing with Folds and Wrinkles · Fast Cloth Simulation on Moving Humanoids · Clean Cloth Inputs: Removing Character Self-Intersections with Volume Simulation
how to read this
- Category
- Method: data-driven wrinkle synthesis for clothing animation
- Contributions
- An example-based technique that drives fine-scale clothing wrinkles from the figure's kinematic skeleton pose for close-fitting garments
- A precomputed database built by simulating high-resolution cloth per joint over its range of motion, processing joints independently to avoid sampling the exponential pose space
- A synthesis step that interpolates and blends per-joint wrinkle meshes and combines them with a coarse cloth simulation, running at interactive rates on the GPU with CUDA
- Context
- A pose-driven, database-augmentation approach to cloth detail that complements coarse cloth simulation, related to data-driven detail synthesis for animation.
- Correctness
- The central assumption is that wrinkle response can be decomposed per joint and recombined, sidestepping the full pose space; readers should keep in mind this targets close-fitting clothing with near-continuous body contact and small kinematic wrinkles, so behavior outside that regime (loose or strongly dynamic cloth) is not its focus.
- Clarity
- Accessible; a first pass conveys the per-joint database and blending idea, and a second pass covers the interpolation/blending details and the coarse-plus-detail combination.
- How to read it
- Focus on the per-joint decomposition trick and how per-joint meshes are blended for a full pose; a second pass pays off for the database construction and the coarse-detail merge.
CFX
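The per-joint lookup and blend described in the entry above can be sketched in a few lines. The assumptions are loud ones: each joint has a single angle, the database stores one scalar wrinkle offset per vertex at a few sampled angles, interpolation is linear, and a hypothetical per-vertex weight decides how much each joint contributes. The paper interpolates and blends meshes on the GPU; this only shows the data flow.

```python
from typing import Dict, List, Tuple

Samples = List[Tuple[float, List[float]]]  # (joint angle, per-vertex wrinkle offsets)


def interpolate_joint_wrinkle(angle: float, samples: Samples) -> List[float]:
    """Linearly interpolate offsets between the two nearest sampled angles."""
    if angle <= samples[0][0]:
        return list(samples[0][1])
    if angle >= samples[-1][0]:
        return list(samples[-1][1])
    for (a0, o0), (a1, o1) in zip(samples, samples[1:]):
        if a0 <= angle <= a1:
            t = (angle - a0) / (a1 - a0)
            return [(1.0 - t) * x + t * y for x, y in zip(o0, o1)]
    return list(samples[-1][1])  # unreachable for sorted samples; kept for safety


def synthesize(coarse: List[float], pose: Dict[str, float],
               database: Dict[str, Samples],
               joint_weights: Dict[str, List[float]]) -> List[float]:
    """Add the weighted per-joint wrinkle offsets onto the coarse simulation result."""
    out = list(coarse)
    for joint, angle in pose.items():
        offsets = interpolate_joint_wrinkle(angle, database[joint])
        for v in range(len(out)):
            out[v] += joint_weights[joint][v] * offsets[v]
    return out


# Each joint is sampled independently over its own range, never in combination.
database = {
    "elbow": [(0.0, [0.0, 0.0]), (90.0, [0.2, 0.0])],
    "shoulder": [(0.0, [0.0, 0.0]), (90.0, [0.0, 0.4])],
}
joint_weights = {"elbow": [1.0, 0.0], "shoulder": [0.0, 1.0]}
coarse = [1.0, 1.0]  # per-vertex value from the low-resolution cloth simulation
detailed = synthesize(coarse, {"elbow": 45.0, "shoulder": 90.0}, database, joint_weights)
print(detailed)
```

Storing one sample list per joint keeps the database size linear in the number of joints, which is the point of processing joints independently instead of sampling combined poses.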
-
Iterative coordinate-descent algorithm that decomposes arbitrary mesh animations into standard linear blend skinning without manual rigging.
abstract
Skinning is a simple yet popular deformation technique combining compact storage with efficient hardware accelerated rendering. While skinned meshes (such as virtual characters) are traditionally created by artists, previous work proposes algorithms to construct skinning automatically from a given vertex animation. However, these methods typically perform well only for a certain class of input sequences and often require long pre‐processing times. We present an algorithm based on iterative coordinate descent optimization which handles arbitrary animations and produces more accurate approximations than previous techniques, while using only standard linear skinning without any modifications or extensions. To overcome the computational complexity associated with the iterative optimization, we work in a suitable linear subspace (obtained by quick approximate dimensionality reduction) and take advantage of the typically very sparse vertex weights. As a result, our method requires about one or two orders of magnitude less pre‐processing time than previous methods.
Related Compressed Skinning for Facial Blendshapes · Fast and Deep Deformation Approximations · Skinning: Real-time Shape Deformation · Real-Time Skeletal Skinning with Optimized Centers of Rotation
how to read this
- Category
- Method: automatic conversion of mesh animation into linear blend skinning
- Contributions
- An iterative coordinate-descent algorithm that decomposes arbitrary vertex animations into standard linear blend skinning with no manual rigging
- More accurate approximations than prior techniques while using only unmodified linear skinning
- Working in an approximate reduced linear subspace and exploiting sparse vertex weights to cut preprocessing time by one to two orders of magnitude
- Context
- Addresses skinning-decomposition of vertex animations, related to dual-quaternion skinning (Kavan et al., Skinning with Dual Quaternions) but deliberately targeting plain linear blend skinning for compatibility. Builds on: Skinning with Dual Quaternions
- Correctness
- Assumes the animation is well represented by standard LBS within a low-dimensional subspace; readers should note results are reported as more accurate across arbitrary sequences but the approximation quality is inherently bounded by what unextended linear skinning can express.
- Clarity
- Moderately technical; a first pass conveys the decompose-into-LBS goal, and a second pass covers the coordinate-descent optimization and the dimensionality reduction.
- How to read it
- Read why LBS-compatible output matters and the overall optimization loop first; a second pass on the subspace reduction and weight sparsity pays off if preprocessing cost is your concern.
Skinning
-
Production feather system built by MPC for the flying creatures of Clash of the Titans, generating procedural feather barbs at render time in a custom multi-threaded PRMan DSO.
abstract
For Clash of the Titans, MPC developed a sophisticated feather system used on several flying creatures across more than a hundred shots, delivering photo-realistic results over a wide range of detail from distant background characters to ones passing just in front of the camera. The system integrated tightly with MPC's existing pipeline and gave artists efficient creative control with almost no technical background required. Each feather barb was procedurally created as fully unrestricted three-dimensional curves, with all feathers generated at render time in a custom multi-threaded PRMan DSO.
Related Rendertime Procedural Feathers Through Blended Guide Meshes · Mesh-Driven Generation and Animation of Groomed Feathers · Apteryx: Procedural Generation, Sculpting and Grooming of Feathers · A Biologically-Parameterized Feather Model
how to read this
- Category
- Production talk / feather system breakdown
- Contributions
- Demonstrates a feather system used on several flying creatures for Clash of the Titans across more than a hundred shots, from distant background to extreme foreground
- Shows tight integration with MPC's existing pipeline and efficient artist control requiring almost no technical background
- Describes feather barbs procedurally created as fully unrestricted 3D curves, with feathers generated at render time in a custom multi-threaded PRMan DSO
- Context
- A studio production system for photo-real feathers on mythical creatures, relating to procedural curve generation and render-time geometry generation within a PRMan pipeline.
- Correctness
- Studio practice, not peer-reviewed; results are production-proven on shipped shots, so the emphasis is on robustness, artist usability, and scalability across detail levels rather than on quantitative evaluation.
- Clarity
- Accessible; a single pass conveys the system design and pipeline choices, with detail useful to TDs building comparable creature setups.
- How to read it
- Read for the architectural decisions (procedural barb curves, render-time generation via a multi-threaded DSO) and pipeline integration; one careful pass is usually enough unless you are designing a feather system yourself.
CFX
-
, , ,
Multi-view passive capture system for facial performance without markers, recovering high-resolution dynamic geometry from synchronized cameras.
abstract
We introduce a purely passive facial capture approach that uses only an array of video cameras, but requires no template facial geometry, no special makeup or markers, and no active lighting. We obtain initial geometry using multi-view stereo, and then use a novel approach for automatically tracking texture detail across the frames. As a result, we obtain a high-resolution sequence of compatibly triangulated and parameterized meshes. The resulting sequence can be rendered with dynamically captured textures, while also consistently applying texture changes such as virtual makeup.
Related Performance-Driven Facial Animation · High-Quality Passive Facial Performance Capture Using Anchor Frames · High-Quality Face Capture Using Anatomical Muscles · Driving High-Resolution Facial Scans with Video Performance Capture
how to read this
- Category
- Capture system: passive multi-view facial performance capture
- Contributions
- A purely passive facial capture approach using only an array of video cameras, with no template geometry, no markers or makeup, and no active lighting
- Initial geometry from multi-view stereo plus a method for automatically tracking texture detail across frames
- Output of a high-resolution sequence of compatibly triangulated and parameterized meshes that can be rendered with captured textures and support texture edits such as virtual makeup
- Context
- A markerless, passive alternative in the performance-driven facial animation lineage (Williams, Performance-Driven Facial Animation), relying on multi-view stereo plus temporal texture tracking. Builds on: Performance-Driven Facial Animation
- Correctness
- Relies on sufficient skin texture and synchronized multi-view coverage for stereo and tracking; readers should keep in mind that without markers or active lighting, tracking robustness depends on visible texture detail and the capture-studio camera setup.
- Clarity
- Accessible. A first pass conveys the passive, template-free pipeline; do a second pass for the texture-tracking method that yields temporally consistent meshes.
- How to read it
- Focus on what makes it template- and marker-free and on the cross-frame texture tracking; a second pass pays off on the tracking and mesh-correspondence details if you compare it to active-lighting systems.
Facial
-
Pore-level facial geometry from a single passive stereo shot, the start of the Medusa capture line at Disney Research.
abstract
This paper describes a passive stereo system for capturing the 3D geometry of a face in a single-shot under standard light sources. The system is low-cost and easy to deploy. Results are submillimeter accurate and commensurate with those from state-of-the-art systems based on active lighting, and the models meet the quality requirements of a demanding domain like the movie industry. Recovered models are shown for captures from both high-end cameras in a studio setting and from a consumer binocular-stereo camera, demonstrating scalability across a spectrum of camera deployments, and showing the potential for 3D face modeling to move beyond the professional arena and into the emerging consumer market in stereoscopic photography. Our primary technical contribution is a modification of standard stereo refinement methods to capture pore-scale geometry, using a qualitative approach that produces visually realistic results. The second technical contribution is a calibration method suited to face capture systems. The systemic contribution includes multiple demonstrations of system robustness and quality. These include capture in a studio setup, capture off a consumer binocular-stereo camera, scanning of faces of varying gender and ethnicity and age, capture of highly-transient facial expression, and scanning a physical mask to provide ground-truth validation.
Related High Resolution Passive Facial Performance Capture · Displaced Dynamic Expression Regression for Real-Time Facial Tracking and Animation · Face2Face: Real-Time Face Capture and Reenactment of RGB Videos · High-Quality Face Capture Using Anatomical Muscles
how to read this
- Category
- Capture system: single-shot passive facial geometry capture
- Contributions
- A low-cost, easy-to-deploy passive stereo system that captures 3D face geometry in a single shot under standard light sources, with submillimeter accuracy comparable to active-lighting systems
- A modification of standard stereo refinement to recover pore-scale geometry via a qualitative approach producing visually realistic results
- A calibration method suited to face-capture systems, plus demonstrations of robustness and scalability from studio high-end cameras to a consumer binocular-stereo camera
- Context
- The start of Disney Research's Medusa capture line, building toward photoreal digital actors (Alexander et al., The Digital Emily Project) using passive stereo rather than active lighting. Builds on: The Digital Emily Project: Achieving a Photorealistic Digital Actor
- Correctness
- The pore-scale refinement is explicitly a qualitative approach aimed at visual realism, so 'submillimeter' accuracy is the system claim while fine pore detail is plausibility-driven; readers should keep in mind single-shot capture trades the temporal sequences of active rigs for deployment simplicity.
- Clarity
- Accessible. A first pass conveys the single-shot passive-stereo idea and the consumer-vs-studio framing; do a second pass for the stereo-refinement modification and calibration.
- How to read it
- Read the system overview and the single-shot motivation first; a second pass on the pore-scale stereo refinement and calibration is worthwhile if you build or evaluate face scanners.
Facial
-
Coordinate-invariant strain limiting via SVD of the deformation gradient at multiple resolutions, enforcing isotropic inextensibility for cloth and shells.
abstract
In this paper we describe a fast strain-limiting method that allows stiff, incompliant materials to be simulated efficiently. Unlike prior approaches, which act on springs or individual strain components, this method acts on the strain tensors in a coordinate-invariant fashion allowing isotropic behavior. Our method applies to both two- and three-dimensional strains, and only requires computing the singular value decomposition of the deformation gradient, either a small 2x2 or 3x3 matrix, for each element. We demonstrate its use with triangular and tetrahedral linear-basis elements. For triangulated surfaces in three-dimensional space, we also describe a complementary edge-angle-limiting method to limit out-of-plane bending. All of the limits are enforced through an iterative, non-linear, Gauss-Seidel-like constraint procedure. To accelerate convergence, we propose a novel multi-resolution algorithm that enforces fitted limits at each level of a non-conforming hierarchy. Compared with other constraint-based techniques, our isotropic multi-resolution strain-limiting method is straightforward to implement, efficient to use, and applicable to a wide range of shell and solid materials.
Related Strain Based Dynamics · Projective Dynamics: Fusing Constraint Projections for Fast Simulation · Continuum-based Strain Limiting · Dynamic Deformables: Implementation and Production Practicalities
how to read this
- Category
- Method: a strain-limiting technique for cloth and shells
- Contributions
- A coordinate-invariant strain-limiting method acting on strain tensors (via SVD of the deformation gradient) to give isotropic behavior, unlike spring- or component-based prior approaches
- Applicability to both 2D and 3D strains with triangular and tetrahedral linear elements, plus a complementary edge-angle method to limit out-of-plane bending
- A novel multi-resolution algorithm that enforces fitted limits at each level of a non-conforming hierarchy to accelerate the iterative Gauss-Seidel-like constraint convergence
- Context
- Extends constraint-based strain limiting toward coordinate-invariant isotropy, related to continuum-based strain limiting (Thomaszewski et al.) and using per-element SVD of the deformation gradient. Builds on: Continuum-based Strain Limiting
- Correctness
- Assumes that limiting the singular values of the deformation gradient adequately enforces isotropic inextensibility and that the Gauss-Seidel-like iteration converges; readers should note enforcement is iterative and approximate, with the multi-resolution scheme mainly improving convergence rather than guaranteeing exact limits.
- Clarity
- Moderately technical but stated as straightforward to implement. A first pass conveys the SVD-based isotropic idea; do a second pass for the multi-resolution hierarchy and the constraint iteration.
- How to read it
- Focus on why coordinate-invariance matters and the per-element SVD limit; a second pass on the multi-resolution hierarchy pays off if convergence speed or implementation is your goal.
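As a rough illustration of the per-element limit this entry describes, the sketch below clamps the singular values of one 2D triangle's deformation gradient and rebuilds the deformed triangle around its unchanged centroid. It is a minimal sketch under stated assumptions, not the paper's implementation: the names `mul` and `limit_triangle_strain`, the limits `s_min` and `s_max`, and the centroid-preserving update are illustrative choices, and the singular values are taken from the eigen-decomposition of F^T F, which agrees with the SVD for non-degenerate elements. The multi-resolution hierarchy and the edge-angle bending limit are left out.

```python
import math

def mul(A, B):
    """Product of two 2x2 matrices stored as nested tuples of rows."""
    return tuple(tuple(A[i][0] * B[0][j] + A[i][1] * B[1][j] for j in range(2))
                 for i in range(2))

def limit_triangle_strain(rest, cur, s_min=0.9, s_max=1.1):
    """Clamp the principal stretches of one 2D triangle (hypothetical helper).

    rest, cur: three (x, y) tuples each. Returns the corrected copy of cur.
    """
    (X0, X1, X2), (x0, x1, x2) = rest, cur
    # Edge matrices: columns are the two edges leaving vertex 0.
    Dm = ((X1[0] - X0[0], X2[0] - X0[0]), (X1[1] - X0[1], X2[1] - X0[1]))
    Ds = ((x1[0] - x0[0], x2[0] - x0[0]), (x1[1] - x0[1], x2[1] - x0[1]))
    det = Dm[0][0] * Dm[1][1] - Dm[0][1] * Dm[1][0]
    Dm_inv = ((Dm[1][1] / det, -Dm[0][1] / det),
              (-Dm[1][0] / det, Dm[0][0] / det))
    F = mul(Ds, Dm_inv)  # deformation gradient

    # Eigen-decomposition of the symmetric matrix C = F^T F = V S^2 V^T.
    a = F[0][0] ** 2 + F[1][0] ** 2
    b = F[0][0] * F[0][1] + F[1][0] * F[1][1]
    c = F[0][1] ** 2 + F[1][1] ** 2
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    cs, sn = math.cos(theta), math.sin(theta)
    lams = (a * cs * cs + 2 * b * cs * sn + c * sn * sn,
            a * sn * sn - 2 * b * cs * sn + c * cs * cs)

    # Per-direction rescale factor: clamped stretch divided by current stretch.
    k = []
    for lam in lams:
        s = math.sqrt(max(lam, 0.0))
        k.append(min(max(s, s_min), s_max) / s if s > 1e-12 else 1.0)

    # G = V diag(k) V^T, so that F G = U diag(clamped s) V^T.
    off = (k[0] - k[1]) * cs * sn
    G = ((k[0] * cs * cs + k[1] * sn * sn, off),
         (off, k[0] * sn * sn + k[1] * cs * cs))
    E = mul(mul(F, G), Dm)  # corrected deformed edge matrix
    e1, e2 = (E[0][0], E[1][0]), (E[0][1], E[1][1])

    # Re-place the vertices so the triangle's centroid does not move.
    cx = (x0[0] + x1[0] + x2[0]) / 3.0
    cy = (x0[1] + x1[1] + x2[1]) / 3.0
    n0 = (cx - (e1[0] + e2[0]) / 3.0, cy - (e1[1] + e2[1]) / 3.0)
    return (n0, (n0[0] + e1[0], n0[1] + e1[1]), (n0[0] + e2[0], n0[1] + e2[1]))
```

In a full solver a function like this would be applied to each element in turn and the sweep repeated, since every correction disturbs the neighboring elements that share its vertices.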
CFX
-
Explains how physics and IK were fused in Just Cause 2 for skydiving, grappling, and vehicle clinging, reducing animation asset creation while enabling novel emergent gameplay interactions.
Rigging
-
Production techniques for simulating 70 feet of Rapunzel's hair in Tangled using a mass-spring dynamicWires system with art-direction controls.
abstract
This talk describes how Walt Disney Animation Studios simulated the extreme 70 feet of hair for Rapunzel in Tangled using the proprietary mass-spring based hair simulation software dynamicWires. To handle the immense length, a sparse set of around 200 guide curves is simulated, with extra collision support structures added to fill gaps in the hair volume and on-the-fly spring forces applied to colliding segments to preserve volume and act as friction. Features such as effortless dragging via reduced tangential ground friction, per-shot simulation freezing of non-visible hair, and breakaway hair-hair constraints give the artists control over the hair motion while keeping it natural and adhering to the film's art direction.
Related Directing Hair Motion on Tangled · Building a Dynamic Dad · Grooming and Simulation Methods for Different Hair Types | Andriy Bilichenko | Paris HIVE 2023 · Hair, Feathers and Fur | Axis Studios | Character FX & Crowds Production Talks
how to read this
- Category
- Production talk / system breakdown: hair simulation
- Contributions
- Simulating Rapunzel's 70 feet of hair with the proprietary mass-spring dynamicWires system using ~200 sparse guide curves
- Collision support structures plus on-the-fly spring forces on colliding segments to preserve hair volume and act as friction
- Art-direction controls: effortless dragging via reduced tangential ground friction, per-shot freezing of non-visible hair, and breakaway hair-hair constraints
- Context
- A production application of mass-spring hair dynamics, building on Selle et al.'s 'A Mass Spring Model for Hair Simulation'. Builds on: A Mass Spring Model for Hair Simulation
- Correctness
- Studio practice, not peer-reviewed; the approach is production-proven on Tangled, and the sparse-guide plus support-structure design is tuned to the extreme-length case rather than presented as a general validated model.
- Clarity
- Accessible; a first pass conveys the production challenges and the set of tricks, with no heavy formulation to parse.
- How to read it
- Read once for the practical toolbox (guide-curve sparsity, volume-preserving forces, art-direction overrides); no second pass needed unless you want to cross-reference the underlying Selle mass-spring model.
CFX
-
Differential IK formulation providing stable, efficient inverse kinematics for complex character rigs with redundant degrees of freedom.
abstract
A differential inverse kinematics algorithm combining pseudoinverse and Jacobian transpose approaches achieves both convergence speed and stability in singular configurations. Near singular poses, the algorithm adaptively weights toward the Jacobian transpose to avoid the parameter instability of pure pseudoinverse while maintaining faster convergence than damped approaches. Singular value decomposition guides the adaptive blending based on how close the skeleton approaches singular configurations.
Related MoRig: Motion-Aware Rigging of Character Meshes from Point Clouds · Using Deep Learning to Approximate Joint Placement in 3D Bipedal Characters · A.C.M.E. Multilimb System · How the Rig Design Impacts the Animation Process
how to read this
- Category
- Method: a differential inverse-kinematics algorithm
- Contributions
- A differential IK formulation that adaptively blends pseudoinverse and Jacobian-transpose approaches for both convergence speed and stability
- Weighting toward the Jacobian transpose near singular poses to avoid pseudoinverse parameter instability while staying faster than damped methods
- Using singular value decomposition to guide the adaptive blend based on closeness to singular configurations
- Context
- Sits in the classic inverse-kinematics lineage for redundant articulated rigs, blending the well-known pseudoinverse and Jacobian-transpose families and using SVD as the switching signal.
- Correctness
- The central claim is stability through singular configurations via SVD-guided blending; a reader should note the trade-off is between convergence speed and the cost/robustness of the SVD-driven weighting, and the abstract states no specific rig or timing results.
- Clarity
- Moderately technical; a first pass conveys the adaptive-blend idea, but a second pass is needed to follow the SVD-based weighting formulation.
- How to read it
- Focus first on why the pseudoinverse becomes unstable near singularities and how the transpose blend fixes it; do a second pass on the SVD weighting if you intend to implement it.
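The adaptive blend can be illustrated on a planar two-joint arm, where the Jacobian is 2x2 and its singular values have a closed form. This is a hedged sketch rather than the paper's formulation: the abstract does not give the weighting function, so the linear ramp on the smallest singular value, the `threshold`, the step clamps `max_err` and `max_dq`, the transpose step size, and all function names are assumptions made for illustration.

```python
import math

L1, L2 = 1.0, 1.0  # link lengths of a hypothetical planar two-joint arm

def forward(q):
    """End-effector position for joint angles q = (q1, q2)."""
    return (L1 * math.cos(q[0]) + L2 * math.cos(q[0] + q[1]),
            L1 * math.sin(q[0]) + L2 * math.sin(q[0] + q[1]))

def jacobian(q):
    s1, c1 = math.sin(q[0]), math.cos(q[0])
    s12, c12 = math.sin(q[0] + q[1]), math.cos(q[0] + q[1])
    return ((-L1 * s1 - L2 * s12, -L2 * s12),
            (L1 * c1 + L2 * c12, L2 * c12))

def singular_values_2x2(m):
    """Closed-form singular values (largest, smallest) of a 2x2 matrix."""
    (a, b), (c, d) = m
    q = math.hypot((a + d) / 2, (c - b) / 2)
    r = math.hypot((a - d) / 2, (c + b) / 2)
    return q + r, abs(q - r)

def ik_step(q, target, threshold=0.3, max_err=0.05, max_dq=0.2):
    """One blended update; returns (new_q, error_before_step)."""
    px, py = forward(q)
    ex, ey = target[0] - px, target[1] - py
    err = math.hypot(ex, ey)
    if err > max_err:  # take short steps so the linearization stays valid
        ex, ey = ex * max_err / err, ey * max_err / err
    (a, b), (c, d) = jacobian(q)
    _, s_min = singular_values_2x2(((a, b), (c, d)))
    # Assumed blend: w = 1 (pure transpose) at a singularity,
    # w = 0 (pure inverse) once s_min reaches the threshold.
    w = min(1.0, max(0.0, 1.0 - s_min / threshold))

    # Jacobian-transpose step with the usual least-squares step size.
    tx, ty = a * ex + c * ey, b * ex + d * ey   # J^T e
    jx, jy = a * tx + b * ty, c * tx + d * ty   # J J^T e
    denom = jx * jx + jy * jy
    alpha = (ex * jx + ey * jy) / denom if denom > 0.0 else 0.0
    dq1, dq2 = w * alpha * tx, w * alpha * ty

    # Inverse step (the pseudoinverse of a square, non-singular Jacobian).
    det = a * d - b * c
    if w < 1.0 and abs(det) > 1e-12:
        dq1 += (1.0 - w) * (d * ex - b * ey) / det
        dq2 += (1.0 - w) * (-c * ex + a * ey) / det

    n = math.hypot(dq1, dq2)
    if n > max_dq:  # safety clamp on the joint-space step
        dq1, dq2 = dq1 * max_dq / n, dq2 * max_dq / n
    return (q[0] + dq1, q[1] + dq2), err

def solve(q, target, tol=1e-6, max_iters=500):
    for _ in range(max_iters):
        q, err = ik_step(q, target)
        if err < tol:
            break
    return q
```

Starting from the fully stretched pose, for example `solve((0.0, 0.0), (1.0, 1.0))`, exercises the singular case: the first update relies entirely on the transpose term, and the inverse term takes over as the elbow bends and the smallest singular value grows.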
Rigging
-
Data-driven conditional cloth model learned from simulation that enables real-time animation of thousands of garments with approximate collision resolution.
abstract
We present a technique for learning clothing models that enables the simultaneous animation of thousands of detailed garments in real-time. This surprisingly simple conditional model learns and preserves the key dynamic properties of a cloth motion along with folding details. Our approach requires no a priori physical model, but rather treats training data as a "black box." We show that the models learned with our method are stable over large time-steps and can approximately resolve cloth-body collisions. We also show that within a class of methods, no simpler model covers the full range of cloth dynamics captured by ours. Our method bridges the current gap between skinning and physical simulation, combining benefits of speed from the former with dynamic effects from the latter. We demonstrate our approach on a variety of apparel worn by male and female human characters performing a varied set of motions typically used in video games (e.g., walking, running, jumping, etc.).
Related A Pixel-Based Framework for Data-Driven Clothing · Directing Cloth Draping through Blended UVs · Subspace Neural Physics: Fast Data-Driven Interactive Simulation · PBNS: Physically Based Neural Simulation for Unsupervised Garment Pose Space Deformation
how to read this
- Category
- Method: a data-driven (learned) clothing model
- Contributions
- A simple conditional model learned from simulation data that animates thousands of detailed garments in real time
- Stability over large time-steps with approximate cloth-body collision resolution and no a priori physical model
- Bridging skinning and physical simulation, combining skinning speed with simulation-like dynamic folding detail
- Context
- Positions itself between skinning and physical cloth simulation, treating simulation training data as a black box to learn a conditional dynamic model.
- Correctness
- Demonstrated on apparel for male and female human characters doing game-typical motions (walking, running, jumping); collisions are only approximately resolved, and the model is conditioned on its training distribution, so behavior outside that motion class is a fair concern.
- Clarity
- Accessible; the abstract frames the idea plainly, and a first pass conveys the skinning-vs-simulation bridge, with a second pass for the conditional-model details.
- How to read it
- Read for the conceptual placement between skinning and simulation and the stability claims; a second pass pays off if you need the learning setup and how collisions are approximated.
CFX / ML Deformation
-
Panel from 5th Cell, BioWare, and 343 Industries covering mocap integration, blendshapes, shader-based deformation, procedural helpers, and muscle systems in character pipelines.
Rigging / Skinning / Muscles
-
USC ICT and Image Metrics cross the uncanny valley with a fully digital photoreal face built from Light Stage scans.
abstract
The Digital Emily project, a collaboration between Image Metrics and USC Institute for Creative Technologies, built a photorealistic animated digital face from high-resolution scans of actress Emily O'Brien. Her facial shape and reflectance were digitized in 33 expressions using the Light Stage capture system with polarized spherical-gradient illumination, recording diffuse and specular reflectance and surface normals accurate to the level of skin pores and fine wrinkles. The captured expressions were turned into a blendshape facial rig with displacement maps, eyes, and teeth, and animated using a video-based facial-animation system driven by an actor's recorded performance. Rendered with subsurface scattering and image-based lighting, the result aimed to cross the uncanny valley and was widely accepted by viewers as video of a real person.
Related High Fidelity Facial Animation Capture and Retargeting with Contours · Acquiring the Reflectance Field of a Human Face · Facial Performance Synthesis using Deformation-Driven Polynomial Displacement Maps · Facial Performance Enhancement Using Dynamic Shape Space Analysis
how to read this
- Category
- Capture-to-rig pipeline: photoreal digital actor
- Contributions
- A photorealistic animated digital face built from high-resolution Light Stage scans of an actress in 33 expressions
- Capturing diffuse and specular reflectance plus surface normals down to skin pores and fine wrinkles via polarized spherical-gradient illumination
- Turning the scans into a blendshape rig with displacement maps, eyes and teeth, driven by a video-based facial-animation system and rendered with subsurface scattering and image-based lighting
- Context
- An end-to-end integration building on Blanz and Vetter's morphable face model and Debevec et al.'s reflectance-field acquisition, combining them with blendshape rigging and performance-driven animation. Builds on: A Morphable Model for the Synthesis of 3D Faces · Acquiring the Reflectance Field of a Human Face
- Correctness
- A collaboration-driven demonstration aimed at crossing the uncanny valley; the stated outcome is that viewers accepted the result as real video, which is a perceptual/audience claim rather than a controlled quantitative evaluation, and it rests on the specialized Light Stage capture rig.
- Clarity
- Very accessible as a system tech note; a single pass conveys the pipeline end to end, with deeper passes only if you chase the cited capture and morphable-model methods.
- How to read it
- Read once as a pipeline overview connecting capture, rigging and rendering; follow the Debevec reflectance-field and Blanz morphable-model citations if you need the underlying capture and modeling theory.
Facial
- talk The Next Generation of Fighting Games: Physics and Animation in UFC 2009 Undisputed GDC Industrial
Presents full-body IK targeting, physics and animation blending techniques, and character navigation solutions used in UFC 2009 Undisputed for responsive fighter animation.
Rigging
- talk Uncharted 2 Character Pipeline: An In-depth Look at the Creation of U2's Characters GDC Industrial
Naughty Dog details proprietary modeling tools, Maya referencing for skeletal assets, a shared male/female facial rig skeleton, sculpting workflows, and UV seam solutions for Uncharted 2.
Rigging / Facial
-
Attaches a fine-resolution wrinkle mesh to a coarse animated base mesh, solving wrinkle geometry with a static solver at interactive rates.
abstract
This paper presents a simple and fast method for adding wrinkles to dynamic meshes such as simulated cloth or the skin of an animated character. A higher resolution wrinkle mesh is attached to a coarse base mesh, with wrinkle vertices allowed to deviate from their attachment positions within a limited range, and its shape is determined by a static Gauss-Seidel solver that runs in parallel to the base mesh motion. Unlike prior approaches that only add displacements along the surface normal, the method also uses tangential degrees of freedom and a two-phase compression force profile on the base mesh edges to produce more realistic wrinkles. The wrinkle mesh can be tessellated independently of the base mesh, runs in real time on the GPU, and lets the simulation mesh resolution be reduced without losing surface detail.
Related Position Based Dynamics · Vivace: A Practical Gauss-Seidel Method for Stable Soft Body Dynamics · Cloth and Skin Deformation with a Triangle Mesh Based Convolutional Neural Network · Strain Based Dynamics
how to read this
- Category
- Method: a real-time surface-detail (wrinkle) technique
- Contributions
- Attaching a higher-resolution wrinkle mesh to a coarse animated base mesh, with wrinkle vertices allowed to deviate within a limited range
- Solving wrinkle shape with a static Gauss-Seidel solver running in parallel to the base-mesh motion, in real time on the GPU
- Using tangential degrees of freedom and a two-phase compression force profile (not just normal-direction displacement) for more realistic wrinkles, letting base-mesh resolution be reduced
- Context
- A detail-enhancement layer for dynamic cloth and skin in the Position Based Dynamics tradition, extending prior normal-only wrinkle approaches. Builds on: Position Based Dynamics
- Correctness
- Presented as a simple, fast add-on for cloth and animated-character skin; it is a plausible geometric/static solver rather than a full physical wrinkle simulation, and wrinkle motion is constrained to a limited deviation range around the base attachment.
- Clarity
- Accessible; a first pass conveys the attach-and-static-solve idea, and a second pass clarifies the tangential DOF and two-phase compression formulation.
- How to read it
- Focus first on the base/wrinkle decoupling and why a static solver suffices; a second pass on the tangential DOFs and compression-force profile is worth it if you plan to reproduce the look.
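The attach-and-static-solve idea can be shown on a toy 2D chain: wrinkle vertices try to restore their rest edge length when the base is compressed, but may not move farther than a fixed distance from their attachment points. This is only a sketch under stated assumptions. It uses a polyline instead of a triangle mesh, a plain distance constraint instead of the paper's two-phase compression force profile, and a small alternating seed offset so the chain has a direction in which to buckle; `solve_wrinkles`, `rest_len`, `max_dev`, and `sweeps` are hypothetical names.

```python
import math

def solve_wrinkles(attach, rest_len, max_dev, sweeps=100):
    """Static Gauss-Seidel solve for a wrinkle chain (illustrative only).

    attach: list of (x, y) attachment points on the animated base polyline.
    Returns the wrinkle vertex positions for this frame.
    """
    # No velocities or history: every frame starts from the attachments.
    # The +/-0.01 seed is an assumption of this sketch, not taken from the paper.
    pts = [[x, y + (0.01 if i % 2 else -0.01)]
           for i, (x, y) in enumerate(attach)]
    for _ in range(sweeps):
        # Edge-length constraints, processed in order so each one sees the
        # positions already updated by the previous one (Gauss-Seidel).
        for i in range(len(pts) - 1):
            p, q = pts[i], pts[i + 1]
            dx, dy = q[0] - p[0], q[1] - p[1]
            d = math.hypot(dx, dy)
            if d < 1e-12:
                continue
            corr = 0.5 * (d - rest_len) / d  # negative when compressed
            p[0] += corr * dx
            p[1] += corr * dy
            q[0] -= corr * dx
            q[1] -= corr * dy
        # Attachment constraints: stay within max_dev of the attachment point.
        for p, (ax, ay) in zip(pts, attach):
            ox, oy = p[0] - ax, p[1] - ay
            o = math.hypot(ox, oy)
            if o > max_dev:
                p[0] = ax + ox * max_dev / o
                p[1] = ay + oy * max_dev / o
    return [tuple(p) for p in pts]
```

Calling `solve_wrinkles([(0.8 * i, 0.0) for i in range(6)], rest_len=1.0, max_dev=0.4)` models a base chain compressed to 80% of its rest spacing. Because the solve has no state between frames, it can run alongside the base animation without affecting it.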
CFX / Skinning
2009
7-
Unified statistical model of human pose and body shape learned from 550 laser scans, capturing pose-dependent muscle deformations.
abstract
Generation and animation of realistic humans is an essential part of many projects in today's media industry. Especially, the games and special effects industry heavily depend on realistic human animation. In this work a unified model that describes both, human pose and body shape is introduced which allows us to accurately model muscle deformations not only as a function of pose but also dependent on the physique of the subject. Coupled with the model's ability to generate arbitrary human body shapes, it severely simplifies the generation of highly realistic character animations. A learning based approach is trained on approximately 550 full body 3D laser scans taken of 114 subjects. Scan registration is performed using a non‐rigid deformation technique. Then, a rotation invariant encoding of the acquired exemplars permits the computation of a statistical model that simultaneously encodes pose and body shape. Finally, morphing or generating meshes according to several constraints simultaneously can be achieved by training semantically meaningful regressors.
Related NeuroSkinning: Automatic Skin Binding for Production Characters with Deep Graph Networks · Animation Setup Transfer for 3D Characters · Segmentation-Based Skinning · NiLBS: Neural Inverse Linear Blend Skinning
how to read this
- Category
- Method / statistical model of human pose and shape
- Contributions
- A unified statistical model that simultaneously encodes human pose and body shape, capturing muscle deformations as a function of both pose and physique
- A learning-based pipeline using non-rigid registration and a rotation-invariant encoding of exemplars to build the model
- Semantically meaningful regressors that morph or generate meshes under several simultaneous constraints
- Context
- Builds on SCAPE (Anguelov et al. 2005), extending data-driven body modeling to jointly couple pose-dependent and shape-dependent deformation in one model. Builds on: SCAPE: Shape Completion and Animation of People
- Correctness
- Trained on roughly 550 full-body laser scans from 114 subjects, so generalization is bounded by that population and by registration quality; the rotation-invariant encoding is the key modeling assumption a reader should keep in mind.
- Clarity
- Accessible motivation with a technical core; a first pass conveys the unified pose-plus-shape idea, and a second pass covers the encoding and regressor training.
- How to read it
- Read first for what unifying pose and shape buys over SCAPE; a second pass on the rotation-invariant encoding and the regressors is worthwhile if you intend to fit or sample bodies.
Skinning / ML Deformation
-
Synthesizes complex contact-rich motions such as dressing by embedding topological relationships of body segments into motion coordinates.
abstract
In this paper, we propose a new method to efficiently synthesize character motions that involve close contacts such as wearing a T‐shirt, passing the arms through the strings of a knapsack, or piggy‐back carrying an injured person. We introduce the concept of topology coordinates, in which the topological relationships of the segments are embedded into the attributes. As a result, the computation for collision avoidance can be greatly reduced for complex motions that require tangling the segments of the body. Our method can be combinedly used with other prevalent frame‐based optimization techniques such as inverse kinematics.
Related Dog Code: Human to Quadruped Embodiment Using Shared Codebooks · Automated Extraction and Parameterization of Motions in Large Data Sets · Data-Driven Autocompletion for Keyframe Animation · Motion Grammars for Character Animation
how to read this
- Category
- Method: a motion-synthesis representation for contact-rich motion
- Contributions
- Introduces topology coordinates that embed the topological relationships of body segments into motion attributes
- Greatly reduces collision-avoidance computation for tangling, close-contact motions such as dressing or passing arms through straps
- Combines with frame-based optimization techniques such as inverse kinematics
- Context
- No explicit prior works are listed; it sits in the lineage of physically-based and optimization-based character motion synthesis, adding a topological encoding to handle entanglement that geometric collision handling struggles with.
- Correctness
- Demonstrated on close-contact examples like wearing a T-shirt and piggy-back carrying; the approach assumes the relevant interactions can be captured as segment topology relationships, so motions outside that abstraction may not benefit.
- Clarity
- The idea is intuitive once topology coordinates are defined; a first pass conveys the concept, and a second pass covers how the coordinates are computed and coupled with IK.
- How to read it
- Focus on what topology coordinates encode and why they cut collision-avoidance cost for tangled motion; a second pass on the coordinate formulation is worth it if you work on synthesis of contact-heavy motion.
Motion Synthesis
-
Constructs complex, non-penetrating feather geometry for feature animation by deriving a potential field from guide geometry and constraining feathers with an implicit surface displaced from the character skin.
abstract
This paper presents a scheme for constructing complex, non-penetrating feather geometry suitable for feature animation. The method derives a potential field from guide geometry and uses an implicit constraint surface, defined as a displacement from the character skin, so adjacent feathers never intersect. The approach is frame independent and yields visually smooth animation free of popping, demonstrated on a character with several thousand feathers.
Related Rendertime Procedural Feathers Through Blended Guide Meshes · Mesh-Driven Generation and Animation of Groomed Feathers · Apteryx: Procedural Generation, Sculpting and Grooming of Feathers · Animating Puss in Boots' Feather in Shrek 2
how to read this
- Category
- Method: a collision-free feather construction technique
- Contributions
-
- Derives a potential field from guide geometry and uses an implicit constraint surface defined as a displacement from the character skin so adjacent feathers never intersect
- A frame-independent approach yielding visually smooth, pop-free animation, shown on a character with several thousand feathers
- Context
- Builds on Chen et al.'s feather modeling and Bridson et al.'s robust cloth collision handling, and is the published, peer-reviewed companion in spirit to Rijpkema's earlier rendertime feather talk. Builds on: Modeling and Rendering of Realistic Feathers · Robust Treatment of Collisions, Contact and Friction for Cloth Animation
- Correctness
- Demonstrated on a character with several thousand feathers with smooth pop-free results; being frame-independent it sidesteps temporal-coherence solvers, so the main assumption to note is that the implicit constraint surface and potential field adequately capture the desired feather layout.
- Clarity
- Accessible and well-scoped; a first pass conveys the potential-field-plus-constraint-surface idea, and a second pass covers the field construction details.
- How to read it
- Read first for how the displacement-from-skin constraint surface guarantees non-penetration frame by frame; a second pass on the potential-field derivation pays off if you implement grooming, and compare against the Rijpkema talk.
CFX
-
Continuum-based strain limiting for cloth simulation that prevents over-stretching while maintaining simulation stability and physical plausibility.
abstract
We present Continuum‐based Strain Limiting (CSL), a new method for limiting deformations in physically‐based cloth simulations. Despite recent developments for nearly inextensible materials, the efficient simulation of general biphasic textiles and their anisotropic behavior remains challenging. Many approaches use soft materials and enforce limits on edge elongations, leading to discretization‐dependent behavior. Moreover, they offer no explicit control over shearing and stretching unless specifically aligned meshes are used. Based on a continuum deformation measure, our method allows accurate control over all strain components using individual thresholds. We impose deformation limits element‐wise and cast the problem as a 6×6 system of linear equations. CSL can be combined with any cloth simulator and, as a velocity filter, integrates seamlessly into standard collision handling.
Related Efficient Simulation of Inextensible Cloth · Multi-Resolution Isotropic Strain Limiting · Physics-Inspired Upsampling for Cloth Simulation in Games · Adaptive Anisotropic Remeshing for Cloth Simulation
how to read this
- Category
- Method: a strain-limiting technique for cloth
- Contributions
-
- Continuum-based Strain Limiting (CSL), a continuum deformation measure giving accurate control over all strain components with individual thresholds
- Element-wise deformation limits cast as a 6x6 linear system, enabling explicit control of shearing and stretching without specially aligned meshes
- A velocity-filter formulation that combines with any cloth simulator and integrates into standard collision handling
- Context
- Builds on Baraff and Witkin's Large Steps in Cloth Simulation and improves on edge-elongation strain limiting, whose discretization-dependent behavior it aims to remove. Builds on: Large Steps in Cloth Simulation
- Correctness
- The method assumes a continuum deformation measure per element and operates as a velocity filter; it targets general biphasic, anisotropic textiles, so a reader should note that benefits over edge-based limits are clearest where mesh-dependence and shear control matter.
- Clarity
- Accessible to readers with cloth-simulation background; a first pass conveys why continuum strain beats edge limits, and a second pass covers the 6x6 element formulation.
- How to read it
- Focus on the continuum-versus-edge framing and the per-component thresholds; a second pass on the 6x6 system and velocity-filter integration is worthwhile if you maintain a cloth solver.
CFX
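For the strain-limiting entry above, a minimal sketch of the kind of per-element continuum measure it describes, assuming a triangle with 2D rest (material) coordinates. It reports stretch along each material axis plus a shear term and flags which per-component thresholds are exceeded. The function names and threshold values are hypothetical, and the paper's 6×6 velocity-filter solve is not reproduced here.

```python
import math

def triangle_strains(rest_uv, deformed_xyz):
    """Return (stretch_u, stretch_v, shear) for one triangle.

    rest_uv:      three (u, v) rest-state material coordinates
    deformed_xyz: three (x, y, z) deformed positions
    """
    (u0, v0), (u1, v1), (u2, v2) = rest_uv
    # Rest-state edge matrix Dm = [[a, b], [c, d]] (edges as columns).
    a, b = u1 - u0, u2 - u0
    c, d = v1 - v0, v2 - v0
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("degenerate rest triangle")
    inv = ((d / det, -b / det), (-c / det, a / det))
    # Deformed edge vectors (columns of the 3x2 matrix Ds).
    e1 = [deformed_xyz[1][i] - deformed_xyz[0][i] for i in range(3)]
    e2 = [deformed_xyz[2][i] - deformed_xyz[0][i] for i in range(3)]
    # Deformation gradient F = Ds * Dm^-1; its columns are the images
    # of the material u and v axes.
    fu = [e1[i] * inv[0][0] + e2[i] * inv[1][0] for i in range(3)]
    fv = [e1[i] * inv[0][1] + e2[i] * inv[1][1] for i in range(3)]
    stretch_u = math.sqrt(sum(x * x for x in fu))
    stretch_v = math.sqrt(sum(x * x for x in fv))
    shear = sum(x * y for x, y in zip(fu, fv))
    return stretch_u, stretch_v, shear

def exceeded_limits(strains, max_stretch_u, max_stretch_v, max_shear):
    """Flag each strain component against its own (hypothetical) threshold."""
    su, sv, sh = strains
    return {
        "stretch_u": su > max_stretch_u,
        "stretch_v": sv > max_stretch_v,
        "shear": abs(sh) > max_shear,
    }
```

Because the measure is taken in material coordinates rather than along mesh edges, the same thresholds apply regardless of how the triangle happens to be oriented in the mesh, which is the discretization independence the entry highlights.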
-
Hybrid Eulerian/Lagrangian hair simulation that pairs a FLIP-based grid fluid solve for bulk behavior with Lagrangian strand self-collisions, preserving fine strand detail while handling many thousands of interacting hairs.
abstract
This paper presents a hybrid Eulerian/Lagrangian approach for simulating straight hair that captures both bulk volumetric behavior and intricate strand-level detail. Bulk hair interaction and volume preservation are handled efficiently by a FLIP-based incompressible fluid solver operating on a grid, while fine hair-hair contact is resolved with high-resolution Lagrangian self-collisions on a mass/spring strand model. The volumetric solve acts as an effective preconditioner for the geometric self-collision step, allowing many thousands of directly colliding hairs to be simulated faster than fully Lagrangian collision handling alone. The method also supports user-controllable density targeting and a separation condition to control artificial sticking, and is demonstrated on examples including braids and a walking animated character with 10,000 simulated hairs.
Related A Mass Spring Model for Hair Simulation · Anisotropic Elastoplasticity for Cloth, Knit and Hair Frictional Contact · A Reduced Model for Interactive Hairs · Gravity Preloading for Maintaining Hair Shape Using the Simulator as a Closed-Box Function
how to read this
- Category
- Method: a hybrid hair simulation technique
- Contributions
-
- A hybrid Eulerian/Lagrangian approach for straight hair that captures both bulk volumetric behavior and strand-level detail
- A FLIP-based incompressible fluid solver on a grid for bulk interaction and volume preservation, with high-resolution Lagrangian self-collisions on a mass/spring strand model for fine contact
- Uses the volumetric solve as a preconditioner for the geometric self-collision step and adds user-controllable density targeting and a separation condition, demonstrated up to 10,000 hairs
- Context
- Builds on Selle et al.'s mass-spring hair model and borrows incompressible-fluid (FLIP) machinery to make dense hair-hair contact tractable. Builds on: A Mass Spring Model for Hair Simulation
- Correctness
- Demonstrated on braids and a walking character with 10,000 simulated hairs; the method is scoped to straight hair and treats bulk interaction as an incompressible continuum, so curly hair and the continuum approximation's limits are the caveats to keep in mind.
- Clarity
- Conceptually clear if you know fluid solvers and mass-spring strands; a first pass conveys the bulk-plus-detail split, and a second pass covers the FLIP solve and its use as a preconditioner.
- How to read it
- Read first for why splitting bulk (fluid grid) from detail (Lagrangian self-collision) scales to thousands of hairs; a careful second pass on the FLIP preconditioning is worth it if you implement dense hair contact.
CFX
-
Demonstrates semi-procedural techniques for real-time bipedal locomotion, blending keyframe animation with procedural IK to adapt foot placement dynamically on uneven terrain.
Rigging
-
Mesh-based representation of the hair volume that brings hair modeling close to polygonal surface modeling, giving artists direct control over overall shape while strands are traced automatically through the mesh.
abstract
Hair meshes are a new method for modeling hair that brings hair modeling as close as possible to modeling polygonal surfaces, giving artists direct control over the overall shape of the hair. The hair mesh represents the entire hair volume with topological constraints that allow the path of individual hair strands to be automatically and uniquely traced from the scalp through the volume using barycentric coordinates and Catmull-Rom splines. A set of topological operations such as face extrude, layer insert, and edge separate let users create and edit hair meshes while preserving these constraints, and internal vertices are placed automatically via a constrained quadratic minimization solved with conjugate gradients so the artist need only manipulate the outer surface. The approach supports a wide range of realistic hairstyles, integrates with procedural styling and wisp-based techniques, and can be used for real-time hair simulation.
Related Hair Modeling and Simulation by Style · Structure-Aware Hair Capture · Gravity Preloading for Maintaining Hair Shape Using the Simulator as a Closed-Box Function · Scriptable Character FX Solution
how to read this
- Category
- Method: a mesh-based hair modeling representation
- Contributions
-
- A hair mesh representation that brings hair modeling close to polygonal surface modeling, giving artists direct control over overall hair shape
- Topological constraints that let individual strands be traced automatically and uniquely from scalp through the volume using barycentric coordinates and Catmull-Rom splines
- Editing operations (face extrude, layer insert, edge separate) plus automatic internal-vertex placement via constrained quadratic minimization, so artists manipulate only the outer surface
- Context
- A surface-modeling-inspired approach to volumetric hair that integrates with wisp-based and procedural styling techniques and can also be used for real-time hair simulation.
- Correctness
- Demonstrated across a range of realistic hairstyles with constraint-preserving edits; readers should note the claims center on artist control and topology preservation rather than physically validated strand dynamics, and the constrained solve assumes the outer surface adequately determines interior structure.
- Clarity
- Accessible; a first pass conveys the surface-modeling-for-hair idea, and a second pass covers the topological constraints and the quadratic minimization formulation.
- How to read it
- Focus first on the topology rules and the strand-tracing scheme; do a second pass on the constrained quadratic minimization (conjugate gradients) if you intend to implement the interior-vertex solver.
CFX
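For the hair-mesh entry above, a minimal sketch of the strand-tracing idea, assuming the per-layer points of one strand have already been located in the hair mesh (for example by barycentric interpolation within each layer's face). It evaluates a uniform Catmull-Rom spline through those points. `trace_strand` and its arguments are hypothetical names, and duplicating the end points is one common boundary choice rather than something the paper specifies.

```python
def catmull_rom(p0, p1, p2, p3, t):
    """Uniform Catmull-Rom point between p1 (t=0) and p2 (t=1)."""
    t2, t3 = t * t, t * t * t
    return tuple(
        0.5 * (2.0 * b
               + (c - a) * t
               + (2.0 * a - 5.0 * b + 4.0 * c - d) * t2
               + (-a + 3.0 * b - 3.0 * c + d) * t3)
        for a, b, c, d in zip(p0, p1, p2, p3)
    )

def trace_strand(layer_points, samples_per_segment=4):
    """Sample a smooth strand through one point per hair-mesh layer."""
    if len(layer_points) < 2:
        raise ValueError("need at least a root and a tip point")
    # Duplicate the end points so the spline reaches root and tip.
    pts = [layer_points[0]] + list(layer_points) + [layer_points[-1]]
    strand = []
    for i in range(1, len(pts) - 2):
        for s in range(samples_per_segment):
            t = s / samples_per_segment
            strand.append(catmull_rom(pts[i - 1], pts[i], pts[i + 1], pts[i + 2], t))
    # Close the curve exactly at the tip.
    strand.append(tuple(float(c) for c in layer_points[-1]))
    return strand
```

The spline interpolates every layer point, so moving an outer hair-mesh vertex moves the strand through it, which matches the direct-control goal described in the abstract.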
2008
14-
A practical mass-spring formulation able to simulate full heads of individual hairs with stable contact and collision.
abstract
Presents a mass-spring model for simulating individual hair strands on the head with up to one million particles. Introduces an altitude spring formulation to model torsion in hair while preventing tetrahedra collapse, along with implicit linear springs for unconditional stability and strain-limiting approaches. Handles complex hair interactions including sticking, clumping, collisions with objects, and self-collisions.
Related Super-Helices for Predicting the Dynamics of Natural Hair · The Art and Technology of Hair Simulation in Disney's Moana · Detail-Preserving Continuum Simulation of Straight Hair · Gravity Preloading for Maintaining Hair Shape Using the Simulator as a Closed-Box Function
how to read this
- Category
- Method: a mass-spring hair simulation model
- Contributions
-
- A mass-spring formulation that simulates full heads of individual hair strands at large particle counts
- An altitude spring formulation to capture torsion while preventing tetrahedra collapse
- Implicit linear springs for unconditional stability, combined with strain limiting, plus handling of sticking, clumping, object collisions, and self-collisions
- Context
- Builds on implicit cloth-style integration in the spirit of Large Steps in Cloth Simulation (Baraff and Witkin), adapting stable spring dynamics and collision handling to thin filamentary hair. Builds on: Large Steps in Cloth Simulation
- Correctness
- Demonstrated on large-scale individual-strand hair with stable contact and collision; readers should keep in mind that mass-spring rod models approximate bending and twisting and depend on careful stiffness and stability tuning rather than a continuum elastic-rod basis.
- Clarity
- Fairly accessible if you know cloth simulation; a first pass conveys the system, a second pass is needed for the altitude-spring and strain-limiting details.
- How to read it
- Focus first on the altitude spring and stability sections, since those are the core novelty; a second pass on the collision and clumping handling pays off if you simulate dense hair.
CFX
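For the mass-spring hair entry above, a minimal sketch of why an implicit treatment of a linear spring is attractive. It assumes a single 1D mass on a spring and compares backward Euler with explicit Euler at a deliberately large time step. This is a textbook illustration with hypothetical function names, not the paper's altitude-spring or full strand formulation.

```python
def explicit_step(x, v, k, m, dt):
    """Explicit (forward) Euler step for x'' = -(k/m) x."""
    return x + dt * v, v - dt * (k / m) * x

def implicit_step(x, v, k, m, dt):
    """Backward Euler step: solve v1 = v - dt*(k/m)*(x + dt*v1) for v1."""
    w2 = k / m
    v1 = (v - dt * w2 * x) / (1.0 + dt * dt * w2)
    return x + dt * v1, v1

def energy(x, v, k, m):
    """Total spring plus kinetic energy."""
    return 0.5 * k * x * x + 0.5 * m * v * v

def simulate(step, steps, k=1000.0, m=1.0, dt=0.1, x=1.0, v=0.0):
    """Run a stepper and return the final energy."""
    for _ in range(steps):
        x, v = step(x, v, k, m, dt)
    return energy(x, v, k, m)
```

With a stiff spring and a large step, `simulate(explicit_step, 100)` grows without bound while `simulate(implicit_step, 100)` never exceeds the starting energy. The price is numerical damping, which is one reason stiffness and damping still need tuning in practice.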
-
Reconstructs articulated character mesh animation from multi-view silhouette sequences enabling markerless performance capture.
abstract
Details in mesh animations are difficult to generate but they have great impact on visual quality. In this work, we demonstrate a practical software system for capturing such details from multi-view video recordings. Given a stream of synchronized video images that record a human performance from multiple viewpoints and an articulated template of the performer, our system captures the motion of both the skeleton and the shape. The output mesh animation is enhanced with the details observed in the image silhouettes. For example, a performance in casual loose-fitting clothes will generate mesh animations with flowing garment motions. We accomplish this with a fast pose tracking method followed by nonrigid deformation of the template to fit the silhouettes. The entire process takes less than sixteen seconds per frame and requires no markers or texture cues. Captured meshes are in full correspondence making them readily usable for editing operations including texturing, deformation transfer, and deformation model learning.
Related Surface Based Motion Retargeting by Preserving Spatial Relationship · Animation Setup Transfer for 3D Characters · Normalized Euclidean Distance Matrices for Human Motion Retargeting · Robust Marker Trajectory Repair for MOCAP Using Kinematic Reference
how to read this
- Category
- Capture system: markerless articulated performance capture
- Contributions
-
- Reconstructs both skeleton motion and detailed surface shape from multi-view silhouette video using an articulated template
- A fast pose-tracking step followed by nonrigid template deformation to fit observed silhouettes, capturing details like flowing clothing
- Produces meshes in full correspondence, ready for texturing, deformation transfer, and deformation-model learning
- Context
- Fits in the markerless multi-view performance-capture line of work, using an articulated template plus silhouette fitting rather than markers or texture cues.
- Correctness
- Demonstrated on human performances including loose clothing, at under sixteen seconds per frame and with no markers or texture cues; a reader should note silhouette-only cues can be ambiguous for concavities and surface detail away from the contour, and quality depends on the template and camera coverage.
- Clarity
- Reads as a practical system paper; a first pass conveys the pipeline, a second pass clarifies the pose-tracking and nonrigid-fitting stages.
- How to read it
- Read it as a pipeline: focus on how pose tracking feeds the nonrigid silhouette fit, and what correspondence buys you downstream; one careful pass usually suffices unless you plan to reimplement the fitting.
Retargeting
-
Data-driven model of skin and muscle deformation built from high-density marker captures of real human motion, separating static pose-dependent shape from dynamic inertial effects.
abstract
This paper presents a data-driven technique for synthesizing realistic skin and muscle deformation by separating static pose-dependent deformations from dynamic inertial effects. Models are built from high-density marker data and can drive skin deformation from skeletal motion captured with fewer markers, using PCA and spring-damper equations to capture jiggling and muscle dynamics.
Related Capture and Statistical Modeling of Arm-Muscle Deformations · Anatomically Based Modeling · How to Build a Human: Practical Physics-Based Character Animation · A Neural Network Model for Efficient Musculoskeletal-Driven Skin Deformation
how to read this
- Category
- Method: data-driven skin and muscle deformation model
- Contributions
-
- Synthesizes realistic skin and muscle deformation by separating static pose-dependent shape from dynamic inertial effects
- Builds models from high-density marker data using PCA for shape and spring-damper equations for jiggle and muscle dynamics
- Drives skin deformation from skeletal motion captured with far fewer markers
- Context
- Extends the authors' earlier work on capturing and animating skin deformation in human motion, adding an explicit dynamic (inertial) component on top of pose-dependent deformation. Builds on: Capturing and Animating Skin Deformation in Human Motion
- Correctness
- Validated as a data-driven model learned from dense marker captures of real motion; readers should keep in mind it is limited to the captured subjects and motion range, and the linear PCA plus spring-damper decomposition is an approximation of true soft-tissue behavior.
- Clarity
- Accessible; a first pass conveys the static-versus-dynamic split, a second pass clarifies the PCA and spring-damper formulation.
- How to read it
- Focus on the separation of pose-dependent from inertial deformation and how the sparse-marker drive works; a second pass on the dynamics model pays off if you want to reproduce jiggle.
Skinning / Muscles
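For the skin-and-muscle entry above, a minimal sketch of the static-versus-dynamic split, assuming a single scalar coefficient (for instance one PCA weight) whose pose-dependent target is already known per frame. A spring-damper makes the simulated value lag and overshoot the target, which reads as jiggle. The function name, parameters, and integrator are assumptions for illustration, not the paper's equations.

```python
def jiggle(targets, stiffness, damping, dt, x0=0.0, v0=0.0):
    """Follow per-frame static targets with a damped spring.

    targets:   static, pose-dependent value for each frame
    stiffness: spring constant pulling the value toward its target
    damping:   velocity damping; lower values give more overshoot
    Returns the dynamic value for each frame.
    """
    x, v = x0, v0
    out = []
    for target in targets:
        # Semi-implicit (symplectic) Euler: update velocity, then position.
        v += dt * (stiffness * (target - x) - damping * v)
        x += dt * v
        out.append(x)
    return out
```

Feeding a step change in the target, for example `jiggle([1.0] * 2000, 100.0, 10.0, 0.01)`, produces an overshoot followed by settling, which is the inertial layer added on top of the pose-dependent shape.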
-
Discrete differential geometry formulation for elastic rod simulation capturing bending and twisting for hair, cables, and filamentary structures.
abstract
We present a discrete treatment of adapted framed curves, parallel transport, and holonomy, thus establishing the language for a discrete geometric model of thin flexible rods with arbitrary cross section and undeformed configuration. Our approach differs from existing simulation techniques in the graphics and mechanics literature both in the kinematic description (we represent the material frame by its angular deviation from the natural Bishop frame) as well as in the dynamical treatment (we treat the centerline as dynamic and the material frame as quasistatic). Additionally, we describe a manifold projection method for coupling rods to rigid bodies and simultaneously enforcing rod inextensibility. The use of quasistatics and constraints provides an efficient treatment for stiff twisting and stretching modes; at the same time, we retain the dynamic bending of the centerline and accurately reproduce the coupling between bending and twisting modes. We validate the discrete rod model via quantitative buckling, stability, and coupled-mode experiments, and via qualitative knot-tying comparisons.
Related Simulation-Ready Hair Capture · Rest Shape Optimization for Sag-Free Discrete Elastic Rods · Efficient and Stable Approach to Elasticity and Collisions for Hair Animation · Efficient Simulation of Inextensible Cloth
how to read this
- Category
- Method: a discrete elastic rod simulation model
- Contributions
-
- A discrete differential geometry treatment of framed curves, parallel transport, and holonomy for thin elastic rods
- Represents the material frame as angular deviation from the Bishop frame, treating the centerline as dynamic and the material frame as quasistatic
- A manifold projection method for rod-rigid-body coupling and inextensibility, validated by buckling, stability, and coupled-mode experiments
- Context
- Advances the predictive-hair-and-rod line that includes Super-Helices (Bertails et al.), reformulating rod mechanics in the language of discrete differential geometry. Builds on: Super-Helices for Predicting the Dynamics of Natural Hair
- Correctness
- Validated quantitatively against buckling and stability experiments and qualitatively via knot tying; the dynamic-centerline plus quasistatic-frame choice is an efficiency trade-off that handles stiff twist and stretch by assumption rather than fully dynamic torsion.
- Clarity
- Mathematically demanding; a first pass conveys the kinematic and dynamic choices, but the discrete geometry needs a careful second and likely third pass.
- How to read it
- Get the high-level model choices (Bishop frame, dynamic centerline, quasistatic material frame) on the first pass, then invest in second and third passes on the discrete geometry and constraint projection if implementing.
CFX
- Facial Performance Synthesis using Deformation-Driven Polynomial Displacement Maps · SIGGRAPH Asia · Academic · 160 cites
Learns polynomial maps from mocap markers to high-res face geometry for wrinkle and pore detail synthesis.
abstract
The paper presents a method for modeling and synthesizing realistic facial deformations using polynomial displacement maps (PDMs) that encode the relationship between sparse motion capture markers and high-resolution facial geometry. A real-time 3D scanning system captures facial performances at wrinkle and pore detail levels. The deformation-driven PDMs represent medium-scale and fine-scale facial displacements as functions of motion capture marker positions, enabling synthesis of novel performances with realistic wrinkles and skin detail. The approach is demonstrated on multiple subjects and expressions, showing the ability to generate detailed facial geometry from coarse motion capture data.
Related BlendForces: A Dynamic Framework for Facial Animation · The Digital Emily Project: Achieving a Photorealistic Digital Actor · 3D Morphable Face Models: Past, Present and Future · Creating an Actor-Specific Facial Rig from Performance Capture
how to read this
- Category
- Method: data-driven facial detail synthesis
- Contributions
-
- Polynomial displacement maps (PDMs) that encode the mapping from sparse motion-capture markers to high-resolution face geometry
- A real-time 3D scanning setup that captures facial performance down to wrinkle and pore detail
- Synthesis of novel performances with realistic medium- and fine-scale skin detail from coarse marker input
- Context
- Builds on high-resolution facial capture in the lineage of Acquiring the Reflectance Field of a Human Face (Debevec et al.), turning captured detail into a learned marker-to-geometry deformation model. Builds on: Acquiring the Reflectance Field of a Human Face
- Correctness
- Demonstrated on multiple subjects and expressions, generating detailed geometry from coarse markers; readers should note the polynomial map is fit per-subject from captured data, so extrapolation beyond the captured expression range and transfer across subjects are limitations.
- Clarity
- Accessible; a first pass conveys the marker-driven detail idea, a second pass clarifies the polynomial map fitting.
- How to read it
- Focus on how marker motion parameterizes the displacement maps and the multi-scale (medium and fine) split; a second pass on the PDM fitting pays off if you want to reproduce wrinkle synthesis.
Facial
-
Dual quaternion blending for skeletal skinning that avoids the candy-wrapper artifact of LBS at a relatively minor runtime cost.
abstract
Skinning of skeletally deformable models is extensively used for real-time animation of characters, creatures and similar objects. The standard solution, linear blend skinning, has some serious drawbacks that require artist intervention. Therefore, a number of alternatives have been proposed in recent years. All of them successfully combat some of the artifacts, but none challenge the simplicity and efficiency of linear blend skinning. As a result, linear blend skinning is still the number one choice for the majority of developers. In this article, we present a novel skinning algorithm based on linear combination of dual quaternions. Even though our proposed method is approximate, it does not exhibit any of the artifacts inherent in previous methods and still permits an efficient GPU implementation. Upgrading an existing animation system from linear to dual quaternion skinning is very easy and has a relatively minor impact on runtime performance.
Related Skinning with Dual Quaternions · Stretchable and Twistable Bones for Skeletal Shape Deformation · Skinning: Real-time Shape Deformation · Elasticity-Inspired Deformers for Character Articulation
how to read this
- Category
- Method: a skeletal skinning algorithm (journal version)
- Contributions
-
- Skinning via a linear combination of dual quaternions that removes the candy-wrapper and related LBS artifacts
- An approximate blend that still permits an efficient GPU implementation
- Easy upgrade path from linear blend skinning with minor runtime cost
- Context
- Is the extended journal treatment of the authors' earlier Skinning with Dual Quaternions, refining the approximate dual-quaternion blending approach. Builds on: Skinning with Dual Quaternions
- Correctness
- Demonstrated as a real-time skinning method that is explicitly approximate yet avoids the artifacts of the earlier alternatives; readers should keep in mind it blends rigid transforms only and does not model soft-tissue dynamics or muscle bulging.
- Clarity
- Accessible with clear motivation; a first pass conveys the result, a second pass is needed to follow the dual-quaternion derivation and the approximation analysis.
- How to read it
- If you only read one of the two dual-quaternion papers, read this fuller version; first pass for the idea and figures, second pass on the blend formulation for implementation.
Skinning
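For the dual-quaternion skinning entry above, a minimal CPU sketch of the blend, assuming each bone is given as a unit rotation quaternion `(w, x, y, z)` plus a translation. It covers the commonly described steps: build a dual quaternion per bone, flip signs so all rotations lie in the same hemisphere, take the weighted sum, normalize by the real part, and transform the vertex. Function names are hypothetical, and a production version would run on the GPU as the abstract notes.

```python
import math

def qmul(a, b):
    """Hamilton product of quaternions stored as (w, x, y, z)."""
    return (a[0]*b[0] - a[1]*b[1] - a[2]*b[2] - a[3]*b[3],
            a[0]*b[1] + a[1]*b[0] + a[2]*b[3] - a[3]*b[2],
            a[0]*b[2] - a[1]*b[3] + a[2]*b[0] + a[3]*b[1],
            a[0]*b[3] + a[1]*b[2] - a[2]*b[1] + a[3]*b[0])

def qconj(q):
    return (q[0], -q[1], -q[2], -q[3])

def make_dq(rotation, translation):
    """Dual quaternion (real, dual) for 'rotate, then translate'."""
    t = (0.0,) + tuple(translation)
    dual = tuple(0.5 * c for c in qmul(t, rotation))
    return (tuple(rotation), dual)

def blend_dq(dqs, weights):
    """Weighted sum of dual quaternions, normalized by the real part."""
    pivot = dqs[0][0]
    real, dual = [0.0] * 4, [0.0] * 4
    for (r, d), w in zip(dqs, weights):
        # q and -q encode the same rotation; keep all in one hemisphere.
        if sum(x * y for x, y in zip(r, pivot)) < 0.0:
            w = -w
        for i in range(4):
            real[i] += w * r[i]
            dual[i] += w * d[i]
    n = math.sqrt(sum(c * c for c in real))
    return (tuple(c / n for c in real), tuple(c / n for c in dual))

def transform_point(dq, p):
    """Apply the rigid transform encoded by a normalized dual quaternion."""
    r, d = dq
    rotated = qmul(qmul(r, (0.0,) + tuple(p)), qconj(r))
    t = qmul(d, qconj(r))  # vector part holds half the translation
    return tuple(rotated[i + 1] + 2.0 * t[i + 1] for i in range(3))
```

Blending an identity bone with a 90-degree twist at equal weights yields a 45-degree rotation of the vertex with its distance from the axis preserved, whereas averaging the two transformed positions linearly would pull the vertex toward the axis.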
-
Derives shape-preserving cage coordinates from Green's integral identity, inducing quasi-conformal mappings for realistic mesh deformation.
abstract
We introduce Green Coordinates for closed polyhedral cages. The coordinates are motivated by Green's third integral identity and respect both the vertices position and faces orientation of the cage. We show that Green Coordinates lead to space deformations with a shape-preserving property. In particular, in 2D they induce conformal mappings, and extend naturally to quasi-conformal mappings in 3D. In both cases we derive closed-form expressions for the coordinates, yielding a simple and fast algorithm for cage-based space deformation. We compare the performance of Green Coordinates with those of Mean Value Coordinates and Harmonic Coordinates and show that the advantage of the shape-preserving property is not achieved at the expense of speed or simplicity. We also show that the new coordinates extend the mapping in a natural analytic manner to the exterior of the cage, allowing the employment of partial cages.
Related Green Coordinates for Triquad Cages in 3D · Harmonic Coordinates for Character Articulation · Mean Value Coordinates for Closed Triangular Meshes · Biharmonic Coordinates
how to read this
- Category
- Method: shape-preserving cage-based deformation coordinates
- Contributions
-
- Green Coordinates for closed polyhedral cages, derived from Green's third integral identity, respecting both cage vertex positions and face orientations
- A shape-preserving property yielding conformal mappings in 2D and quasi-conformal mappings in 3D, with closed-form coordinate expressions
- Natural analytic extension of the deformation to the cage exterior, enabling partial cages
- Context
- Advances cage-based space deformation beyond Mean Value Coordinates (Ju et al.) and Harmonic Coordinates by adding a shape-preserving (quasi-conformal) guarantee. Builds on: Mean Value Coordinates for Closed Triangular Meshes
- Correctness
- Compared against Mean Value and Harmonic Coordinates and shown to gain shape preservation without sacrificing speed or simplicity; readers should note the coordinates depend on face orientation and that quasi-conformality is a property of the mapping, not a guarantee against all cage-design artifacts.
- Clarity
- Mathematically grounded but well-motivated; a first pass conveys the shape-preserving advantage, a second pass clarifies the derivation from Green's identity.
- How to read it
- Read the comparison figures against MVC and Harmonic Coordinates first to grasp the benefit, then a second pass on the derivation and closed-form expressions if implementing cage deformation.
Skinning
-
Strand-based musculotendon simulation that generates tendon and muscle motion under the skin of a keyframed or motion-captured character, demonstrated on the human hand and forearm.
abstract
This paper presents an automatic technique for generating the motion of tendons and muscles under the skin of a traditionally animated character. The method integrates a standard keyframe or motion capture animation pipeline with a biomechanical simulator that uses rigid bodies for bones and spline-based strands for tendons and muscles, supporting complex routing constraints such as sliding and surface constraints. An incremental controller solves a constrained quadratic optimization to compute the muscle activation levels required to track the input skeletal animation, and the resulting subcutaneous strand motion is skinned to the character surface as a post-process. The approach is demonstrated on animations of the human hand and forearm, where the model contains 54 musculotendons and 17 bones, capturing tendon deformations on the back of the hand and thumb.
Related A Neural Network Model for Efficient Musculoskeletal-Driven Skin Deformation · Pose-Space Subspace Dynamics · Enriching Facial Blendshape Rigs with Physical Simulation · Simulation of Hand Anatomy Using Medical Imaging
how to read this
- Category
- Method: biomechanical musculotendon simulation for animation
- Contributions
-
- Generates tendon and muscle motion under the skin of a traditionally animated character, integrating keyframe or mocap pipelines with a biomechanical simulator
- Models bones as rigid bodies and tendons and muscles as spline-based strands with sliding and surface routing constraints
- An incremental controller solving constrained quadratic optimization to find muscle activations that track the input skeletal animation, then skins the strand motion to the surface
- Context
- Builds on anatomical muscle simulation such as Creating and Simulating Skeletal Muscle from the Visible Human Data Set (Teran et al.), but uses strand-based musculotendons driven to track an artist's animation. Builds on: Creating and Simulating Skeletal Muscle from the Visible Human Data Set
- Correctness
- Demonstrated on a hand and forearm model with 54 musculotendons and 17 bones, capturing tendon deformation on the back of the hand and thumb; readers should keep in mind it targets subcutaneous strand and tendon motion as a post-process layer, and results are shown on this anatomical region rather than as a general full-body solver.
- Clarity
- Moderately technical; a first pass conveys the track-an-animation idea, a second pass clarifies the strand constraints and activation optimization.
- How to read it
- Focus on the strand routing constraints and the activation-tracking optimization, since those are the core; a second pass pays off if you want anatomically plausible tendon motion in a rig.
Muscles
- Real-Time Motion Retargeting to Highly Varied User-Created Morphologies · SIGGRAPH · Industrial · 210 cites
Real-time motion retargeting onto player-created characters whose skeleton morphology is unknown at authoring time, letting widely varied creatures share authored animation.
abstract
This paper presents a system for animating characters whose skeleton morphologies are unknown when the animation is authored, developed for the game Spore where players create their own creatures. Animators author motion in a tool called Spasm, attaching semantic information (contexts and movement modes) so that keyframed motion is recorded in a generalized, character-independent form. At runtime the generalized data is specialized onto specific player-created characters to produce pose goals that are fed to a robust Particle IK solver, with stylized locomotion synthesized for arbitrary leg configurations and passive secondary animation added for appeal. The IK solver treats the skeleton as particles linked by length constraints and uses a two-phase spine-then-limb solve tuned for natural poses under conflicting goals.
Related Normalized Euclidean Distance Matrices for Human Motion Retargeting · Dog Code: Human to Quadruped Embodiment Using Shared Codebooks · Motion Retargeting for Crowd Simulation · Motion Warping
how to read this
- Category
- Method: real-time motion retargeting to unknown morphologies
- Contributions
-
- Animates characters whose skeleton morphology is unknown at authoring time, as needed for player-created creatures
- An authoring tool where animators attach semantic information (contexts and movement modes) to record motion in a character-independent form
- Runtime specialization onto arbitrary skeletons via a robust particle IK solver, with synthesized stylized locomotion for arbitrary leg configurations and passive secondary animation
- Context
- Extends the motion-retargeting line begun by Retargeting Motion to New Characters (Gleicher) to the extreme case of arbitrary, user-authored morphologies in a shipped game. Builds on: Retargeting Motion to New Characters
- Correctness
- Presented as a shipping production system for highly varied creatures, with a two-phase spine-then-limb particle IK solve tuned for natural poses under conflicting goals; readers should keep in mind it favors robustness and appeal over biomechanical accuracy, and trade-offs are tuned to the game's needs rather than measured against ground truth.
- Clarity
- Very readable and practical; a first pass conveys the generalize-then-specialize approach and a single careful pass largely suffices.
- How to read it
- Read it as a systems-design case study: focus on the semantic authoring representation and the particle IK two-phase solve; a second pass is worth it for the IK tuning if you build retargeting for variable rigs.
Retargeting
-
Production talk on constructing body feathers for feature animation by blending up to three input feather models and placing the instances with a potential field and an implicit constraint surface.
abstract
This talk presents a scheme for constructing complex feather geometry suitable for feature animation. The system takes up to three topologically identical feather model inputs from which an entire set of body feathers can be created, with each instanced feather being a blend between these models. A potential field derived from guide geometry together with an implicit constraint surface is used to produce nonpenetrating feathers that lie naturally across the body.
Related Feathers for Mystical Creatures: Pegasus · Mesh-Driven Generation and Animation of Groomed Feathers · Apteryx: Procedural Generation, Sculpting and Grooming of Feathers · Animating Puss in Boots' Feather in Shrek 2
how to read this
- Category
- Production talk: a procedural feather construction method
- Contributions
-
- A scheme for constructing complex feather geometry suitable for feature animation
- Blends each instanced feather between up to three topologically identical input models
- Uses a potential field from guide geometry plus an implicit constraint surface for nonpenetrating feathers that lie naturally across the body
- Context
- Relates to procedural grooming and instancing for feathered creatures, and prefigures later implicit-constraint feather work such as Weber and Gornowicz's collision-free feather construction.
- Correctness
- Studio practice rather than peer-reviewed; the approach is production-proven on feature work but the talk reports a system design rather than quantitative validation, so a reader should treat coverage and edge cases as illustrative.
- Clarity
- Accessible at the conceptual level; a single first pass conveys the blend-plus-potential-field idea, with details left to the implementer.
- How to read it
- Read once for the high-level pipeline (three guide models, blending, potential field, constraint surface); a second pass adds little since the talk omits formulas, so pair it with Weber and Gornowicz's feather paper if you need the math.
CFX
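The blending step described in the feather talk above can be pictured with a minimal sketch. It assumes a plain linear per-vertex blend with normalized weights; the talk only says each instanced feather is "a blend between these models", so the blend formula, the function name `blend_feather`, and the data layout are illustrative assumptions rather than the studio's implementation. The potential field and constraint surface are not modeled here.

```python
def blend_feather(models, weights):
    """Blend up to three topologically identical feather models vertex by vertex.

    models:  list of 1 to 3 vertex lists; each vertex is an (x, y, z) tuple
    weights: one non-negative weight per model (normalized internally)
    Assumption: a linear blend; the talk does not state the exact formula.
    """
    if not 1 <= len(models) <= 3:
        raise ValueError("expected between one and three input models")
    if len(weights) != len(models):
        raise ValueError("need exactly one weight per model")
    vertex_count = len(models[0])
    if any(len(model) != vertex_count for model in models):
        raise ValueError("models must share the same vertex count")
    total = float(sum(weights))
    if total <= 0.0:
        raise ValueError("weights must sum to a positive value")
    normalized = [w / total for w in weights]

    blended = []
    for i in range(vertex_count):
        # Weighted average of the i-th vertex across all input models.
        blended.append(tuple(
            sum(w * model[i][axis] for w, model in zip(normalized, models))
            for axis in range(3)
        ))
    return blended
```

Because the inputs share topology, vertex `i` means the same thing in every model, which is what makes a per-vertex average meaningful.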
-
Introduces reusable per-joint cage deformation templates that transfer skinning across characters with similar skeletons.
abstract
This paper introduces skinning templates, which define common skeleton-driven deformation behaviors for common joint and bone types so that skinning solutions can be shared and reused across different characters. The templates are implemented using cage-based deformations, where the skeleton drives the cage vertices through example-based deformation functions that preserve radial distance to avoid linear-blend artifacts such as pinching and collapse, and the cage in turn smoothly deforms the embedded geometry via Positive Mean Value Coordinates. A semi-automatic fitting procedure adapts templates to varying geometries while preserving deformation intent, enabling effects like muscle bulging and pinch-free elbow bending. Templates can be swapped interactively during an animation preview cycle, allowing rapid exploration of alternate skinning styles.
Related Harmonic Coordinates for Character Articulation · Geodesic Voxel Binding for Production Character Meshes · Automatic Rigging and Animation of 3D Characters · Stretchable and Twistable Bones for Skeletal Shape Deformation
how to read this
- Category
- Method: a reusable cage-based skinning technique
- Contributions
- Introduces skinning templates that define skeleton-driven deformation behaviors per joint and bone type for reuse across characters
- Implements templates with cage-based, example-driven deformations that preserve radial distance to avoid linear-blend pinching and collapse, propagated to geometry via Positive Mean Value Coordinates
- Provides a semi-automatic fitting procedure and interactive template swapping for rapid skinning-style exploration
- Context
- Builds on Mean Value Coordinates for Closed Triangular Meshes (Ju et al. 2005) and addresses the classic linear-blend skinning artifacts by transferring deformation intent rather than weights. Builds on: Mean Value Coordinates for Closed Triangular Meshes
- Correctness
- Validated through demonstrated effects such as muscle bulging and pinch-free elbow bending; key assumptions are that target characters share similar skeleton structure and that templates capture intent, so behavior on dissimilar topologies or extreme poses warrants caution.
- Clarity
- Accessible; a first pass conveys the template-and-cage idea, and a second pass covers the radial-distance deformation functions and the fitting procedure.
- How to read it
- Focus first on what a template is and why cages plus Positive MVC avoid pinching; a second pass on the example-based deformation functions and fitting is worth it if you plan to implement reuse.
Skinning / Rigging
-
Fail-safe collision response that cancels impact but preserves sliding, incorporating Coulomb friction approximation for cloth self-collision at scale.
abstract
Robust treatment of complex collisions is a challenging problem in cloth simulation. Some state of the art methods resolve collisions iteratively, invoking a fail-safe when a bound on iteration count is exceeded. The best-known fail-safe rigidifies the contact region, causing simulation artifacts. We present a fail-safe that cancels impact but not sliding motion, considerably reducing artificial dissipation. We equip the proposed fail-safe with an approximation of Coulomb friction, allowing finer control of sliding dissipation.
Related A Safe and Fast Repulsion Method for GPU-based Cloth Self Collisions · Dynamic Deformables: Implementation and Production Practicalities · Robust Treatment of Collisions, Contact and Friction for Cloth Animation · Frictional Contact on Smooth Elastic Solids
how to read this
- Category
- Method: a collision-response fail-safe for cloth
- Contributions
- A fail-safe collision response that cancels impact but preserves sliding motion, reducing artificial dissipation versus rigidifying the contact region
- An approximation of Coulomb friction integrated into the fail-safe for finer control of sliding dissipation
- Context
- Builds on Bridson et al.'s robust treatment of collisions, contact and friction for cloth, targeting the artifacts of the standard rigidification fail-safe used when iterative resolution exceeds its iteration bound. Builds on: Robust Treatment of Collisions, Contact and Friction for Cloth Animation
- Correctness
- Demonstrated on cloth self-collision at scale where iterative methods hit their iteration cap; the friction model is an approximation, so a reader should treat it as a robustness-oriented heuristic rather than an exact contact-mechanics solution.
- Clarity
- Fairly accessible if you know iterative cloth collision handling; a first pass conveys the impact-versus-sliding distinction, a second pass for the response formulation.
- How to read it
- Read with Bridson 2002 in mind; focus on why rigidification dissipates motion and how cancelling impact while keeping sliding fixes it, then a second pass for the friction approximation if you implement collision response.
CFX
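The impact-versus-sliding distinction in the fail-safe entry above can be illustrated on a single contact. This is a minimal sketch under strong assumptions: one particle against a fixed obstacle, a unit contact normal, and a Coulomb-style clamp on the tangential velocity. The paper's fail-safe operates on whole contact regions, so the function name `cancel_impact_keep_sliding` and the parameter `mu` are hypothetical and this is not the authors' formulation.

```python
import math


def cancel_impact_keep_sliding(velocity, normal, mu):
    """Post-response velocity of a particle hitting a fixed obstacle.

    velocity: particle velocity relative to the obstacle, (x, y, z)
    normal:   unit contact normal pointing away from the obstacle
    mu:       friction coefficient (hypothetical parameter for this sketch)
    """
    v_n = sum(v * n for v, n in zip(velocity, normal))
    if v_n >= 0.0:
        # Already separating: nothing to cancel.
        return tuple(velocity)

    # Split off the sliding (tangential) part of the motion.
    tangential = [v - v_n * n for v, n in zip(velocity, normal)]
    t_len = math.sqrt(sum(t * t for t in tangential))
    if t_len == 0.0:
        return (0.0, 0.0, 0.0)

    # Cancelling the impact changes the normal speed by -v_n; friction may
    # remove at most mu times that amount from the sliding speed.
    impact_change = -v_n
    scale = max(0.0, 1.0 - mu * impact_change / t_len)
    return tuple(scale * t for t in tangential)
```

With `mu = 0` the approaching normal component is removed and the sliding motion is left untouched; raising `mu` dissipates sliding gradually until the contact sticks.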
-
Yarn-level cloth simulation capturing the mechanical behavior of knitted fabrics through individual yarn dynamics and contact.
abstract
Knitted fabric is widely used in clothing because of its unique and stretchy behavior, which is fundamentally different from the behavior of woven cloth. The properties of knits come from the nonlinear, three-dimensional kinematics of long, inter-looping yarns, and despite significant advances in cloth animation we still do not know how to simulate knitted fabric faithfully. Existing cloth simulators mainly adopt elastic-sheet mechanical models inspired by woven materials, focusing less on the model itself than on important simulation challenges such as efficiency, stability, and robustness. We define a new computational model for knits in terms of the motion of yarns, rather than the motion of a sheet. Each yarn is modeled as an inextensible, yet otherwise flexible, B-spline tube. To simulate complex knitted garments, we propose an implicit-explicit integrator, with yarn inextensibility constraints imposed using efficient projections. Friction among yarns is approximated using rigid-body velocity filters, and key yarn-yarn interactions are mediated by stiff penalty forces. Our results show that this simple model predicts the key mechanical properties of different knits, as demonstrated by qualitative comparisons to observed deformations of actual samples in the laboratory, and that the simulator can scale up to substantial animations with complex dynamic motion.
Related Untangling Cloth · Simulating Cloth Using Bilinear Elements · Yarn-Level Simulation of Woven Cloth · Discrete Shells
how to read this
- Category
- Method: a yarn-level knitted-cloth simulator
- Contributions
- A computational model of knits in terms of yarn motion rather than a deforming sheet, with each yarn an inextensible flexible B-spline tube
- An implicit-explicit integrator with inextensibility enforced by efficient projections, plus rigid-body velocity filters for inter-yarn friction and stiff penalty forces for key yarn-yarn interactions
- Demonstration that this model predicts key mechanical properties across different knit patterns
- Context
- Departs from the elastic-sheet, woven-inspired cloth models that dominate the field, modeling the nonlinear three-dimensional kinematics of inter-looping yarns directly.
- Correctness
- The model is shown to reproduce key mechanical properties of several knits; the friction treatment is a velocity-filter approximation and yarns are inextensible-but-flexible idealizations, so it targets fidelity over the efficiency and scale that sheet models optimize.
- Clarity
- Conceptually clear but technically dense; a first pass conveys the yarn-not-sheet thesis, a second pass for the integrator and constraint projections.
- How to read it
- Read first for the central reframing (simulate yarns, not a sheet) and why knits need it; reserve a careful second pass for the implicit-explicit integrator and inextensibility projections.
CFX
-
The journal account of Orvalho's reusable facial rigging: a generic labelled rig (controls, skeleton, muscles) is transferred to arbitrary target faces by correspondence, so one authored rig and its animations drive many characters.
abstract
The journal version of Orvalho's reusable facial rigging work. A generic, labelled facial rig, including controls, a skeleton and muscle structure, is transferred to arbitrary target face models by establishing correspondence between landmark features, so a single authored rig and its animations can drive many different characters' faces with adapted anatomy.
Related Reusable Facial Rigging and Animation: Create Once, Use Many · Transferring Facial Expressions to Different Face Models · Smooth Contact-Aware Facial Blendshapes Transfer · Facial Retargeting with Automatic Range of Motion Alignment
how to read this
- Category
- Method: a facial rig transfer technique
- Contributions
- Transfers a generic labelled facial rig (controls, skeleton, muscle structure) to arbitrary target face models
- Establishes correspondence between landmark features so one authored rig and its animations drive many faces with adapted anatomy
- Context
- Serves as the journal account of Orvalho's Reusable Facial Rigging (2007) and draws on Sumner and Popovic's Deformation Transfer for Triangle Meshes for moving rig structure onto new geometry. Builds on: Reusable Facial Rigging and Animation: Create Once, Use Many · Deformation Transfer for Triangle Meshes
- Correctness
- The approach assumes reliable landmark correspondence and broadly humanoid target faces; it is demonstrated as a create-once-use-many pipeline, so transfer quality on faces far from the source anatomy or with sparse landmarks is the limitation to watch.
- Clarity
- Accessible as a journal write-up; a first pass conveys the correspondence-and-transfer idea, a second pass for the landmarking and adaptation details.
- How to read it
- Focus on how the generic rig is labelled and how landmark correspondence drives transfer; read Deformation Transfer alongside it, and do a second pass only if you are building a cross-character rig pipeline.
Facial / Retargeting
2007
9-
Automatic skeleton embedding and skinning weight computation for arbitrary 3D meshes, enabling one-click rigging of novel characters.
abstract
A system called Pinocchio automatically rigs 3D characters by embedding a skeleton inside the character geometry and computing bone weights using heat diffusion. The method discretizes skeleton embedding using an octree distance field and sphere packing, then refines placement with continuous optimization. Bone weights are computed by treating the character as a heat-conducting volume where bones maintain fixed temperatures, producing natural weight variation across the surface. The system handles diverse character proportions and has been tested on 16 unseen characters with high success rates.
Related Geodesic Voxel Binding for Production Character Meshes · Avatar Reshaping and Automatic Rigging Using a Deformable Model · Mobilizing Mocap, Motion Blending, and Mayhem: Rig Interoperability for Crowd Simulation on Incredibles 2 · Animation Setup Transfer for 3D Characters
how to read this
- Category
- Method / system: automatic skeleton rigging and skinning (Pinocchio)
- Contributions
- Automatic embedding of a skeleton inside arbitrary character geometry using an octree distance field, sphere packing, and continuous refinement.
- Bone-weight computation via heat diffusion, treating the character as a heat-conducting volume with bones at fixed temperatures.
- A working one-click rigging system tested across characters of diverse proportions.
- Context
- Sits in the lineage of automatic rigging and skinning-weight estimation, combining a geometric skeleton-embedding stage with a physically-motivated heat-diffusion weighting scheme.
- Correctness
- Demonstrated on a set of unseen characters with reported high success rates; the method assumes a reasonable match between the input mesh and the embedded skeleton template, so very atypical morphologies or topologies are a known caution.
- Clarity
- Clearly structured and accessible; a first pass conveys the two-stage approach, a second pass for the embedding optimization and the heat-diffusion weight formulation.
- How to read it
- First pass for the system overview (embed then weight); second pass on the discrete embedding and heat-equation weighting if you plan to reuse either stage independently.
Rigging / Skinning
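The heat-diffusion weighting in the Pinocchio entry above amounts to solving, per bone i, an equilibrium of the form (-Δ + H) w_i = H p_i, where p_i marks the vertices nearest to bone i and H scales with inverse squared distance to the nearest bone. The sketch below is a toy version under stated assumptions: a 1D chain of vertices with a uniform graph Laplacian stands in for the character mesh, bones are points on a line, and a Gauss-Seidel loop stands in for a sparse solver. The name `heat_weights` and its parameters are hypothetical.

```python
def heat_weights(positions, bone_positions, c=1.0, sweeps=500):
    """Solve (-Laplacian + H) w_i = H p_i per bone on a 1D vertex chain.

    positions:      vertex coordinates along the chain (floats)
    bone_positions: one coordinate per bone; assumed not to coincide
                    with any vertex, so distances are non-zero
    Returns one weight list per bone.
    """
    n = len(positions)
    nearest, heat = [], []
    for x in positions:
        dists = [abs(x - b) for b in bone_positions]
        k = min(range(len(dists)), key=dists.__getitem__)
        nearest.append(k)
        heat.append(c / dists[k] ** 2)  # H_j = c / d_j^2

    weights = []
    for bone in range(len(bone_positions)):
        w = [0.0] * n
        for _ in range(sweeps):
            for j in range(n):
                neighbours = [k for k in (j - 1, j + 1) if 0 <= k < n]
                # Row j of the system: (deg + H_j) w_j - sum(neighbours) = H_j p_j
                source = heat[j] if nearest[j] == bone else 0.0
                w[j] = (sum(w[k] for k in neighbours) + source) / (len(neighbours) + heat[j])
        weights.append(w)
    return weights
```

In this toy setup the weights for the two bones sum to one at every vertex and fall off smoothly away from each bone, which is the qualitative behavior the entry describes.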
-
Efficient inextensibility constraint formulation for cloth using a fast filter-based approach preventing stretching without sacrificing speed.
abstract
Many textiles do not noticeably stretch under their own weight, yet many cloth solvers permit large strain for better performance. This paper proposes a method to obtain very low strain along the warp and weft directions using Constrained Lagrangian Mechanics together with a novel fast projection method that enforces inextensibility constraints rather than integrating stiff spring forces. The resulting algorithm acts as a velocity filter that integrates easily into existing simulation code alongside bending, damping, and collision passes. Experiments on chains and draped cloth show the method is asymptotically faster than strain-limiting and stiff-spring approaches as permissible strain vanishes and mesh resolution increases.
Related Continuum-based Strain Limiting · Multi-Resolution Isotropic Strain Limiting · Adaptive Anisotropic Remeshing for Cloth Simulation · GPU-Based Simulation of Cloth Wrinkles at Submillimeter Levels
how to read this
- Category
- Method: inextensible cloth simulation
- Contributions
- A Constrained Lagrangian Mechanics formulation that enforces low strain along warp and weft instead of integrating stiff spring forces.
- A fast projection method that acts as a velocity filter, dropping easily into existing solvers alongside bending, damping, and collision passes.
- Demonstrated asymptotic speed advantage over strain-limiting and stiff-spring methods as permissible strain vanishes and resolution increases.
- Context
- Addresses cloth inextensibility within an existing simulation stack, relating to robust cloth-collision pipelines such as Bridson et al.'s work by integrating as an additional filtering pass. Builds on: Robust Treatment of Collisions, Contact and Friction for Cloth Animation
- Correctness
- Shown on chains and draped cloth with an asymptotic efficiency argument; the inextensibility target fits textiles that barely stretch, so the benefit is clearest for low-strain materials and high resolutions rather than deliberately stretchy fabrics.
- Clarity
- Accessible to a simulation reader; a first pass conveys the velocity-filter idea and where it plugs in, a second pass for the constrained-mechanics and fast-projection derivation.
- How to read it
- First pass for the constraint-versus-stiff-spring framing and the drop-in filter design; second pass on the fast projection math if you are implementing or comparing strain-limiting approaches.
CFX
-
Harmonic coordinate space for cage-based character deformation providing non-negative, smooth weights that satisfy the maximum principle.
Rigging / Skinning
-
Low-dimensional basis representation over motion capture enables automatic computation of near-optimal real-time character controllers.
abstract
This paper presents a method for real-time interactive character animation that automatically computes near-optimal controllers from a corpus of motion capture data and a desired task. A motion engine blends sequences of precaptured clips while preventing foot-skate without inverse kinematics, and a control policy selects clip sequences using a compact value function represented as a linear combination of basis functions, learned via a linear programming approach to approximate dynamic programming. Because the value functions for many animation tasks are smooth, very few basis functions are needed, yielding controllers with low memory overhead that respond fluidly to continuous user control and environmental constraints. The authors introduce switchability and separability to mitigate dimensionality, and demonstrate navigation, spinning navigation, and fixed and moving obstacle avoidance, the latter enabling simple crowd simulations.
Related DReCon: Data-Driven Responsive Control of Physics-Based Characters · ReGAIL: Toward Agile Character Control From a Single Reference Motion · Physics-Based Motion Retargeting from Sparse Inputs · Mode-Adaptive Neural Networks for Quadruped Motion Control
how to read this
- Category
- Method: data-driven near-optimal character controllers
- Contributions
- Automatic computation of near-optimal real-time controllers from a motion-capture corpus and a desired task.
- A motion engine that blends precaptured clips and prevents foot-skate without inverse kinematics.
- A control policy using a compact value function as a linear combination of basis functions, learned via a linear-programming approximation to dynamic programming, plus switchability and separability to manage dimensionality.
- Context
- Builds on motion-graph style clip-sequencing (Kovar et al.'s Motion Graphs) and frames clip selection as approximate dynamic programming over a low-dimensional value-function basis. Builds on: Motion Graphs
- Correctness
- Demonstrated on navigation, spinning navigation, and fixed and moving obstacle avoidance (extending to simple crowds); the compactness argument rests on the value functions being smooth for these tasks, so tasks with non-smooth objectives may need more basis functions.
- Clarity
- Accessible in motivation but reinforcement-learning-flavored; a first pass conveys the basis-function value-function idea, a second pass for the LP approximation and the switchability/separability constructs.
- How to read it
- First pass for the problem framing and the motion-engine/policy split; second pass on the value-function approximation and dimensionality tricks if you work on learned controllers.
Motion Synthesis
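The controller entry above describes a policy that picks clips using a value function written as a linear combination of basis functions. A minimal sketch of that selection step follows, on a made-up one-dimensional task (reach position 0 on a line). Everything concrete here is an assumption for illustration: the state, the three clips, the reward, the two basis functions, and the hand-set weights. In the paper the weights are learned with a linear-programming approach, which this sketch does not attempt.

```python
# Hypothetical toy task: state is a signed distance to a target at 0.
CLIPS = {"step_left": -1.0, "idle": 0.0, "step_right": 1.0}  # displacement per clip
BASIS = [lambda s: 1.0, lambda s: s * s]                     # basis functions phi_i
WEIGHTS = [0.0, -1.0]                                        # hand-set, not learned


def reward(state):
    """Immediate reward: closer to the target is better."""
    return -abs(state)


def value(state, weights=WEIGHTS, basis=BASIS):
    """V(s) = sum_i w_i * phi_i(s), a compact linear value function."""
    return sum(w * phi(state) for w, phi in zip(weights, basis))


def choose_clip(state, clips=CLIPS, discount=0.9):
    """Greedy policy: pick the clip maximizing reward plus discounted value."""
    def score(name):
        next_state = state + clips[name]
        return reward(next_state) + discount * value(next_state)
    return max(clips, key=score)
```

The memory footprint of such a policy is just the weight vector, which is the point the entry makes about smooth value functions needing very few basis functions.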
- talk Next-Generation Facial Rigging: Development of an Infinite Pose Combination Facial System GDC Industrial
Naughty Dog's Infinite Pose Combination system enabled facial rigs capable of arbitrary pose combinations, advancing character expressiveness beyond fixed blendshape limits.
Facial / Rigging
-
Position-based simulation framework for real-time cloth, hair, and deformable bodies widely adopted in games and interactive applications.
abstract
This paper introduces position-based dynamics, a constraint-based simulation approach that works directly with vertex positions rather than forces or velocities, enabling unconditional stability and direct control of object deformations. The method projects constraints by solving nonlinear equations iteratively using a Gauss-Seidel-like approach and naturally conserves linear and angular momentum for internal constraints. A real-time cloth simulator demonstrates the approach, supporting features like two-way rigid body interaction, self-collision, and independent control of bending and stretching properties through constraint formulation.
Related Projective Dynamics: Fusing Constraint Projections for Fast Simulation · Wrinkle Meshes · Robust Treatment of Collisions, Contact and Friction for Cloth Animation · Strain Based Dynamics
how to read this
- Category
- Method / framework: position-based dynamics
- Contributions
- A constraint-based simulation approach that works directly on vertex positions rather than forces or velocities, giving unconditional stability and direct control of deformation.
- Iterative Gauss-Seidel-like constraint projection that conserves linear and angular momentum for internal constraints.
- A real-time cloth simulator supporting two-way rigid-body interaction, self-collision, and independent bending and stretching control.
- Context
- A general position-based alternative to force- and velocity-based dynamics, providing a unified constraint-projection framework for cloth, hair, and deformable bodies.
- Correctness
- Demonstrated through a real-time cloth simulator with the stated stability and control benefits; constraint stiffness behavior is iteration- and timestep-dependent, which a reader should keep in mind as a known characteristic of the projection scheme.
- Clarity
- Very accessible and widely cited as an entry point; a first pass conveys the whole idea, a second pass for the constraint-projection details and specific constraint formulations.
- How to read it
- First pass to internalize the position-projection loop and why it is stable; second pass on the constraint definitions if you are implementing PBD, since this is the canonical reference.
CFX
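The position-projection loop mentioned in the entry above can be sketched for the simplest case, a distance constraint between two particles. This is a minimal illustration assuming gravity as the only external force and no damping or collisions; the function names, argument layout, and default iteration count are choices made for this sketch, not the paper's code.

```python
import math


def project_distance(p1, p2, w1, w2, rest_length):
    """Move two points so their distance equals rest_length.

    w1, w2 are inverse masses; a zero inverse mass pins the particle.
    """
    delta = [a - b for a, b in zip(p1, p2)]
    length = math.sqrt(sum(d * d for d in delta))
    w_sum = w1 + w2
    if length == 0.0 or w_sum == 0.0:
        return list(p1), list(p2)
    s = (length - rest_length) / (length * w_sum)
    # Corrections are weighted by inverse mass, so lighter particles move more.
    return ([a - w1 * s * d for a, d in zip(p1, delta)],
            [b + w2 * s * d for b, d in zip(p2, delta)])


def pbd_step(x, v, inv_mass, constraints, dt,
             gravity=(0.0, -9.81, 0.0), iterations=10):
    """Advance positions x and velocities v in place by one time step.

    constraints is a list of (i, j, rest_length) distance constraints.
    """
    # 1. Apply external acceleration to unpinned particles.
    for i, w in enumerate(inv_mass):
        if w > 0.0:
            v[i] = [vi + dt * g for vi, g in zip(v[i], gravity)]
    # 2. Predict positions.
    p = [[xi + dt * vi for xi, vi in zip(x[i], v[i])] for i in range(len(x))]
    # 3. Gauss-Seidel-style sweeps: project constraints one after another.
    for _ in range(iterations):
        for i, j, rest in constraints:
            p[i], p[j] = project_distance(p[i], p[j], inv_mass[i], inv_mass[j], rest)
    # 4. Derive velocities from the position change, then commit positions.
    for i in range(len(x)):
        v[i] = [(pi - xi) / dt for pi, xi in zip(p[i], x[i])]
        x[i] = p[i]
```

Stiffness is not a parameter here: how rigid the constraint appears depends on the iteration count and time step, which is the caveat the Correctness note raises.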
-
PhD thesis on a portable facial pipeline: build a sophisticated rig once on a generic model, then automatically transfer its controls, anatomy and animation to many different face models.
abstract
A PhD thesis on a portable facial rigging and animation framework. A sophisticated facial rig, including controls, anatomical structure and animation, is built once on a generic model and then transferred automatically to many different 3D face models. The approach cuts the manual cost of facial setup and lets the same deformation parameters drive unique expressions across characters of varying topology and proportion.
Related Transferring the Rig and Animations from a Character to Different Face Models · Transferring Facial Expressions to Different Face Models · Example-Based Facial Rigging · Smooth Contact-Aware Facial Blendshapes Transfer
how to read this
- Category
- PhD thesis: portable, reusable facial rigging and animation
- Contributions
- A framework where a sophisticated facial rig (controls, anatomy, and animation) is built once on a generic model.
- Automatic transfer of that rig to many different 3D face models of varying topology and proportion.
- Reuse of the same deformation parameters to drive unique expressions per character, cutting manual facial-setup cost.
- Context
- Consolidates and extends the author's create-once, use-many line of work, building on Transferring Facial Expressions to Different Face Models and Sumner and Popovic's Deformation Transfer for Triangle Meshes. Builds on: Transferring Facial Expressions to Different Face Models · Deformation Transfer for Triangle Meshes
- Correctness
- Presented as a thesis-scale pipeline validated by transfer across multiple faces; as with the underlying transfer method, results hinge on correspondence quality and anatomy adaptation across differing topologies, so consistency on highly stylized faces is a reasonable caution.
- Clarity
- As a thesis it is broad and self-contained; a first pass via the introduction and contributions chapters conveys the whole approach, with specific chapters for the formulation.
- How to read it
- Read the intro and contribution summary first for the overall pipeline; dive into the transfer chapters for detail, and treat the 2006 paper as the compact version of the core method.
Facial / Rigging
-
Simple feedback-based bipedal locomotion controller achieving robust walking and running on varied terrain with minimal parameter tuning.
abstract
SIMBICON is a simple control strategy for physics-based biped locomotion that can generate a wide variety of gaits and styles in real-time, including walking in all directions, running, skipping, and hopping. The framework combines a finite state machine of target poses driven by proportional-derivative controllers with a balance feedback law that adjusts the swing hip target angle based on the center of mass position and velocity. Controllers can be authored manually with a small set of parameters or reconstructed from motion capture data, and they remain robust to pushes, unexpected terrain variations, and changes in kinematic and dynamic parameters. The authors also apply feedback error learning to learn predictive feedforward torques, enabling low-gain control that produces smoother and more natural simulated motion.
Related Animating Human Athletics · Motion Grammars for Character Animation · Physics-based Motion Capture Imitation with Deep Reinforcement Learning · PFPN: Continuous Control of Physically Simulated Characters using Particle Filtering Policy Network
how to read this
- Category
- Method: physics-based biped locomotion control
- Contributions
- A simple control strategy generating many real-time gaits and styles (walking in all directions, running, skipping, hopping).
- A finite-state machine of target poses driven by PD controllers, combined with a balance feedback law adjusting the swing-hip target from center-of-mass position and velocity.
- Controllers that are either hand-authored from few parameters or reconstructed from motion capture, with feedback error learning of feedforward torques for smoother low-gain control.
- Context
- Continues the physics-based human locomotion tradition (Hodgins et al.'s Animating Human Athletics), emphasizing a minimal, robust balance-feedback controller. Builds on: Animating Human Athletics
- Correctness
- Demonstrated to remain robust to pushes, terrain variation, and parameter changes; robustness is shown empirically across gaits rather than proven, so behavior under conditions far outside the demonstrated range should not be assumed.
- Clarity
- Notably accessible for a control paper; a first pass conveys the FSM-plus-balance-feedback core, a second pass for the parameterization and feedback-error-learning details.
- How to read it
- First pass for the FSM and the swing-hip balance law (the memorable core idea); second pass on parameter authoring and feedback error learning if you intend to build or extend the controller.
Motion Synthesis
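The balance feedback law at the core of the SIMBICON entry above is small enough to write out. The sketch assumes the commonly cited linear form, in which the swing-hip target is offset by terms proportional to the center-of-mass distance `d` from the stance ankle and the center-of-mass velocity `v`, and it pairs that with a plain PD torque. The gain values in the usage note are placeholders, not tuned values from the paper.

```python
def swing_hip_target(theta_d0, d, v, c_d, c_v):
    """Balance feedback: theta_d = theta_d0 + c_d * d + c_v * v.

    theta_d0: nominal swing-hip target angle from the current FSM state
    d:        horizontal distance from stance ankle to center of mass
    v:        center-of-mass velocity
    c_d, c_v: feedback gains (hypothetical values in the examples)
    """
    return theta_d0 + c_d * d + c_v * v


def pd_torque(theta_target, theta, theta_dot, kp, kd):
    """PD controller driving a joint toward its target angle."""
    return kp * (theta_target - theta) - kd * theta_dot
```

For example, `swing_hip_target(0.4, d=0.1, v=0.5, c_d=0.5, c_v=0.2)` pushes the target forward when the center of mass is ahead of the stance foot and moving forward, so the swing foot lands further out and recovers balance.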
-
Dual quaternion blending fixes the collapsing-joint and candy-wrapper artifacts of linear blend skinning at almost the same cost.
abstract
Skinning of skeletally deformable models is extensively used for real-time animation of characters, creatures and similar objects. The standard solution, linear blend skinning, has some serious drawbacks that require artist intervention. Therefore, a number of alternatives have been proposed in recent years. All of them successfully combat some of the artifacts, but none challenge the simplicity and efficiency of linear blend skinning. As a result, linear blend skinning is still the number one choice for the majority of developers. In this paper, we present a novel GPU-friendly skinning algorithm based on dual quaternions. We show that this approach solves the artifacts of linear blend skinning at minimal additional cost. Upgrading an existing animation system (e.g., in a videogame) from linear to dual quaternion skinning is very easy and has negligible impact on run-time performance.
Related Geometric Skinning with Approximate Dual Quaternion Blending · Stretchable and Twistable Bones for Skeletal Shape Deformation · Segmentation-Based Skinning · Velocity Skinning for Real-time Stylized Skeletal Animation
how to read this
- Category
- Method: a GPU-friendly skeletal skinning algorithm
- Contributions
- Replaces linear blend skinning's matrix blend with a dual-quaternion blend that avoids collapsing-joint and candy-wrapper artifacts
- Keeps cost close to linear blend skinning and stays GPU-friendly for real-time characters
- Drop-in upgrade for existing animation systems with negligible runtime impact
- Context
- Sits in the lineage of skeleton-driven deformation alternatives to linear blend skinning such as Pose Space Deformation (Lewis et al.), but attacks the rotation-blending math directly rather than learning corrective shapes. Builds on: Pose Space Deformation: A Unified Approach to Shape Interpolation and Skeleton-Driven Deformation
- Correctness
- Demonstrated as a real-time skinning method that removes well-known LBS artifacts at near-equal cost; a reader should note it addresses rigid-transform blending and is not a soft-tissue or muscle model, so dynamics and flesh detail are out of scope.
- Clarity
- Accessible in intent; a first pass conveys the artifact-fixing idea, a second pass is needed to follow the dual-quaternion formulation.
- How to read it
- Read the intro and figures first to see the artifacts being fixed, then do a second pass on the dual-quaternion blend math if you intend to implement it; the 2008 journal version refines this work.
Skinning
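The dual-quaternion blend in the entry above can be sketched on the CPU in a few functions. This is a minimal illustration assuming quaternions stored as `(w, x, y, z)`, unit rotation quaternions as input, and a sign flip against the first bone to keep all quaternions in the same hemisphere. The function names are choices made for this sketch; a production version would run per vertex in a shader.

```python
import math


def qmul(a, b):
    """Hamilton product of two quaternions stored as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)


def qconj(q):
    return (q[0], -q[1], -q[2], -q[3])


def to_dual_quat(rotation, translation):
    """Pack a unit rotation quaternion and a translation into (real, dual)."""
    t = (0.0,) + tuple(translation)
    dual = tuple(0.5 * c for c in qmul(t, rotation))
    return tuple(rotation), dual


def blend_dual_quats(dual_quats, weights):
    """Weighted sum of dual quaternions, normalized by the real part's length."""
    pivot = dual_quats[0][0]
    real, dual = [0.0] * 4, [0.0] * 4
    for (r, d), w in zip(dual_quats, weights):
        # q and -q encode the same rotation; flip to the pivot's hemisphere.
        if sum(a * b for a, b in zip(r, pivot)) < 0.0:
            w = -w
        for k in range(4):
            real[k] += w * r[k]
            dual[k] += w * d[k]
    norm = math.sqrt(sum(c * c for c in real))
    return tuple(c / norm for c in real), tuple(c / norm for c in dual)


def transform_point(dq, point):
    """Apply the rigid transform encoded by a (real, dual) pair to a point."""
    real, dual = dq
    rotated = qmul(qmul(real, (0.0,) + tuple(point)), qconj(real))[1:]
    translation = qmul(dual, qconj(real))[1:]  # vector part; times 2 below
    return tuple(r + 2.0 * t for r, t in zip(rotated, translation))
```

Blending an identity bone with a 90-degree rotation at equal weights yields a 45-degree rotation that keeps the point at its original distance from the joint, whereas averaging the two matrices as in linear blend skinning would pull it inward.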
2006
5-
Data-driven model of skin and muscle deformation extracted from human motion, enabling realistic soft-tissue dynamics on animated characters.
abstract
This paper presents a data-driven technique for capturing and animating the dynamic surface motion of the human body, including bending, bulging, jiggling, and stretching, using a commercial optical motion capture system with approximately 350 small markers placed on the muscular and fleshy parts of the body. The sparse marker sample is supplemented with a detailed subject-specific polygonal model, and a local reference frame defined at each marker is used to clean noisy data by merging disconnected trajectories and filling occluded-marker holes via PCA. To animate the model, the marker motion is factored into rigid body motion of near-rigidly segmented parts plus a residual local deformation approximated first by a quadratic transformation and then resolved with radial basis function interpolation. The method is demonstrated on dynamic activities such as punching, jumping rope, and belly dancing, and results are compared to conventional motion capture and synchronized video.
Related Data-driven Modeling of Skin and Muscle Deformation · Data-Driven Physics for Human Soft Tissue Animation · NIMBLE: A Non-rigid Hand Model with Bones and Muscles · OSSO: Obtaining Skeletal Shape from Outside
how to read this
- Category
- Capture system + data-driven skin-deformation method
- Contributions
- A data-driven technique to capture and animate dynamic body-surface motion (bending, bulging, jiggling, stretching) using a commercial optical mocap system with about 350 markers on fleshy regions
- A pipeline that cleans noisy sparse markers (merging trajectories, filling occlusions via PCA) using a subject-specific polygonal model and per-marker local frames
- Factors marker motion into rigid motion of near-rigid segments plus residual local deformation, approximated by a quadratic transformation then resolved with radial basis function interpolation
- Context
- A data-driven alternative to physical flesh simulation, in the lineage of marker-based motion capture extended to capture soft-tissue surface dynamics rather than just skeletal motion.
- Correctness
- Demonstrated on dynamic activities (punching, jumping rope, belly dancing) with comparison to conventional mocap and synchronized video; fidelity is bounded by the sparse 350-marker sampling and the quadratic-plus-RBF deformation model, so it captures observed soft-tissue motion rather than predicting unobserved dynamics.
- Clarity
- Accessible and practically oriented; a first pass conveys the capture-and-model pipeline, a second pass clarifies the marker cleanup and the deformation decomposition.
- How to read it
- First pass for the capture setup and the rigid-plus-residual deformation model; second pass on the cleanup and RBF steps if you work with marker data or soft-tissue capture.
Skinning / Muscles
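As a rough illustration of the rigid-plus-residual decomposition described in this entry, the sketch below fits one rigid transform to a segment's markers, then spreads the leftover marker displacement to nearby vertices with Gaussian radial basis functions. It is a simplified stand-in rather than the paper's implementation: the intermediate quadratic transformation is omitted, and the function names (`fit_rigid`, `rbf_interpolate`, `deform`), the Gaussian kernel, and `sigma` are assumptions made for the example.

```python
import numpy as np

def fit_rigid(rest, posed):
    """Least-squares R, t with posed ~= rest @ R.T + t (Kabsch algorithm)."""
    c_rest, c_posed = rest.mean(axis=0), posed.mean(axis=0)
    H = (rest - c_rest).T @ (posed - c_posed)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))      # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, c_posed - R @ c_rest

def rbf_interpolate(centers, values, query, sigma):
    """Gaussian RBF interpolation of per-marker values to query points."""
    def kernel(a, b):
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-d2 / (2.0 * sigma ** 2))
    coeffs = np.linalg.solve(kernel(centers, centers), values)
    return kernel(query, centers) @ coeffs

def deform(rest_markers, posed_markers, rest_vertices, sigma=0.5):
    """Rigid motion of the segment plus an RBF-interpolated local residual."""
    R, t = fit_rigid(rest_markers, posed_markers)
    # Residual marker displacement, expressed in the segment's rest frame.
    residual = (posed_markers - t) @ R - rest_markers
    local = rest_vertices + rbf_interpolate(rest_markers, residual, rest_vertices, sigma)
    return local @ R.T + t
```

With purely rigid marker motion the residual vanishes and `deform` reduces to the rigid transform; any bulge or jiggle measured at the markers is reproduced exactly at the marker positions and blended smoothly in between.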
-
Automated pipeline converts unarticulated example shapes into a controllable articulated reduced-deformable model with intuitive IK-style control.
abstract
Articulated shapes are aptly described by reduced deformable models that express required shape deformations using a compact set of control parameters. Although sufficient to describe most shape deformations, these control parameters can be ill-suited for animation tasks, particularly when reduced deformable models are inferred automatically from example shapes. Our algorithm provides intuitive and direct control of reduced deformable models similar to a conventional inverse-kinematics algorithm for jointed rigid structures. We present a fully automated pipeline that transforms a set of unarticulated example shapes into a controllable, articulated model. With only a few manipulations, an animator can automatically and interactively pose detailed shapes at rates independent of their geometric complexity.
Related Mesh-Based Inverse Kinematics · S3: Neural Shape, Skeleton, and Skinning Fields for 3D Human Modeling · Mobilizing Mocap, Motion Blending, and Mayhem: Rig Interoperability for Crowd Simulation on Incredibles 2 · Joint-Dependent Local Deformations for Hand Animation and Object Grasping
how to read this
- Category
- Method: IK-style control for reduced deformable models
- Contributions
-
- A fully automated pipeline that turns a set of unarticulated example shapes into a controllable, articulated reduced-deformable model.
- An inverse-kinematics-style interface that gives intuitive, direct control over the model's compact parameters.
- Interactive posing of detailed shapes at rates independent of geometric complexity.
- Context
- Extends example-based deformation control in the spirit of Sumner et al.'s Mesh-Based Inverse Kinematics, bringing jointed-rigid IK intuition to automatically inferred reduced deformable models. Builds on: Mesh-Based Inverse Kinematics
- Correctness
- Demonstrated as an interactive posing tool over example shapes, with control rates decoupled from mesh complexity; the quality of poses is bounded by how well the example set spans the desired deformations, so coverage of the examples is the key assumption to keep in mind.
- Clarity
- Accessible at a high level; a first pass conveys the IK-from-examples idea, a second pass is needed for the reduced-model formulation and the control mapping.
- How to read it
- First pass for the pipeline (examples to articulated model to IK control); do a second pass on how control parameters are derived and solved if you plan to implement or compare against it.
Rigging / Skinning
-
GPU-parallel example-based pose-space deformation that automatically computes per-vertex blend weights from sample poses.
abstract
WPSD (Weighted Pose Space Deformation) is an example-based skinning method for articulated body animation. The per-vertex computation required in WPSD can be parallelized in a SIMD (Single Instruction Multiple Data) manner and implemented on a GPU. While such vertex-parallel computation is often done on the GPU vertex processors, further parallelism can potentially be obtained by using the fragment processors. In this paper, we develop a parallel deformation method using the GPU fragment processors. Joint weights for each vertex are automatically calculated from sample poses, thereby reducing manual effort and enhancing the quality of WPSD as well as SSD (Skeletal Subspace Deformation). We show sufficient speed-up of SSD, PSD (Pose Space Deformation) and WPSD to make them suitable for real-time applications. Categories and Subject Descriptors (according to ACM CCS): I.3.1 [Computer Graphics]: Hardware Architecture-Parallel processing, I.3.5 [Computer Graphics]: Computational Geometry and Object Modeling-Curve, surface, solid and object modeling, I.3.7 [Computer Graphics]: Three-Dimensional Graphics and Realism-Animation.
Related EigenSkin: Real Time Large Deformation Character Skinning in Hardware · Scan-based Volume Animation Driven by Locally Adaptive Articulated Registrations · Mobilizing Mocap, Motion Blending, and Mayhem: Rig Interoperability for Crowd Simulation on Incredibles 2 · Dyna: A Model of Dynamic Human Shape in Motion
how to read this
- Category
- Method: GPU example-based skinning (weighted pose-space deformation)
- Contributions
-
- A parallel WPSD deformation method that exploits GPU fragment processors for additional parallelism beyond vertex processors.
- Automatic per-vertex joint weight computation from sample poses, reducing manual rigging effort.
- Real-time speed-ups for SSD, PSD and WPSD, making example-based skinning suitable for interactive applications.
- Context
- Builds directly on Lewis et al.'s Pose Space Deformation, adding automatic weighting and a GPU (fragment-processor) parallelization of the per-vertex computation. Builds on: Pose Space Deformation: A Unified Approach to Shape Interpolation and Skeleton-Driven Deformation
- Correctness
- Validated as a speed and quality improvement on articulated body animation; gains are reported on the era's GPU fragment-processor architecture, so the specific performance argument is hardware-dependent and the deformation quality still relies on representative sample poses.
- Clarity
- Accessible if you already know SSD/PSD; a first pass conveys the GPU-parallel WPSD idea, and a second pass is needed for the weight computation and the fragment-processor mapping.
- How to read it
- First pass for the why (auto weights plus GPU parallelism); second pass on the fragment-processor data layout only if GPU implementation detail matters to you, otherwise the conceptual contribution suffices.
Skinning
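To make the weighted pose-space idea concrete, here is a minimal CPU sketch, assuming the commonly described WPSD formulation in which each vertex measures pose distance with its own skinning weights, so joints that barely influence a vertex barely affect its interpolation. The names `fit_wpsd` and `eval_wpsd`, the Gaussian kernel, and `sigma` are illustrative assumptions; the paper's automatic weight computation and its fragment-processor layout are not reproduced here.

```python
import numpy as np

def gaussian(dist, sigma):
    return np.exp(-(dist ** 2) / (2.0 * sigma ** 2))

def fit_wpsd(poses, offsets, weights, sigma=1.0):
    """Solve one small RBF system per vertex.
    poses:   (P, J) example joint-angle vectors
    offsets: (P, V, 3) sculpted corrective displacement per example and vertex
    weights: (V, J) skinning weights (each vertex needs some nonzero weight on
             joints that differ between examples, or its system is singular)
    """
    P, V = poses.shape[0], weights.shape[0]
    diff2 = (poses[:, None, :] - poses[None, :, :]) ** 2    # (P, P, J)
    coeffs = np.empty((V, P, 3))
    for v in range(V):
        dist = np.sqrt(diff2 @ weights[v])                  # weighted pose distance
        coeffs[v] = np.linalg.solve(gaussian(dist, sigma), offsets[:, v, :])
    return coeffs

def eval_wpsd(pose, poses, coeffs, weights, sigma=1.0):
    """Corrective displacement for every vertex at a new pose, shape (V, 3)."""
    diff2 = (poses - pose) ** 2                             # (P, J)
    phi = gaussian(np.sqrt(diff2 @ weights.T), sigma)       # (P, V)
    return np.einsum("pv,vpc->vc", phi, coeffs)
```

The loop in `eval_wpsd` is independent per vertex, which is the property that makes the method a natural fit for the SIMD-style GPU evaluation the entry describes.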
-
Super-helix model for hair strand dynamics using Kirchhoff elastic rods, capturing natural curl and wave patterns with high physical fidelity.
abstract
Introduces Super-Helices, a piecewise helical rod model for accurately predicting hair motion using Kirchhoff equations for dynamic inextensible elastic rods. Each hair strand is represented as a continuous helical rod animated using Lagrangian mechanics, handling nonlinear bending and twisting behavior. Validated against real hair experiments, the model efficiently simulates various hair types (straight, wavy, curly) with realistic nonlinear effects like buckling and bending-twisting instabilities.
Related A Mass Spring Model for Hair Simulation · A Hybrid Iterative Solver for Robustly Capturing Coulomb Friction in Hair Dynamics · Adaptive Nonlinearity for Collisions in Complex Rod Assemblies · Gravity Preloading for Maintaining Hair Shape Using the Simulator as a Closed-Box Function
how to read this
- Category
- Method: physically-based hair strand dynamics
- Contributions
-
- Super-Helices, a piecewise-helical rod model for predicting hair motion based on the Kirchhoff equations for dynamic inextensible elastic rods.
- A Lagrangian-mechanics animation of each strand as a continuous helical rod, handling nonlinear bending and twisting.
- Efficient simulation of straight, wavy and curly hair, capturing nonlinear effects such as buckling and bending-twisting instabilities.
- Context
- Grounds hair animation in the mechanics of Kirchhoff elastic rods, treating each strand as a continuous dynamic rod rather than a particle or spring chain.
- Correctness
- Stated to be validated against real-hair experiments and to reproduce nonlinear behavior across hair types; results concern individual-strand fidelity, so a reader should keep collective effects (full-head strand counts and inter-strand contact) in mind as separate concerns.
- Clarity
- Conceptually accessible but mathematically dense; a first pass conveys the rod model and what it captures, while the Kirchhoff/Lagrangian formulation needs a careful second or third pass.
- How to read it
- First pass for the physical model and the phenomena it reproduces; budget a slow second/third pass on the rod equations and discretization if you need to reimplement or judge numerical behavior.
CFX
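The geometric half of a piecewise-helical rod can be sketched in a few lines: if each element has constant material curvatures and twist, its Darboux vector is constant along the element, so the material frame rotates about a fixed axis and the centerline is an exact helix arc. The code below reconstructs a strand's centerline from such per-element values. It covers only this kinematic reconstruction, not the Lagrangian dynamics, and the names `advance` and `centerline` and the frame layout (columns n1, n2, tangent) are assumptions made for the example.

```python
import numpy as np

def rotate(v, axis, angle):
    """Rodrigues rotation of vector v about a unit axis."""
    return (v * np.cos(angle)
            + np.cross(axis, v) * np.sin(angle)
            + axis * np.dot(axis, v) * (1.0 - np.cos(angle)))

def advance(r, frame, kappa1, kappa2, tau, length):
    """Integrate one element with constant curvatures and twist in closed form."""
    darboux = frame @ np.array([kappa1, kappa2, tau])   # world-space Darboux vector
    omega = np.linalg.norm(darboux)
    tangent = frame[:, 2]
    if omega < 1e-12:                                   # straight, untwisted element
        return r + tangent * length, frame
    axis = darboux / omega
    t_par = axis * np.dot(axis, tangent)
    t_perp = tangent - t_par
    angle = omega * length
    r_new = (r + t_par * length
             + t_perp * np.sin(angle) / omega
             + np.cross(axis, t_perp) * (1.0 - np.cos(angle)) / omega)
    frame_new = np.column_stack([rotate(frame[:, k], axis, angle) for k in range(3)])
    return r_new, frame_new

def centerline(elements, r0=None, frame0=None):
    """elements: iterable of (kappa1, kappa2, tau, length). Returns node positions."""
    r = np.zeros(3) if r0 is None else np.asarray(r0, float)
    frame = np.eye(3) if frame0 is None else np.asarray(frame0, float)
    points = [r]
    for kappa1, kappa2, tau, length in elements:
        r, frame = advance(r, frame, kappa1, kappa2, tau, length)
        points.append(r)
    return np.array(points)
```

Zero curvature gives a straight strand, a single nonzero curvature with no twist gives a circular arc of radius 1/kappa, and combining curvature with twist gives the helices that the model uses for wavy and curly hair.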
-
Finds the correspondence between a generic facial rig and a target face, then transfers controls and anatomy so the same deformation parameters yield expressions across different 3D face models.
abstract
Presents a facial deformation system that eases character setup by letting artists manipulate a face like a puppet. It finds the correspondence of the main attributes of a generic rig and transfers them to different 3D face models, automatically generating an anatomy-based facial rig that adapts muscles and a skeleton to each face, so the same deformation parameters produce unique expressions across models.
Related Transferring the Rig and Animations from a Character to Different Face Models · Reusable Facial Rigging and Animation: Create Once, Use Many · Facial Retargeting with Automatic Range of Motion Alignment · Smooth Contact-Aware Facial Blendshapes Transfer
how to read this
- Category
- Method: facial rig and expression transfer across face models
- Contributions
-
- A facial deformation system that lets artists manipulate a face like a puppet for easier character setup.
- Finding correspondence of a generic rig's main attributes and transferring them to different 3D face models.
- Automatic generation of an anatomy-based rig (muscles plus skeleton) so the same parameters produce unique expressions per model.
- Context
- Builds on Sumner and Popović's Deformation Transfer for Triangle Meshes, carrying rig controls and anatomy, rather than only mesh deformations, from a generic face to target faces. Builds on: Deformation Transfer for Triangle Meshes
- Correctness
- Presented as a setup-acceleration system that reuses one rig across faces; the result quality depends on the quality of the established correspondence and how well the generic rig's anatomy adapts to differing topology and proportions, which a reader should treat as the main assumption.
- Clarity
- Accessible to a rigging/character-tech reader; a first pass conveys the create-once, transfer-to-many idea, and a second pass is needed for the correspondence and anatomy-adaptation steps.
- How to read it
- First pass for the pipeline and motivation; second pass on the correspondence method and anatomy transfer if you intend to evaluate it against other rig-retargeting work.
Facial / Retargeting
2005
11- Automatic Determination of Facial Muscle Activations from Sparse Motion Capture Marker Data SIGGRAPH Academic 423 cites
FEM-based facial muscle simulation with automatic activation estimation from sparse mocap markers, enabling physics-driven facial animation.
abstract
We build an anatomically accurate model of facial musculature, passive tissue, and skeletal structure from volumetric data of a living subject, endowing the tissues with a nonlinear constitutive model and controllable anisotropic muscle activations based on fiber directions. To animate this model, we propose a method that automatically determines the muscle activations, head position, and jaw articulation that make a quasistatic finite element simulation track a sparse set of surface landmarks from motion capture marker data. The estimation is posed as a nonlinear least squares problem solved with a Gauss-Newton approach, where the Jacobian of the quasistatic configuration is computed efficiently. Because the search is performed over the space of physically attainable configurations parameterized by muscle activations, the method robustly handles outliers in the motion capture data and the resulting activations can be reused in dynamic simulations with contact and collision.
Related Art-Directed Muscle Simulation for High-End Facial Animation · Fully Automatic Generation of Anatomical Face Simulation Models · Building Accurate Physics-based Face Models from Data · Phace: Physics-based Face Modeling and Animation
how to read this
- Category
- Method: anatomical FEM facial model with activation estimation from mocap
- Contributions
-
- Builds an anatomically accurate model of facial musculature, passive tissue, and skeleton from a subject's volumetric data, with a nonlinear constitutive model and anisotropic fiber-based muscle activations
- Automatically determines muscle activations, head position, and jaw articulation so a quasistatic FEM simulation tracks sparse mocap surface landmarks, solved as nonlinear least squares via Gauss-Newton
- Searches over physically attainable configurations, giving robustness to mocap outliers and activations reusable in dynamic contact/collision simulations
- Context
- Extends muscle-based facial animation (Waters' facial muscle model) and FEM musculoskeletal simulation (Teran et al.'s skeletal-muscle work) by inverting a physical face model from sparse marker data. Builds on: A Muscle Model for Animating Three-Dimensional Facial Expression · Creating and Simulating Skeletal Muscle from the Visible Human Data Set
- Correctness
- Accuracy depends on the per-subject anatomical model and the quasistatic assumption during tracking; estimating from sparse landmarks is inherently underdetermined, so the physical parameterization is what regularizes it, and results are tied to the captured individual.
- Clarity
- Dense, methods-heavy writing; a first pass conveys the build-model-then-invert-activations idea, with second and third passes needed for the constitutive model and the Gauss-Newton estimation.
- How to read it
- Focus on how the activation estimation is posed as least squares over physically attainable configurations and how the Jacobian is computed; multiple passes pay off for the FEM and solver if you work on physics-based faces.
Facial / Muscles
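The estimation step described above, a nonlinear least-squares fit of activations to tracked landmarks solved with Gauss-Newton, can be illustrated on a toy problem. In the sketch below, `simulate_landmarks` is a hypothetical smooth function standing in for the quasistatic finite element solve, and the Jacobian is approximated by finite differences, whereas the paper computes the Jacobian of the quasistatic configuration directly. All names and the toy basis are assumptions.

```python
import numpy as np

# Hypothetical stand-in for the quasistatic simulation: maps two muscle
# activations to four stacked landmark coordinates through a nonlinear map.
_BASIS = np.array([[1.0, 0.2], [0.1, 0.9], [0.5, 0.5], [0.3, -0.4]])

def simulate_landmarks(activations):
    a = np.asarray(activations, float)
    return _BASIS @ (a + 0.3 * a ** 2)

def gauss_newton(residual, x0, iters=30, fd_step=1e-6, tol=1e-10):
    """Minimize ||residual(x)||^2 with a finite-difference Jacobian."""
    x = np.asarray(x0, float)
    for _ in range(iters):
        r = residual(x)
        # Column k approximates d(residual)/d(x_k).
        J = np.column_stack([(residual(x + fd_step * e) - r) / fd_step
                             for e in np.eye(x.size)])
        step, *_ = np.linalg.lstsq(J, -r, rcond=None)   # solves the linearized problem
        x = x + step
        if np.linalg.norm(step) < tol:
            break
    return x

def estimate_activations(target_landmarks, initial_guess):
    """Find activations whose simulated landmarks best match the captured ones."""
    return gauss_newton(lambda a: simulate_landmarks(a) - target_landmarks, initial_guess)
```

Because the unknowns are activations rather than free vertex positions, every iterate corresponds to a configuration the model can actually reach, which is the source of the outlier robustness the entry mentions.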
-
Finite element musculoskeletal simulation built from anatomical data, groundwork for the production flesh systems that followed.
abstract
This paper presents a framework for extracting and simulating high-resolution musculoskeletal geometry from the segmented Visible Human data set, demonstrated on roughly 30 contact-coupled muscles of the upper limb made up of about 10 million tetrahedra. Muscle, tendon, and bone geometry is created using level set and constructive solid geometry repair, with B-spline solids assigning spatially varying fiber directions and a transversely isotropic, quasi-incompressible constitutive model providing active and passive fiber response. To make simulation tractable, each high-resolution muscle is embedded in a nonmanifold, connectivity-preserving simulation mesh molded from a lower-resolution body-centered cubic lattice, which relaxes the time step restriction and reduces memory. A robust invertible finite element technique handles degenerate and inverted tetrahedra, and a fascia contact model maintains realistic contact between muscle groups during ballistic motion.
Related How to Build a Human: Practical Physics-Based Character Animation · Robust Quasistatic Finite Elements and Flesh Simulation · Lessons from the Evolution of an Anatomical Facial Muscle Model · A Neural Network Model for Efficient Musculoskeletal-Driven Skin Deformation
how to read this
- Category
- Method / system: FEM musculoskeletal modeling and simulation from anatomical data
- Contributions
-
- A framework to extract high-resolution muscle, tendon, and bone geometry from the segmented Visible Human data set using level sets and CSG repair, with B-spline solids assigning spatially varying fiber directions
- A transversely isotropic, quasi-incompressible constitutive model for active and passive fiber response, demonstrated on ~30 contact-coupled upper-limb muscles (~10 million tetrahedra)
- Embedding each high-res muscle in a connectivity-preserving BCC simulation mesh to relax the time-step restriction and cut memory, plus robust invertible FEM and a fascia contact model
- Context
- Grounds physically based flesh and muscle simulation in real anatomical data, providing the FEM musculoskeletal groundwork later built on by production flesh systems and by facial muscle work such as Sifakis et al. 2005.
- Correctness
- Anatomical fidelity is bounded by the Visible Human segmentation and the chosen constitutive model; the embedded lower-resolution simulation mesh trades some accuracy for tractability, and validation is by anatomical plausibility and stability rather than measured in-vivo mechanics.
- Clarity
- A substantial, technique-dense journal paper; a first pass conveys the geometry-extraction-to-simulation pipeline, with second and third passes needed for the constitutive model, invertible FEM, and embedding.
- How to read it
- Read for the end-to-end pipeline from anatomical data to simulable meshes; deep-dive the constitutive model, BCC embedding, and invertible FEM in later passes if you build muscle or flesh simulators.
Muscles
-
Multilinear model factoring facial identity and expression for transferring performances between individuals and editing facial animation.
abstract
Face Transfer maps videorecorded performances of one individual onto facial animations of another by extracting visemes, expressions, and 3D pose from monocular video. It builds on a multilinear model of 3D face meshes that separably parameterizes geometric variation due to identity, expression, and viseme, estimated from a Cartesian product of 3D face scans using N-mode SVD. The paper introduces methods to put unstructured scans into correspondence via template fitting and to impute missing examples in the data tensor through matrix factorization. By linking the multilinear model to optical-flow-based tracking with a weak-perspective camera model, the system recovers pose and attribute parameters from video and can mix attributes across multiple videos to rewrite footage or retarget performances to new identities.
Related Performance-Driven Facial Animation · Transferring the Rig and Animations from a Character to Different Face Models · Realtime Performance-Based Facial Animation · High Fidelity Facial Animation Capture and Retargeting with Contours
how to read this
- Category
- Method: a multilinear face model for performance transfer and editing
- Contributions
-
- A multilinear 3D face model that separably parameterizes identity, expression, and viseme, estimated via N-mode SVD over a Cartesian product of face scans
- Techniques to put unstructured scans into correspondence (template fitting) and impute missing tensor examples via matrix factorization
- A system linking the model to optical-flow tracking with a weak-perspective camera to recover pose and attributes from monocular video, then retarget or mix performances across identities
- Context
- Extends the linear morphable-model line (Blanz and Vetter, A Morphable Model for the Synthesis of 3D Faces) to a multilinear tensor factorization that disentangles multiple modes of facial variation rather than a single PCA space. Builds on: A Morphable Model for the Synthesis of 3D Faces
- Correctness
- Demonstrated on monocular video tracking and cross-subject retargeting, but it assumes scans can be brought into reliable correspondence and a weak-perspective camera; recovery quality depends on optical flow and on how well the scan corpus spans the target identities and expressions.
- Clarity
- Reasonably accessible at a high level; a first pass conveys the factor-then-track idea, a second pass is needed for the N-mode SVD formulation and the tracking objective.
- How to read it
- First pass to grasp the identity/expression/viseme factorization and the video-to-attributes pipeline; do a second pass on the N-mode SVD and correspondence/imputation steps if you intend to reimplement or adapt the model.
Facial / Retargeting
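A small N-mode SVD makes the identity/expression factorization tangible. The sketch below assumes a complete data tensor of shape (vertex coordinates, identities, expressions), leaves the vertex mode unfactored, and omits the viseme mode, the scan correspondence step, and the missing-data imputation. Function names are illustrative, not taken from the paper.

```python
import numpy as np

def unfold(T, mode):
    """Flatten tensor T into a matrix whose rows index the chosen mode."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_product(T, M, mode):
    """Multiply tensor T by matrix M along the given mode."""
    out = np.tensordot(M, T, axes=(1, mode))    # contracted axis is replaced, lands first
    return np.moveaxis(out, 0, mode)

def n_mode_svd(T, modes=(1, 2)):
    """Return the core tensor and one orthonormal factor per requested mode."""
    factors = [np.linalg.svd(unfold(T, m), full_matrices=False)[0] for m in modes]
    core = T
    for m, U in zip(modes, factors):
        core = mode_product(core, U.T, m)
    return core, factors

def synthesize(core, w_identity, w_expression):
    """Stacked vertex coordinates for one identity/expression weight pair."""
    return np.einsum("vie,i,e->v", core, w_identity, w_expression)
```

Row `k` of the identity factor and row `e` of the expression factor reproduce scan `(k, e)`; mixing rows from different scans is what transfers one person's expression onto another person's identity.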
-
Geostatistical approach to motion interpolation using kriging for smooth synthesis of human movement from sparse example motion clips.
abstract
A common motion interpolation technique for realistic human animation is to blend similar motion samples with weighting functions whose parameters are embedded in an abstract space. Existing methods, however, are insensitive to statistical properties, such as correlations between motions. In addition, they lack the capability to quantitatively evaluate the reliability of synthesized motions. This paper proposes a method that treats motion interpolations as statistical predictions of missing data in an arbitrarily definable parametric space. A practical technique of geostatistics, called universal kriging, is then introduced for statistically estimating the correlations between the dissimilarity of motions and the distance in the parametric space. Our method statistically optimizes interpolation kernels for given parameters at each frame, using a pose distance metric to efficiently analyze the correlation. Motions are accurately predicted for the spatial constraints represented in the parametric space, and they therefore have few undesirable artifacts, if any. This property alleviates the problem of spatial inconsistencies, such as foot-sliding, that are associated with many existing methods. Moreover, numerical estimates for the reliability of predictions enable motions to be adaptively sampled.
Related Verbs and Adverbs: Multidimensional Motion Interpolation · Automated Extraction and Parameterization of Motions in Large Data Sets · Near-Optimal Character Animation with Continuous Control · Neural State Machine for Character-Scene Interactions
how to read this
- Category
- Method: a statistical motion-interpolation technique
- Contributions
-
- Reframes motion interpolation as statistical prediction of missing data in a parametric space
- Introduces universal kriging to estimate correlations between motion dissimilarity and parametric distance, optimizing interpolation kernels per frame
- Reduces spatial-inconsistency artifacts such as foot-sliding and offers a notion of reliability for synthesized motion
- Context
- Advances parametric example-based motion interpolation (Rose et al., Verbs and Adverbs) by replacing fixed weighting functions with a geostatistical, correlation-aware estimator. Builds on: Verbs and Adverbs: Multidimensional Motion Interpolation
- Correctness
- Argued to reduce artifacts by accounting for inter-motion correlations and a pose distance metric; results depend on the chosen parametric space and pose metric, and kriging assumptions (a meaningful variogram from the examples) may not hold for sparse or poorly distributed clips.
- Clarity
- The motion-blending framing is accessible, but the kriging machinery is the harder part; a first pass gives the intuition, a second pass is needed for the variogram and kernel optimization.
- How to read it
- First pass for why correlation-aware interpolation beats fixed weights; a second pass on the universal-kriging formulation pays off if you work on motion synthesis or want the reliability estimate.
Motion Synthesis
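For intuition about how kriging turns a variogram into blend weights, the sketch below implements ordinary kriging, a simpler relative of the universal kriging used in the paper (it assumes a constant unknown mean instead of a fitted trend). The exponential variogram and its `range_` parameter are assumptions chosen for the example; in the paper the correlation structure is estimated from the motion samples with a pose distance metric.

```python
import numpy as np

def exp_variogram(h, range_=1.0):
    """Assumed exponential variogram without nugget: gamma(0) = 0."""
    return 1.0 - np.exp(-np.asarray(h, float) / range_)

def kriging_weights(sample_params, query, variogram=exp_variogram):
    """Ordinary-kriging weights for a query point in the parametric space.
    sample_params: (n, d) parameters of the example motions; query: (d,)."""
    n = len(sample_params)
    pair_dist = np.linalg.norm(sample_params[:, None] - sample_params[None, :], axis=-1)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = variogram(pair_dist)
    A[n, n] = 0.0                      # Lagrange row/column enforces sum(weights) == 1
    b = np.ones(n + 1)
    b[:n] = variogram(np.linalg.norm(sample_params - query, axis=-1))
    return np.linalg.solve(A, b)[:n]   # drop the Lagrange multiplier

def blend(sample_values, weights):
    """Weighted combination of per-sample features (one row per example)."""
    return weights @ sample_values
```

The weights always sum to one and collapse to a single example when the query coincides with it, so constraints met by the examples are met exactly at those parameters. Blending real motion clips would additionally need time alignment and rotation-aware averaging, which this sketch leaves out.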
-
Extends mean value coordinates to closed triangular meshes in 3D, enabling smooth and natural cage-based space deformations.
abstract
This paper generalizes mean value coordinates from closed 2D polygons to closed triangular meshes in 3D, constructing an interpolant that extends function values defined at mesh vertices to the interior. The resulting coordinates are continuous everywhere, smooth on the interior, linear on the triangles, and reproduce linear functions, giving them linear precision. The authors derive a stable closed-form evaluation based on the mean vector of a spherical triangle and handle degenerate and coplanar cases robustly. They demonstrate the coordinates on boundary value interpolation, volumetric texturing, and cage-based surface deformation for character animation, where moving the vertices of an enclosing control mesh induces smooth deformations of the embedded model.
Related Green Coordinates for Triquad Cages in 3D · Green Coordinates · Biharmonic Coordinates · Wires: A Geometric Deformation Technique
how to read this
- Category
- Method: a cage-based space-deformation coordinate scheme
- Contributions
-
- Generalizes mean value coordinates from closed 2D polygons to closed triangular meshes in 3D
- Provides an interpolant that is continuous everywhere, smooth on the interior, linear on triangles, and reproduces linear functions (linear precision)
- Derives a stable closed-form evaluation from the mean vector of a spherical triangle with robust handling of degenerate and coplanar cases
- Context
- Extends the 2D mean value coordinates lineage to 3D cages, situating it among barycentric/generalized-coordinate methods for boundary interpolation and cage-based deformation.
- Correctness
- Demonstrated on boundary value interpolation, volumetric texturing, and cage-based character deformation; the coordinates are well-defined for closed triangular cages, but mean value coordinates can be negative, so readers should not assume locality or positivity guarantees.
- Clarity
- Clear and self-contained; a first pass conveys the construction, a second pass clarifies the spherical-triangle derivation and degenerate-case handling.
- How to read it
- First pass for the properties and the deformation use case; a focused second pass on the closed-form evaluation and edge cases is worthwhile if you plan to implement the coordinates.
Skinning
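The closed-form evaluation can be sketched directly from spherical-triangle quantities. The code below is an unoptimized reading of that construction, assuming a closed cage with consistently oriented triangles; the function name, the tolerance `eps`, and the way special cases are handled are assumptions of this sketch rather than a verified transcription of the authors' pseudocode.

```python
import numpy as np

def mean_value_coordinates(x, verts, tris, eps=1e-9):
    """Weights w with sum(w) == 1 and w @ verts == x for x inside a closed cage.
    verts: (n, 3) cage vertices; tris: (m, 3) integer indices, consistent winding."""
    x = np.asarray(x, float)
    dist = np.linalg.norm(verts - x, axis=1)
    w = np.zeros(len(verts))
    nearest = int(np.argmin(dist))
    if dist[nearest] < eps:                       # x coincides with a cage vertex
        w[nearest] = 1.0
        return w
    unit = (verts - x) / dist[:, None]
    for tri in tris:
        u, d = unit[tri], dist[tri]
        # theta[i]: arc length of the spherical-triangle side opposite corner i
        chord = np.array([np.linalg.norm(u[(i + 1) % 3] - u[(i + 2) % 3]) for i in range(3)])
        theta = 2.0 * np.arcsin(np.clip(chord / 2.0, -1.0, 1.0))
        h = theta.sum() / 2.0
        if np.pi - h < eps:                       # x lies on this triangle: 2D barycentric
            w[:] = 0.0
            for i in range(3):
                w[tri[i]] = np.sin(theta[i]) * d[(i + 1) % 3] * d[(i + 2) % 3]
            return w / w.sum()
        # c[i], s[i]: cosine and signed sine of the spherical angle at corner i
        c = np.array([2.0 * np.sin(h) * np.sin(h - theta[i])
                      / (np.sin(theta[(i + 1) % 3]) * np.sin(theta[(i + 2) % 3])) - 1.0
                      for i in range(3)])
        s = np.sign(np.linalg.det(u)) * np.sqrt(np.clip(1.0 - c ** 2, 0.0, None))
        if np.any(np.abs(s) <= eps):              # x in the triangle's plane but outside it
            continue
        for i in range(3):
            nxt, prv = (i + 1) % 3, (i + 2) % 3
            w[tri[i]] += ((theta[i] - c[nxt] * theta[prv] - c[prv] * theta[nxt])
                          / (d[i] * np.sin(theta[nxt]) * s[prv]))
    return w / w.sum()
```

For cage deformation, the weights of each embedded point are computed once against the rest cage and then reused every frame as `w @ deformed_cage_verts`.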
-
Deformation-gradient feature vectors and nonlinear example span enable direct mesh posing through vertex constraints without an explicit skeleton.
abstract
The ability to position a small subset of mesh vertices and produce a meaningful overall deformation of the entire mesh is a fundamental task in mesh editing and animation. However, the class of meaningful deformations varies from mesh to mesh and depends on mesh kinematics, which prescribes valid mesh configurations, and a selection mechanism for choosing among them. Drawing an analogy to the traditional use of skeleton-based inverse kinematics for posing skeletons, we define mesh-based inverse kinematics as the problem of finding meaningful mesh deformations that meet specified vertex constraints. Our solution relies on example meshes to indicate the class of meaningful deformations. Each example is represented with a feature vector of deformation gradients that capture the affine transformations which individual triangles undergo relative to a reference pose. To pose a mesh, our algorithm efficiently searches among all meshes with specified vertex positions to find the one that is closest to some pose in a nonlinear span of the example feature vectors. Since the search is not restricted to the span of example shapes, this produces compelling deformations even when the constraints require poses that are different from those observed in the examples.
Related Inverse Kinematics for Reduced Deformable Models · Character Articulation through Profile Curves · Laplacian Surface Editing · Avatar Reshaping and Automatic Rigging Using a Deformable Model
how to read this
- Category
- Method: example-based mesh deformation (mesh inverse kinematics)
- Contributions
-
- Defines mesh-based inverse kinematics: finding meaningful deformations that satisfy specified vertex constraints without an explicit skeleton
- Represents each example mesh by a feature vector of per-triangle deformation gradients relative to a reference pose
- Poses a mesh by searching for the constrained mesh closest to a nonlinear span of the example feature vectors, allowing compelling results even outside the example shapes
- Context
- Draws an explicit analogy to skeleton-based inverse kinematics and adopts the deformation-gradient representation used in mesh editing, applying both to example-driven posing. Builds on: Laplacian Surface Editing
- Correctness
- Quality hinges on the example meshes spanning the desired class of deformations; because the search is not restricted to the example span it can extrapolate, but constraints far from the examples or sparse example sets may yield less meaningful results.
- Clarity
- The IK analogy makes the goal accessible; a first pass conveys the idea, a second pass is needed for the deformation-gradient feature vector and the nonlinear span optimization.
- How to read it
- First pass for the concept (skeleton-free, example-driven posing); a second pass on the feature-vector construction and the closest-pose search is worth it if you build deformation or editing tools.
Rigging / Skinning
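The per-triangle deformation gradient at the heart of the feature vector is easy to compute once each triangle is given a third, out-of-plane direction. The sketch below uses the scaled-normal construction familiar from deformation transfer; the function names are illustrative, and the nonlinear blending of example feature vectors and the constrained solve are not shown.

```python
import numpy as np

def triangle_frame(v1, v2, v3):
    """3x3 matrix of two edge vectors plus a scaled normal as the third column."""
    e1, e2 = v2 - v1, v3 - v1
    n = np.cross(e1, e2)
    return np.column_stack([e1, e2, n / np.sqrt(np.linalg.norm(n))])

def deformation_gradient(ref_tri, def_tri):
    """Affine map T carrying the reference triangle frame to the deformed one."""
    return triangle_frame(*def_tri) @ np.linalg.inv(triangle_frame(*ref_tri))

def feature_vector(ref_verts, def_verts, tris):
    """Concatenate all per-triangle deformation gradients (9 numbers each)."""
    return np.concatenate([deformation_gradient(ref_verts[t], def_verts[t]).ravel()
                           for t in tris])
```

A rigidly rotated triangle yields exactly the rotation matrix and a uniformly scaled one yields a scaled identity, and the gradient is unaffected by translation, which is why vertex constraints are needed to pin the reconstructed mesh in space.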
-
Drives an elastic volumetric body model with the animator's skeleton, so skeletal motion induces physically plausible secondary deformation and dynamics, combining skeleton-based control with elasticity-based realism.
abstract
A framework for physically based rigging of deformable characters in which an elastic volumetric model is driven by the animator's skeleton. Skeletal motion induces physically plausible deformation and secondary dynamics through a linear elasticity solve, combining the controllability of skeleton-based rigging with the realism of elasticity-based simulation, and supporting effects such as muscle bulging and jiggle without per-pose sculpting.
Related Interactive Skeleton-Driven Dynamic Deformations · Robust Treatment of Degenerate Elements in Interactive Corotational FEM Simulations · Somigliana Coordinates: an Elasticity-Derived Approach for Cage Deformation · Bodyopt: A Character Deformation Pipeline for Avatar: The Way of Water
how to read this
- Category
- Method: physically based rigging for deformable characters
- Contributions
-
- A framework where an elastic volumetric body model is driven by the animator's skeleton
- Skeletal motion induces physically plausible deformation and secondary dynamics via a linear elasticity solve
- Supports effects such as muscle bulging and jiggle without per-pose sculpting, combining skeletal controllability with elasticity-based realism
- Context
- Builds directly on the authors' skeleton-driven dynamic deformation work (Capell et al., Interactive Skeleton-Driven Dynamic Deformations), adding a physically based, elasticity-driven rigging layer. Builds on: Interactive Skeleton-Driven Dynamic Deformations
- Correctness
- Realism comes from linear elasticity, which is accurate for moderate deformation but can be less faithful under large or highly nonlinear deformation; results show plausible secondary motion rather than ground-truth-validated tissue behavior.
- Clarity
- Accessible to readers with basic FEM/elasticity background; a first pass conveys the skeleton-drives-elastic-body idea, a second pass clarifies the elasticity solve and coupling.
- How to read it
- First pass for the rigging concept and the controllability-versus-realism tradeoff; second pass on the linear elasticity formulation if you implement physically based secondary motion.
Rigging / Skinning
-
Foundational FEM flesh simulation using quasistatic Newton-Raphson iteration robust to element inversion, enabling character soft-tissue deformation.
abstract
This paper presents a quasistatic finite element algorithm for robustly simulating deformable flesh attached to a kinematic skeleton. To enable fast conjugate gradient solvers during Newton-Raphson iteration, the method modifies the element stiffness matrices to guarantee positive definiteness even under heavy compression and large boundary condition jumps, by diagonalizing the deformation gradient and clamping negative eigenvalues. Building on invertible finite elements, it smoothly extends elastic forces into the inverted and degenerate regime, removing the artificial time step restrictions usually needed to prevent mesh inversion. A penalty-based, level-set-driven strategy is introduced for handling collision and self-collision of deformable tetrahedral bodies, demonstrated on flesh and muscle of the upper torso derived from the Visible Human data set.
Related Creating and Simulating Skeletal Muscle from the Visible Human Data Set · Dynamic Deformables: Implementation and Production Practicalities · Art-Directed Muscle Simulation for High-End Facial Animation · FEM Simulation of 3D Deformable Solids: A Practitioner's Guide to Theory, Discretization and Model Reduction
how to read this
- Category
- Method: robust quasistatic FEM for flesh simulation
- Contributions
- A quasistatic finite element algorithm for simulating deformable flesh attached to a kinematic skeleton
- Modifies element stiffness matrices (diagonalizing the deformation gradient and clamping negative eigenvalues) to guarantee positive definiteness for fast conjugate-gradient solves, even under heavy compression
- Extends invertible finite elements into the degenerate regime and adds a penalty-based, level-set-driven strategy for collision and self-collision of tetrahedral bodies
- Context
- Builds on invertible finite elements and on the authors' muscle work (Creating and Simulating Skeletal Muscle from the Visible Human Data Set), targeting robust soft-tissue deformation for characters. Builds on: Creating and Simulating Skeletal Muscle from the Visible Human Data Set
- Correctness
- Demonstrated on flesh and muscle of the upper torso from the Visible Human data set; the quasistatic assumption omits true inertial dynamics, so it targets pose-dependent deformation rather than full dynamic response, and collision handling is penalty-based (parameter sensitive).
- Clarity
- Technically dense; a first pass conveys the robustness goals, but the eigenvalue clamping and invertible-element handling require a careful second or third pass.
- How to read it
- First pass for the robustness contributions (positive-definiteness fix, inversion handling); plan a slow second/third pass on the linearization and collision sections if you implement an FEM flesh solver.
Muscles
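The positive-definiteness fix described in the entry above rests on clamping eigenvalues. The following is a minimal sketch of that general idea only, not the paper's element-level construction (which works in the diagonalized deformation-gradient space): it clamps the eigenvalues of a small symmetric matrix so the result is positive semidefinite. The function name `clamp_to_psd` and the 2x2 setting are assumptions made for illustration.

```python
import math


def clamp_to_psd(a, b, d, floor=0.0):
    """Return (a', b', d') for the symmetric matrix [[a, b], [b, d]]
    after raising every eigenvalue below `floor` up to `floor`."""
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    lam_hi, lam_lo = mean + radius, mean - radius

    # Unit eigenvector of the larger eigenvalue is (cos t, sin t).
    t = 0.5 * math.atan2(2.0 * b, a - d)
    c, s = math.cos(t), math.sin(t)

    # Clamp, then rebuild Q * diag(lam) * Q^T with Q = [[c, -s], [s, c]].
    lam_hi, lam_lo = max(lam_hi, floor), max(lam_lo, floor)
    return (
        lam_hi * c * c + lam_lo * s * s,
        (lam_hi - lam_lo) * c * s,
        lam_hi * s * s + lam_lo * c * c,
    )


# [[1, 2], [2, 1]] has eigenvalues 3 and -1; clamping removes the negative one.
fixed = clamp_to_psd(1.0, 2.0, 1.0)
# A matrix that is already positive definite passes through unchanged.
same = clamp_to_psd(2.0, 0.0, 1.0)
```

A matrix modified this way stays symmetric and gains no negative eigenvalues, which is the property a conjugate gradient solver needs.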
-
Statistical body shape model factoring body shape from pose deformation, foundational for parameterized human body models used in vision and animation.
abstract
SCAPE (Shape Completion and Animation for PEople) is a data-driven method for building a human shape model that spans variation in both subject shape and pose. It learns a pose deformation model that derives non-rigid surface deformation as a function of the articulated skeleton pose, and a separate body shape model captured with principal component analysis over a set of 3D scans of different people. The two models combine to produce surface meshes with realistic muscle deformation for new people in new poses. The model is used for shape completion, generating a full surface mesh from a limited set of markers, with applications to partial view completion and to animating a moving person from a single static scan plus a marker motion capture sequence.
Related Avatar Reshaping and Automatic Rigging Using a Deformable Model · Surface Based Motion Retargeting by Preserving Spatial Relationship · Data-driven Modeling of Skin and Muscle Deformation · Layered Construction for Deformable Animated Characters
how to read this
- Category
- Method / model: a data-driven statistical human body model
- Contributions
- A data-driven model spanning variation in both subject shape and pose
- A pose deformation model giving non-rigid surface deformation as a function of articulated skeleton pose, plus a separate PCA shape model over 3D scans of many people
- Combines the two to synthesize realistic posed meshes for new people, enabling shape completion from sparse markers and animation from a single static scan plus marker motion
- Context
- A foundational parameterized body model that factors shape from pose deformation, widely built upon in later vision and animation body-model work.
- Correctness
- Learned from a corpus of 3D scans, so fidelity depends on how well that corpus spans target body types and poses; the pose-dependent deformation is a learned approximation rather than physical simulation, and extrapolation beyond the training distribution is a known caution.
- Clarity
- Conceptually clear with a clean shape/pose separation; a first pass conveys the model, a second pass clarifies the deformation learning and completion procedure.
- How to read it
- Worth a careful read as a foundational reference; first pass for the shape-versus-pose factorization, second pass on the deformation model and completion if you use or extend statistical body models.
Skinning / Retargeting
-
Method to extract linear blend skinning weights and skeleton from animated mesh sequences, enabling compact representation of captured deformations.
abstract
This paper extends character skinning techniques to the general setting of skinning arbitrary deformable mesh animations without requiring user-defined skeletons or bones. The method automatically estimates proxy bone transformations by applying nonparametric mean shift clustering to high-dimensional triangle rotation sequences computed via polar decomposition, then uses robust least squares to determine bone-vertex influence sets and vertex weights, preferring nonnegative least squares to avoid over-fitting. A low-rank displacement correction model defined in the rest pose provides progressive convergence with a fixed number of bones. The resulting skinned mesh animations enable hardware-accelerated rendering with matrix palette skinning, rest pose editing, level-of-detail generation, and reduced-coordinate deformable collision detection.
Related Real-Time Weighted Pose-Space Deformation on the GPU · Smooth Skinning Decomposition with Rigid Bones · Efficient Elasticity for Character Skinning with Contact and Collisions · Delta Mush: Smoothing Deformations While Preserving Detail
how to read this
- Category
- Method: extracting a skinned rig from mesh animation
- Contributions
- Skins arbitrary deformable mesh animations without user-defined skeletons or bones
- Estimates proxy bone transformations via nonparametric mean shift clustering of per-triangle rotation sequences (from polar decomposition), with robust nonnegative least squares for bone-vertex influence and weights
- Adds a rest-pose low-rank displacement correction for progressive accuracy, enabling hardware matrix-palette rendering, rest-pose editing, level of detail, and reduced-coordinate collision
- Context
- Generalizes character skinning (linear blend skinning / matrix-palette skinning) to a fitting problem over captured or simulated mesh animations, producing a compact skinned approximation.
- Correctness
- Approximation quality depends on how well the motion is captured by proxy-bone rigid transforms plus a low-rank correction; highly non-articulated or extreme deformations may need more bones or larger corrections, and the result is a fit rather than an exact reproduction.
- Clarity
- Accessible pipeline with well-known building blocks (polar decomposition, clustering, least squares); a first pass conveys the method, a second pass clarifies the clustering and weight-solve specifics.
- How to read it
- First pass for the extract-a-rig-from-animation idea and its applications; second pass on the clustering and least-squares stages if you implement skinning fitting or compression.
Skinning
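The skinned approximation fitted in the entry above is, at its core, linear blend skinning: each deformed vertex is a weighted sum of its rest position moved by every influencing bone transform. A minimal 2D sketch follows, with hypothetical helper names (`rigid`, `skin_vertex`); it leaves out the clustering, the weight solve, and the displacement correction described in the abstract.

```python
import math


def rigid(angle, tx, ty):
    """2x3 rigid transform: rotation by `angle` (radians), then translation."""
    c, s = math.cos(angle), math.sin(angle)
    return ((c, -s, tx), (s, c, ty))


def skin_vertex(rest, bones, weights):
    """Linear blend skinning: sum_i w_i * (R_i * rest + t_i)."""
    x, y = rest
    out_x = out_y = 0.0
    for (row0, row1), w in zip(bones, weights):
        out_x += w * (row0[0] * x + row0[1] * y + row0[2])
        out_y += w * (row1[0] * x + row1[1] * y + row1[2])
    return out_x, out_y


# One bone stays put, the other slides two units along x; equal weights
# place the vertex halfway between the two bone-space results.
bones = [rigid(0.0, 0.0, 0.0), rigid(0.0, 2.0, 0.0)]
skinned = skin_vertex((1.0, 0.0), bones, [0.5, 0.5])
```

Fitting a skinned rig to a mesh animation amounts to choosing the bone transforms and weights so that this sum reproduces the captured vertex positions as closely as possible.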
2004
5-
Animates the ostrich feather on Puss in Boots' hat in Shrek 2 by layering render-time procedural barb motion on a coarse, stable simulation of an underlying surface.
abstract
Puss in Boots, a new character in Shrek 2, wears an ostrich feather in his hat that had to move convincingly. The technique combines a coarse simulation of an underlying surface, designed to run quickly and produce stable predictable results, with detailed procedural animation applied to the barbs. The barb animation is computed at render time and oscillates the individual barbs of the feather to add lively secondary motion.
Related Feathers for Mystical Creatures: Pegasus · Rendertime Procedural Feathers Through Blended Guide Meshes · Hummingbird: DreamWorks Feather System · Mesh-Driven Generation and Animation of Groomed Feathers
how to read this
- Category
- Production talk / breakdown: feather animation for a film character
- Contributions
- Demonstrates animating Puss in Boots' ostrich hat feather in Shrek 2 by combining a coarse, fast, stable underlying-surface simulation with detailed procedural barb animation
- Computes barb motion at render time, oscillating individual barbs to add lively secondary motion
- Context
- A production technique in the secondary-motion and procedural-detail tradition, pairing a cheap simulated base with render-time procedural enrichment for feather dynamics.
- Correctness
- Studio practice, not peer-reviewed; the coarse-sim-plus-procedural-barbs approach is production-proven for one character's shot needs, and its generality beyond that look is not claimed.
- Clarity
- Very accessible; a single read conveys the coarse-base-plus-procedural-detail idea.
- How to read it
- Read once for the layering strategy (stable coarse sim + render-time procedural barbs); a useful mental pattern for adding secondary motion cheaply rather than a formal method.
CFX
-
Discovers logically similar motion variants in large capture databases and builds continuous parameterized motion families automatically.
abstract
Large motion data sets often contain many variants of the same kind of motion, but without appropriate tools it is difficult to fully exploit this fact. This paper provides automated methods for identifying logically similar motions in a data set and using them to build a continuous and intuitively parameterized space of motions. To find logically similar motions that are numerically dissimilar, our search method employs a novel distance metric to find "close" motions and then uses them as intermediaries to find more distant motions. Search queries are answered at interactive speeds through a precomputation that compactly represents all possibly similar motion segments. Once a set of related motions has been extracted, we automatically register them and apply blending techniques to create a continuous space of motions. Given a function that defines relevant motion parameters, we present a method for extracting motions from this space that accurately possess new parameters requested by the user. Our algorithm extends previous work by explicitly constraining blend weights to reasonable values and having a run-time cost that is nearly independent of the number of example motions. We present experimental results on a test data set of 37,000 frames, or about ten minutes of motion sampled at 60 Hz.
Related Dog Code: Human to Quadruped Embodiment Using Shared Codebooks · Motion Warping · Verbs and Adverbs: Multidimensional Motion Interpolation · Physically Based Motion Transformation
how to read this
- Category
- Method: motion-data search and parameterization
- Contributions
- Automated identification of logically similar motions using a novel distance metric plus intermediary motions to bridge numerically dissimilar but related clips
- Interactive-speed search via a precomputation that compactly represents all possibly similar motion segments
- Automatic registration and blending into a continuously, intuitively parameterized motion space, with blend weights constrained to reasonable values and run-time cost nearly independent of example count
- Context
- Extends data-driven motion synthesis built on motion databases (notably Kovar, Gleicher and Pighin's Motion Graphs) from connecting clips toward extracting and parameterizing families of similar motions. Builds on: Motion Graphs
- Correctness
- Relies on a distance metric and chained intermediaries to capture logical similarity, and on blends staying valid within the constrained weight range; how accurately a requested parameter is hit depends on example coverage and on the user-supplied parameter function.
- Clarity
- Clearly written; a first pass conveys the search-then-parameterize pipeline, a second pass is needed for the distance metric, precomputation, and registration/blending details.
- How to read it
- Focus on the distance metric and the intermediary-based search, then on how the parameterized space is built; a second pass pays off if you want interactive query or controllable blending.
Motion Synthesis / Retargeting
-
Transfers the deformation of a source mesh onto a different target mesh, the standard tool behind blendshape and rig transfer.
abstract
Deformation transfer applies the deformation exhibited by a source triangle mesh onto a different target triangle mesh. Our approach is general and does not require the source and target to share the same number of vertices or triangles, or to have identical connectivity. The user builds a correspondence map between the triangles of the source and those of the target by specifying a small set of vertex markers. Deformation transfer computes the set of transformations induced by the deformation of the source mesh, maps the transformations through the correspondence from the source to the target, and solves an optimization problem to consistently apply the transformations to the target shape. The resulting system of linear equations can be factored once, after which transferring a new deformation to the target mesh requires only a backsubstitution step. Global properties such as foot placement can be achieved by constraining vertex positions. We demonstrate our method by retargeting full body key poses, applying scanned facial deformations onto a digital character, and remapping rigid and non-rigid animation sequences from one mesh onto another.
Related Transferring Facial Expressions to Different Face Models · Surface Based Motion Retargeting by Preserving Spatial Relationship · Transferring the Rig and Animations from a Character to Different Face Models · Laplacian Surface Editing
how to read this
- Category
- Method: mesh deformation transfer (correspondence + linear optimization)
- Contributions
- Transfers deformation from a source triangle mesh to a target with different vertex/triangle count and connectivity, using a small set of user vertex markers to build a triangle correspondence
- Maps source deformation transformations through the correspondence and solves an optimization to apply them consistently to the target
- Factors the linear system once so each new transfer needs only a backsubstitution, with vertex constraints for global properties like foot placement
- Context
- Builds on motion/deformation retargeting work such as Gleicher's Retargeting Motion to New Characters, generalizing it to arbitrary triangle meshes and becoming the standard basis for blendshape and rig transfer. Builds on: Retargeting Motion to New Characters
- Correctness
- Quality hinges on the user-specified marker correspondence and on deformations being well-represented by per-triangle affine transforms; demonstrated on full-body key poses, scanned facial deformations, and rigid/non-rigid sequences, so very different topologies or sparse/poor markers can degrade results.
- Clarity
- Clearly presented; a first pass conveys the correspondence-and-solve idea, a second pass is worth it for the transformation formulation and the prefactored linear system.
- How to read it
- Read for how triangle transformations are defined and mapped through correspondence; second-pass the optimization setup and the single-factorization trick if you plan to implement or extend transfer.
Retargeting / Skinning
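The per-triangle transformations that deformation transfer maps from source to target can be illustrated in 2D, where a triangle's affine transform follows directly from its two edge vectors. This is a sketch under that simplification, with hypothetical function names. The paper works with 3D triangles and then solves a global optimization so that neighbouring target triangles agree, and neither part is shown here.

```python
def edge_matrix(tri):
    """Columns are the two edge vectors leaving the first vertex."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    return ((x2 - x1, x3 - x1), (y2 - y1, y3 - y1))


def inverse2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("degenerate triangle: edge matrix is singular")
    return ((d / det, -b / det), (-c / det, a / det))


def matmul2(p, q):
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def triangle_transform(rest, deformed):
    """Linear part Q with Q * rest_edges = deformed_edges.
    Translation drops out because only edge vectors are used."""
    return matmul2(edge_matrix(deformed), inverse2(edge_matrix(rest)))


rest = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
# Stretched by 2 along x and moved to (5, 5).
deformed = ((5.0, 5.0), (7.0, 5.0), (5.0, 6.0))
Q = triangle_transform(rest, deformed)
```

Because `Q` ignores translation, applying each source triangle's `Q` independently to its target counterpart would tear the target mesh apart; this is why a consistency-enforcing solve is needed on top.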
-
Encodes each vertex by the mesh Laplacian relative to its neighbourhood, an intrinsic differential representation that preserves surface detail under deformation and lets detail be transferred or transplanted between meshes.
abstract
Surface editing operations commonly require geometric details of the surface to be preserved as much as possible. We argue that geometric detail is an intrinsic property of a surface and that, consequently, surface editing is best performed by operating over an intrinsic surface representation. We provide such a representation of a surface, based on the Laplacian of the mesh, by encoding each vertex relative to its neighborhood. The Laplacian of the mesh is enhanced to be invariant to locally linearized rigid transformations and scaling. Based on this Laplacian representation, we develop useful editing operations: interactive free-form deformation in a region of interest based on the transformation of a handle, transfer and mixing of geometric details between two surfaces, and transplanting of a partial surface mesh onto another surface. The main computation involved in all operations is the solution of a sparse linear system, which can be done at interactive rates. We demonstrate the effectiveness of our approach in several examples, showing that the editing operations change the shape while respecting the structural geometric detail.
Related Direct Manipulation of Free-Form Deformations · Surface Based Motion Retargeting by Preserving Spatial Relationship · Deformation Transfer for Triangle Meshes · Mesh-Based Inverse Kinematics
how to read this
- Category
- Method: differential (Laplacian) coordinates for detail-preserving mesh editing and transfer
- Contributions
- Represents a surface intrinsically by the mesh Laplacian, encoding each vertex relative to its one-ring neighbourhood so geometric detail is stored explicitly rather than as absolute positions
- Makes that Laplacian representation invariant to locally linearized rigid transformation and scaling, so handle-driven edits rotate and stretch detail correctly
- Builds three operations on it: interactive handle-based free-form deformation, transfer and mixing of geometric detail between two surfaces, and transplanting a partial mesh onto another, each reduced to one sparse linear solve at interactive rates
- Context
- A root of the differential-coordinate line of mesh deformation: it formalized Laplacian coordinates that later detail-transfer, as-rigid-as-possible and biharmonic-weight methods lean on, and it is a sibling of the contemporaneous Deformation Transfer for Triangle Meshes. Mesh-Based Inverse Kinematics descends from this representation.
- Correctness
- The rigid invariance comes from a local linearization of the implicit per-vertex transform, so very large rotations can still distort detail; results depend on clean one-ring neighbourhoods and a well-conditioned Laplacian, and the cited examples are interactive single-mesh edits rather than full articulated rigs.
- Clarity
- Readable and example-driven; a first pass conveys the encode-relative-to-neighbours idea, a second pass is needed for the rotation-invariance derivation and the least-squares system.
- How to read it
- First pass: the intrinsic Laplacian encoding and why detail is preserved. Second pass: the rigid/scale invariance trick and the sparse least-squares formulation, especially the detail-transfer-between-meshes section if rigging detail transfer is your interest.
Skinning / Retargeting
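The "encode each vertex relative to its neighbourhood" idea in the entry above can be sketched with uniform weights: the differential coordinate of a vertex is its position minus the centroid of its one-ring. Uniform weighting and the function name `laplacian_coordinates` are assumptions for illustration; the sketch also omits the paper's rotation and scale handling and the sparse solve that reconstructs positions.

```python
def laplacian_coordinates(vertices, neighbours):
    """delta_i = v_i - mean of the one-ring neighbours of vertex i."""
    deltas = []
    for i, v in enumerate(vertices):
        ring = neighbours[i]
        centroid = [
            sum(vertices[j][k] for j in ring) / len(ring) for k in range(len(v))
        ]
        deltas.append(tuple(v[k] - centroid[k] for k in range(len(v))))
    return deltas


# A small spike: vertex 0 sits one unit above a flat ring of four vertices.
vertices = [(0, 0, 1), (1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0)]
neighbours = [[1, 2, 3, 4], [0, 3, 4], [0, 3, 4], [0, 1, 2], [0, 1, 2]]
deltas = laplacian_coordinates(vertices, neighbours)

# Moving the whole mesh leaves the differential coordinates untouched.
moved = [(x + 10, y, z) for x, y, z in vertices]
moved_deltas = laplacian_coordinates(moved, neighbours)
```

The spike's height survives in `deltas[0]` regardless of where the mesh sits, which is the sense in which detail is stored intrinsically rather than as absolute positions.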
-
Gaussian process model for style-aware inverse kinematics generating human-like poses respecting learned stylistic priors.
abstract
This paper presents an inverse kinematics system based on a learned model of human poses. Given a set of constraints, our system can produce the most likely pose satisfying those constraints, in real-time. Training the model on different input data leads to different styles of IK. The model is represented as a probability distribution over the space of all possible poses. This means that our IK system can generate any pose, but prefers poses that are most similar to the space of poses in the training data. We represent the probability with a novel model called a Scaled Gaussian Process Latent Variable Model. The parameters of the model are all learned automatically; no manual tuning is required for the learning component of the system. We additionally describe a novel procedure for interpolating between styles. Our style-based IK can replace conventional IK, wherever it is used in computer animation and computer vision. We demonstrate our system in the context of a number of applications: interactive character posing, trajectory keyframing, real-time motion capture with missing markers, and posing from a 2D image.
Related A Facial Motion Retargeting Pipeline for Appearance Agnostic 3D Characters · Optimal and Interactive Keyframe Selection for Motion Capture · Creating an Actor-Specific Facial Rig from Performance Capture · Facial Retargeting with Automatic Range of Motion Alignment
how to read this
- Category
- Method: learned-prior inverse kinematics (probabilistic pose model)
- Contributions
- An IK system that, given constraints, produces the most likely pose in real time under a learned probability distribution over poses
- A novel Scaled Gaussian Process Latent Variable Model whose parameters are learned automatically with no manual tuning, where training data determines the IK style
- A procedure for interpolating between styles, demonstrated on posing, trajectory keyframing, real-time mocap with missing markers, and posing from a 2D image
- Context
- Sits in the data-driven, statistical character-posing lineage, applying Gaussian-process latent-variable modeling as a learned pose prior to constrain inverse kinematics toward human-like results.
- Correctness
- The system can reach any pose but is biased toward the training distribution, so style and plausibility depend on the training data; readers should note out-of-distribution constraints will be pulled toward learned poses, which is the intended behavior but a limitation for novel motions.
- Clarity
- The application framing is accessible, but the SGPLVM core is mathematically dense; a first pass conveys the learned-prior idea, a second and likely third pass are needed for the model.
- How to read it
- Read first for the probabilistic-IK framing and the demos that show what the prior buys you; budget a careful second/third pass on the Scaled GPLVM if you need the formulation.
Rigging / Retargeting
2003
6-
Discrete differential geometry formulation for thin elastic shells using hinge-angle bending energy, foundational for cloth and thin surface simulation.
abstract
This paper introduces a discrete shell model describing the behavior of thin flexible structures such as hats, leaves, and aluminum cans, which are characterized by a curved undeformed configuration. The model is governed by nonlinear membrane and flexural energies derived geometrically over triangle meshes, with the bending energy expressed as the squared difference of dihedral angles between the deformed and undeformed configurations to measure change in mean curvature. The formulation is invariant under rigid body transformation and can be implemented with only a small change to a standard cloth simulator. The authors demonstrate convincing simulations of materials ranging from paper to metal, including a comparison of a real and simulated falling hat and plastically deformed creased paper.
Related Simulating Cloth Using Bilinear Elements · Cloth and Skin Deformation with a Triangle Mesh Based Convolutional Neural Network · Mixing Yarns and Triangles in Cloth Simulation · Multi-Resolution Isotropic Strain Limiting
how to read this
- Category
- Method: discrete differential geometry model for thin shells
- Contributions
- A discrete shell model for thin flexible structures with a curved undeformed configuration, with nonlinear membrane and flexural energies defined geometrically over triangle meshes
- A bending energy expressed as the squared difference of dihedral angles between deformed and undeformed states, measuring change in mean curvature and invariant under rigid motion
- An implementation requiring only a small change to a standard cloth simulator, with materials ranging from paper to metal
- Context
- Builds on Baraff and Witkin's Large Steps in Cloth Simulation and the discrete-differential-geometry tradition, recasting thin-shell bending as a geometric hinge-angle energy. Builds on: Large Steps in Cloth Simulation
- Correctness
- Demonstrated across paper-to-metal materials, including a real-versus-simulated falling-hat comparison and creased paper. The model is a triangle-mesh hinge formulation, so behavior depends on mesh resolution and energy parameters, and plasticity is shown only for the demonstrated cases rather than derived from a general elastoplastic theory.
- Clarity
- Accessible for a DDG paper; a first pass conveys the dihedral-angle bending idea, a second pass is needed for the energy derivation.
- How to read it
- Focus on the hinge-angle bending energy and its rigid-motion invariance, plus the curved rest-state assumption; a second pass on the energy formulation pays off if you implement or modify a shell solver.
CFX
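The hinge-angle bending term in the entry above can be sketched as follows: compute the angle between the normals of the two triangles that share an edge, then penalize its squared difference from the rest angle. This is an illustrative sketch with hypothetical function names; it uses unit weights per hinge, whereas the paper scales each hinge term by rest-shape geometry, and it omits the membrane energy entirely.

```python
import math


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def hinge_angle(p0, p1, p2, p3):
    """Signed angle between the normals of triangles (p0, p1, p2) and
    (p1, p0, p3), which share edge p0-p1. Zero when the pair is flat."""
    e = sub(p1, p0)
    n1 = cross(e, sub(p2, p0))
    n2 = cross(sub(p3, p0), e)
    sin_t = dot(cross(n1, n2), e) / math.sqrt(dot(e, e))
    cos_t = dot(n1, n2)
    return math.atan2(sin_t, cos_t)


def bending_energy(angles, rest_angles):
    """Unweighted sum of squared hinge-angle differences."""
    return sum((t - t0) ** 2 for t, t0 in zip(angles, rest_angles))


p0, p1, p2 = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.5, 1.0, 0.0)
flat = hinge_angle(p0, p1, p2, (0.5, -1.0, 0.0))   # coplanar wings
folded = hinge_angle(p0, p1, p2, (0.5, 0.0, 1.0))  # one wing at a right angle
energy = bending_energy([folded], [flat])
```

Because the energy depends only on angles between neighbouring triangles, rotating or translating the whole mesh leaves it unchanged, and a nonzero rest angle encodes a shell that is curved when undeformed.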
-
User paints a timeline with semantic annotations and the system assembles motion-capture frames via dynamic programming to satisfy them.
abstract
This paper describes a framework that allows a user to synthesize human motion while retaining control of its qualitative properties. The user paints a timeline with annotations (such as walk, run or jump) from a vocabulary which is freely chosen by the user. The system then assembles frames from a motion database so that the final motion performs the specified actions at specified times. The motion can also be forced to pass through particular configurations at particular times, and to go to a particular position and orientation. Annotations can be painted positively (for example, must run), negatively (for example, may not run backwards) or as a don't-care. The system uses a novel search method, based around dynamic programming at several scales, to obtain a solution efficiently so that authoring is interactive. Our results demonstrate that the method can generate smooth, natural-looking motion. The annotation vocabulary can be chosen to fit the application, and allows specification of composite motions (run and jump simultaneously, for example). The process requires a collection of motion data that has been annotated with the chosen vocabulary.
Related A Deep Learning Framework for Character Motion Synthesis and Editing · Dog Code: Human to Quadruped Embodiment Using Shared Codebooks · Physically Based Motion Transformation · Character Motion Synthesis by Topology Coordinates
how to read this
- Category
- Method: annotation-driven motion synthesis from mocap
- Contributions
- A framework where a user paints a timeline with a freely chosen annotation vocabulary (walk, run, jump) and the system assembles matching frames from a motion database
- Support for positive, negative, and don't-care annotations plus positional and configuration constraints, including composite simultaneous motions
- A multi-scale dynamic-programming search that solves the assembly interactively for smooth, natural-looking motion
- Context
- Builds directly on Kovar et al.'s Motion Graphs, adding a semantic annotation layer and a dynamic-programming search so users specify qualitative content rather than just paths. Builds on: Motion Graphs
- Correctness
- Validated by smooth, natural-looking results that satisfy the painted constraints. The method requires a motion database pre-annotated with the chosen vocabulary, so coverage and label quality bound what can be synthesized.
- Clarity
- Accessible; a first pass conveys the paint-the-timeline workflow, a second pass clarifies the multi-scale dynamic programming.
- How to read it
- Focus on the annotation interface and the multi-scale dynamic-programming search; a second pass is worth it for the search formulation and how constraints are encoded.
Motion Synthesis
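The assembly step in the entry above can be illustrated with a plain, single-scale dynamic program: pick one database frame per timeline slot so that painted labels are respected and consecutive frames join cheaply. This is a simplified sketch, not the paper's multi-scale search; the function name `assemble`, the mismatch penalty, and the toy transition cost are all assumptions, and only positive and don't-care annotations are modelled.

```python
def assemble(timeline, frame_labels, transition_cost, mismatch_penalty=10.0):
    """Choose one frame index per timeline slot by dynamic programming.
    A timeline entry of None means "don't care"."""
    n = len(frame_labels)

    def label_cost(t, f):
        want = timeline[t]
        return 0.0 if want is None or want == frame_labels[f] else mismatch_penalty

    cost = [label_cost(0, f) for f in range(n)]
    back = []  # back[t - 1][f] = best predecessor of frame f at slot t
    for t in range(1, len(timeline)):
        new_cost, choice = [], []
        for f in range(n):
            prev = min(range(n), key=lambda g: cost[g] + transition_cost(g, f))
            new_cost.append(cost[prev] + transition_cost(prev, f) + label_cost(t, f))
            choice.append(prev)
        cost = new_cost
        back.append(choice)

    # Walk the stored predecessors backwards from the cheapest final frame.
    f = min(range(n), key=cost.__getitem__)
    path = [f]
    for choice in reversed(back):
        f = choice[f]
        path.append(f)
    return path[::-1]


frame_labels = ["walk", "walk", "run", "run"]


def next_frame_is_free(g, f):
    """Toy cost: playing the recorded successor is free, any jump costs 1."""
    return 0.0 if f == g + 1 else 1.0


path = assemble(["walk", "walk", "run", "run"], frame_labels, next_frame_is_free)
```

Each slot costs time quadratic in the number of frames here, which is one reason a search over a real motion database needs something more economical than this direct form.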
-
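Physics-based synthesis of bird flight: an articulated skeleton with elastically deformable feathers is driven by joint torques and aerodynamic forces, with wingbeats optimized individually to follow a trajectory.
abstract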
This paper describes a physics-based method for synthesizing realistic bird flight animations. The bird is modeled as an articulated skeleton with elastically deformable feathers, and motion is produced by applying joint torques and aerodynamic forces in a forward dynamics simulation. Wingbeats are optimized individually and concatenated so a bird can follow a specified trajectory while taking off, cruising, descending, turning, and landing.
Related Apteryx: Procedural Generation, Sculpting and Grooming of Feathers · Group Based Rigging of Realistically Feathered Wings · Biological Modeling of Feathers by Morphogenesis Simulation · Collision-free Construction of Animated Feathers Using Implicit Constraint Surfaces
how to read this
- Category
- Method: physics-based animation synthesis (forward dynamics + optimization)
- Contributions
- Models a bird as an articulated skeleton with elastically deformable feathers, driven by joint torques and aerodynamic forces in forward dynamics
- Optimizes individual wingbeats and concatenates them to follow a specified trajectory
- Covers a full repertoire of flight phases (takeoff, cruising, descending, turning, landing)
- Context
- Sits in the physics-based character animation lineage, extending optimization-driven motion ideas such as Popovic and Witkin's Physically Based Motion Transformation to the aerodynamics of flapping flight. Builds on: Physically Based Motion Transformation
- Correctness
- Realism rests on the chosen aerodynamic force model and the assumption that per-wingbeat optimization concatenates into smooth trajectory-following motion; a reader should treat results as physically plausible synthesis rather than validated against measured bird kinematics.
- Clarity
- Conceptually accessible; a first pass conveys the simulate-then-optimize idea, and a second pass is needed for the aerodynamic force model and the wingbeat optimization formulation.
- How to read it
- Focus on how aerodynamic forces are computed on the feathers and how the per-wingbeat objective is set up; a second pass on the optimization and concatenation is worth it if you care about reproducing trajectory control.
Motion Synthesis
-
Mixed explicit/implicit integrator, a physically correct bending model, interface forecasting, and a collision post-process that preserves folds and wrinkles in cloth-character contact.
abstract
This paper presents techniques for simulating clothing that matches the look and behavior of real garments by capturing folds, wrinkles, draping, and stretching. The contributions include a mixed explicit/implicit time integration scheme that handles elastic forces explicitly and damping forces implicitly, a physically correct bending model between pairs of triangles with potentially nonzero rest angles for pre-shaping wrinkles, an interface forecasting technique that promotes detail in contact regions, a post-processing method for cloth-character collisions that preserves folds and wrinkles, and a dynamic sticking constraint for controlling large-scale folding. Cloth-object collisions are handled using a level set signed distance function defined on a grid, and self-collision uses prior robust treatment. The methods were used in film production including Terminator 3 and Harry Potter.
Related Large Steps in Cloth Simulation · Untangling Cloth · Example-Based Wrinkle Synthesis for Clothing Animation · Fast Cloth Simulation on Moving Humanoids
How to read this
- Category
- Method: cloth simulation (integration, bending, collision handling)
- Contributions
- Mixed explicit/implicit time integration (elastic forces explicit, damping implicit) plus a physically correct bending model with nonzero rest angles for pre-shaping wrinkles
- Interface forecasting to promote detail in contact regions and a post-process that preserves folds and wrinkles during cloth-character collisions
- A dynamic sticking constraint for controlling large-scale folding
- Context
- Builds directly on Bridson et al.'s robust collision, contact and friction treatment for skinned cloth, layering fold and wrinkle preservation on top of that collision foundation. Builds on: Robust Treatment of Collisions, Contact and Friction for a Skinned Cloth Simulation
- Correctness
- Demonstrated on film production garments (Terminator 3, Harry Potter) and tuned for that look; the bending and forecasting choices are aimed at visual fidelity, so a reader should view it as production-validated rather than benchmarked against measured fabric mechanics.
- Clarity
- Reads as a well-structured systems paper; a first pass gives the pipeline of techniques, and a second pass is needed for the integration scheme and bending energy details.
- How to read it
- Read for the catalogue of distinct techniques and where each fits in the pipeline; second-pass the integrator and bending model, and read alongside the 2002 robust-collision paper since it is the assumed substrate.
CFX
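The explicit-elastic, implicit-damping split described in this entry can be illustrated on a single damped spring. This is a minimal sketch, not the paper's scheme: the one-particle setup, the function names, and the values of `m`, `k`, `d` and `h` are all assumptions chosen for illustration.

```python
def mixed_step(x, v, h, m=1.0, k=100.0, d=50.0):
    """Advance one step: spring force explicit, damping force implicit."""
    f_elastic = -k * x  # elastic force from the known state (explicit)
    # Implicit damping: solve m * (v_new - v) / h = f_elastic - d * v_new.
    v_new = (m * v + h * f_elastic) / (m + h * d)
    return x + h * v_new, v_new


def explicit_step(x, v, h, m=1.0, k=100.0, d=50.0):
    """Forward Euler with both forces explicit, for comparison only."""
    a = (-k * x - d * v) / m
    return x + h * v, v + h * a


def simulate(step, steps=200, h=0.05):
    """Release the particle from x = 1 at rest and return the final position."""
    x, v = 1.0, 0.0
    for _ in range(steps):
        x, v = step(x, v, h)
    return x


x_mixed = simulate(mixed_step)        # decays toward rest
x_explicit = simulate(explicit_step)  # diverges at this step size
```

With these assumed values the damping term is stiff relative to the step size, so the fully explicit update blows up while the mixed update settles toward rest.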
- Untangling Cloth
Resolves deeply interpenetrating cloth using a global geometry-based untangling approach, essential for garment initialization and collision recovery.
Abstract
Presents a history-free collision handling algorithm for cloth that resolves self-intersections by globally analyzing the current state of the cloth. Uses global intersection analysis to determine interpenetrating regions, classifies intersection paths according to endpoint positions and degenerated vertices, and applies radial basis function fitting to extrapolate correspondence information across the cloth surface. Attractive forces between corresponding points resolve intersections without requiring knowledge of past timesteps.
Related Fast Cloth Simulation on Moving Humanoids · Clean Cloth Inputs: Removing Character Self-Intersections with Volume Simulation · Dynamic Deformables: Implementation and Production Practicalities · Robust Treatment of Collisions, Contact and Friction for Cloth Animation
How to read this
- Category
- Method: history-free cloth collision recovery (untangling)
- Contributions
- A history-free algorithm that resolves cloth self-intersections by globally analyzing the current state, with no knowledge of past timesteps required
- Global intersection analysis that classifies intersection paths by endpoint and degenerate vertices, with radial basis function fitting to extrapolate correspondence across the surface
- Attractive forces between corresponding points that pull interpenetrating regions apart
- Context
- Complements robust per-step collision methods such as Bridson et al.'s skinned-cloth treatment by handling the deeply interpenetrated states those methods assume away, useful for garment initialization and recovery. Builds on: Robust Treatment of Collisions, Contact and Friction for a Skinned Cloth Simulation
- Correctness
- The approach assumes the current geometry alone carries enough structure to infer correct correspondence via RBF extrapolation; readers should note it targets recovery from existing tangles rather than preventing them, and pathological intersection topologies may stress the path classification.
- Clarity
- The high-level idea (analyze geometry now, pull tangles apart) is intuitive; a second pass is needed for the intersection-path classification and RBF correspondence machinery.
- How to read it
- Focus first on what global intersection analysis computes and why history-free matters; do a careful second pass on the path classification and RBF fitting if you intend to implement untangling.
CFX
-
Foundational Walt Disney Animation Studios (WDAS) sketch presenting XGen, a system for instancing curves and geometry on surfaces for fur, feathers, foliage, and hair with grooming brush tools.
Abstract
XGen is a flexible arbitrary primitive generator designed as a replacement for Disney's fur system. It decouples generation methods from primitives through independent modules: primitives, generators, renderers and patches. It supports interactive grooming through guides and hierarchical control, with expression-based attribute manipulation for rapid prototyping of complex looks from the same base groom across multiple characters.
Related Framestore Creatures & Houdini | Framestore | Character FX & Crowds Production Talks · Hair Effects in Trolls World Tour · Automation of Creature FX in a Small Studio Pipeline · Gravity Preloading for Maintaining Hair Shape Using the Simulator as a Closed-Box Function
How to read this
- Category
- Production talk / system sketch: arbitrary primitive (fur/hair/foliage) generator
- Contributions
- Presents XGen, a flexible arbitrary primitive generator replacing Disney's fur system, that instances curves and geometry on surfaces for fur, feathers, foliage and hair
- Decouples generation from primitives via independent modules (primitives, generators, renderers, patches)
- Supports interactive grooming through guides and hierarchical control with expression-based attributes for reusing a base groom across characters
- Context
- A studio production system in the instancing/grooming lineage for surface-borne primitives, framed as the successor to Disney's prior in-house fur tooling.
- Correctness
- Studio practice, not peer-reviewed; the modular decoupling and groom-reuse claims are production-proven design rationale rather than quantitatively validated results.
- Clarity
- A short, accessible sketch; one read conveys the architecture and intent, with little formulation to revisit.
- How to read it
- Read once for the module decomposition and the grooming/expression workflow; treat it as architectural insight into a production tool rather than an algorithm to reimplement.
CFX
2002
7 entries
- A Biologically-Parameterized Feather Model
This work defines the structure of an individual feather using a parameterization derived from the biological structure and substructures of real feathers.
Abstract
This work defines the structure of an individual feather using a parameterization derived from the biological structure and substructures of real feathers. The model uses Bezier-curve based descriptions of the rachis and barbs and can generate a wide variety of feathers at multiple levels of detail. It provides a first step toward semi-automatically generating full feather coats for computer graphics characters.
Related Procedurally Generating Biologically Driven Feathers · Feathers for Mystical Creatures: Pegasus · Modeling and Rendering of Realistic Feathers · Biological Modeling of Feathers by Morphogenesis Simulation
How to read this
- Category
- Method: a procedural model of an individual feather
- Contributions
- A feather parameterization derived from the biological structure and substructures of real feathers
- Bezier-curve based descriptions of the rachis and barbs that generate varied feather types at multiple levels of detail
- A first step toward semi-automatically generating full feather coats for CG characters
- Context
- Relates to procedural and biologically inspired modeling of natural structures, grounding feather geometry in real feather anatomy rather than ad hoc geometry.
- Correctness
- Presented as a single-feather modeling approach validated by the variety of feathers it can generate; reader caveat is that it is an early step toward full coats, so coverage, layout, and rendering of complete plumage are largely out of scope here.
- Clarity
- Accessible; a single first pass likely conveys the parameterization, with a second pass only if you need the curve construction details.
- How to read it
- Focus on the biological-to-Bezier mapping (rachis and barbs) and the level-of-detail control; one pass is enough unless you are implementing the geometry.
CFX
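As a rough illustration of the Bezier-based rachis and barb description in this entry, the sketch below evaluates a cubic Bezier rachis and hangs barb curves off it on both sides. The 2D setup, the control-point offsets, and the `n_barbs` and `samples` parameters (standing in for level-of-detail control) are hypothetical, not the paper's parameterization.

```python
import numpy as np


def cubic_bezier(ctrl, t):
    """Evaluate a cubic Bezier curve; ctrl is (4, 2), t is a sequence of n values."""
    ctrl = np.asarray(ctrl, dtype=float)
    t = np.asarray(t, dtype=float)[:, None]
    bernstein = [(1 - t) ** 3, 3 * (1 - t) ** 2 * t, 3 * (1 - t) * t ** 2, t ** 3]
    return sum(w * p for w, p in zip(bernstein, ctrl))


def build_feather(rachis_ctrl, n_barbs=8, barb_len=0.3, samples=5):
    """Return barb root points on the rachis and one sampled curve per barb."""
    roots = cubic_bezier(rachis_ctrl, np.linspace(0.1, 0.9, n_barbs))
    # Hypothetical barb shape, mirrored left and right of the rachis.
    shape = np.array([[0.0, 0.0], [0.3, 0.2], [0.7, 0.5], [1.0, 0.6]])
    barbs = []
    for root in roots:
        for side in (-1.0, 1.0):
            ctrl = root + barb_len * shape * np.array([side, 1.0])
            barbs.append(cubic_bezier(ctrl, np.linspace(0.0, 1.0, samples)))
    return roots, barbs


rachis = [[0.0, 0.0], [0.05, 0.4], [-0.05, 0.8], [0.0, 1.2]]
roots, barbs = build_feather(rachis)
```

Raising `n_barbs` or `samples` gives a denser feather from the same control points, which is the kind of detail control the entry describes.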
-
Compresses pose-dependent skinning corrections into per-joint eigenbases (PCA of displacement deltas) and evaluates them on the GPU, giving real-time corrective skinning in hardware.
Abstract
EigenSkin precomputes accurate but expensive nonlinear skin deformations for a character, then approximates their difference from standard linear blend skinning with a compact per-joint eigenbasis of displacement corrections (eigendisplacements) obtained by principal component analysis. At runtime these corrections are evaluated and applied on programmable graphics hardware, so large pose-dependent skin deformations of a detailed character, demonstrated on a human hand, run in real time. The method delivers the visual quality of dense corrective skinning at a fraction of the runtime cost.
Related Real-Time Weighted Pose-Space Deformation on the GPU · Scan-based Volume Animation Driven by Locally Adaptive Articulated Registrations · Delta Mush: Smoothing Deformations While Preserving Detail · Mobilizing Mocap, Motion Blending, and Mayhem: Rig Interoperability for Crowd Simulation on Incredibles 2
How to read this
- Category
- Method: GPU corrective character skinning via eigenbases
- Contributions
- Approximates pose-dependent skin corrections to linear blend skinning with a compact per-joint eigenbasis (eigendisplacements from PCA)
- Evaluates the corrections on programmable graphics hardware for real-time large deformations
- Delivers dense corrective-skinning quality at a fraction of the runtime cost
- Context
- Builds on Pose Space Deformation, compressing its corrective-shape idea into a low-rank, GPU-evaluable form. Builds on: Pose Space Deformation: A Unified Approach to Shape Interpolation and Skeleton-Driven Deformation
- Correctness
- Sound and practical; quality depends on how many eigenbasis terms are kept and on the training poses, a standard accuracy-versus-cost tradeoff.
- Clarity
- Clear; the eigendisplacement idea is intuitive if you know PCA, and the hardware mapping is well explained.
- How to read it
- First pass: the eigendisplacement correction to LBS and why it is GPU-friendly. Second pass: the PCA construction and per-joint bases if you will implement it.
Skinning
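The eigendisplacement idea in this entry can be sketched as a truncated SVD over example corrections. This is a minimal CPU sketch under stated assumptions: the synthetic rank-2 data, the function name `fit_eigendisplacements`, and the omission of mean subtraction, per-joint localization and runtime coefficient interpolation are all simplifications, not the paper's pipeline.

```python
import numpy as np


def fit_eigendisplacements(deltas, k):
    """deltas: (n_poses, n_verts * 3) corrections relative to linear blend skinning.

    Returns the top-k basis (k, n_verts * 3) and per-pose coefficients (n_poses, k).
    """
    _, _, vt = np.linalg.svd(deltas, full_matrices=False)
    basis = vt[:k]             # orthonormal rows: the eigendisplacements
    coeffs = deltas @ basis.T  # projection of each example onto the basis
    return basis, coeffs


# Synthetic corrections that are exactly rank 2 (assumed data for illustration).
rng = np.random.default_rng(0)
deltas = rng.normal(size=(12, 2)) @ rng.normal(size=(2, 30))

basis, coeffs = fit_eigendisplacements(deltas, k=2)
recon = coeffs @ basis  # at runtime this would be added to the skinned positions

basis1, coeffs1 = fit_eigendisplacements(deltas, k=1)
err_k1 = np.linalg.norm(deltas - coeffs1 @ basis1)
```

Keeping fewer basis vectors lowers storage and per-vertex cost at the price of reconstruction error, which is the accuracy-versus-cost tradeoff noted under Correctness.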
-
Drives a volumetric character's elastic deformation with the animator's skeleton via a multiresolution linear-elasticity solve, giving interactively controllable, physically based secondary motion.
Abstract
Introduces skeleton-driven dynamic deformation: a control lattice over a volumetric character is deformed by a linear elasticity solve driven by the animator's skeleton, producing physically based, interactively controllable secondary motion. A multiresolution basis keeps the simulation fast enough for interactive use, offering an alternative to purely geometric skinning.
Related Physically Based Rigging for Deformable Characters · Somigliana Coordinates: an Elasticity-Derived Approach for Cage Deformation · Bodyopt: A Character Deformation Pipeline for Avatar: The Way of Water · Sharp Kelvinlets: Elastic Deformations with Cusps and Localized Falloffs
How to read this
- Category
- Method: skeleton-driven physically based deformation
- Contributions
- Skeleton-driven dynamic deformation, where a control lattice over a volumetric character is deformed by a linear elasticity solve driven by the animator's skeleton
- Physically based, interactively controllable secondary motion as an alternative to purely geometric skinning
- A multiresolution basis that keeps the elasticity simulation fast enough for interactive use
- Context
- Builds on Terzopoulos et al.'s elastically deformable models, coupling that elasticity framework to an animator's skeleton for controllable secondary motion. Builds on: Elastically Deformable Models
- Correctness
- Uses a linear elasticity model on a control lattice, validated by interactive deformation results; reader caveat is that linear elasticity is an approximation that can be less accurate under large strains, and quality depends on the lattice resolution and multiresolution basis.
- Clarity
- Moderately accessible; a first pass conveys the skeleton-plus-elasticity concept, and a second pass is needed for the multiresolution solve.
- How to read it
- Focus on how the skeleton drives the elasticity solve and what the multiresolution basis buys; do a second pass on the linear-elasticity formulation if you implement interactive secondary motion.
Skinning / Rigging
- Modeling and Rendering of Realistic Feathers
This paper presents techniques for modeling and rendering realistic feathers and feathered birds.
Abstract
This paper presents techniques for modeling and rendering realistic feathers and feathered birds. A feather is described as a branching structure generated by an L-system, so users can create many feather types and shapes by adjusting a few parameters. For efficient appearance rendering the authors derive a form of the bidirectional texture function that captures the small but visible geometric detail of the feather blade.
Related Procedurally Generating Biologically Driven Feathers · A Biologically-Parameterized Feather Model · Rendering Iridescent Rock Dove Neck Feathers · A Surface-based Appearance Model for Pennaceous Feathers
How to read this
- Category
- Method: feather modeling and rendering
- Contributions
- An L-system description of a feather as a branching structure, so users create many feather types and shapes by adjusting a few parameters
- A bidirectional-texture-function based appearance representation capturing the small but visible geometric detail of the feather blade
- An end-to-end pipeline for modeling and rendering realistic feathers and feathered birds
- Context
- Relates to procedural natural-structure modeling (L-systems) and image/appearance-based rendering via the bidirectional texture function, applied to feathers and full plumage.
- Correctness
- Validated by the realism of rendered feathers and feathered birds; reader caveat is that the BTF appearance model trades exact micro-geometry for efficient rendering, so results depend on how well the captured BTF represents the blade detail.
- Clarity
- Accessible; a first pass conveys both the L-system modeling and the BTF rendering idea, with a second pass for the BTF derivation.
- How to read it
- Read modeling (L-system) and appearance (BTF) as two separable halves; a second pass on the BTF form pays off if you care about efficient feather shading.
CFX
-
Builds a graph of transitions inside a mocap corpus so arbitrary streams of motion can be synthesized from clips.
Abstract
This paper introduces motion graphs, a method for synthesizing realistic and controllable character motion from a corpus of motion capture data. A directed graph is automatically constructed in which edges hold pieces of original motion data plus automatically generated transitions, and nodes serve as choice points where clips can be seamlessly connected. Candidate transitions are detected using a point-cloud distance metric over windows of frames, blended with linear and spherical linear interpolation, and the graph is pruned using strongly connected components to guarantee well-connected, label-consistent motion. New motion is generated by searching the graph with a branch and bound algorithm for graph walks that satisfy user constraints, demonstrated on the problem of synthesizing locomotion along arbitrary user-sketched paths.
Related DeepPhase: Periodic Autoencoders for Learning Motion Phase Manifolds · Character Controllers Using Motion VAEs · Motion Grammars for Character Animation · A Deep Learning Framework for Character Motion Synthesis and Editing
How to read this
- Category
- Method: data-driven motion synthesis from a mocap corpus
- Contributions
- Motion graphs, an automatically constructed directed graph whose edges hold original clips plus generated transitions and whose nodes are connection choice points
- A point-cloud distance metric over frame windows to detect candidate transitions, blended with linear and spherical-linear interpolation, with pruning via strongly connected components
- A branch-and-bound graph search that synthesizes constrained motion, demonstrated on locomotion along user-sketched paths
- Context
- Builds on Rose et al.'s Verbs and Adverbs multidimensional motion interpolation, moving from blending parameterized clips toward graph-based reassembly of an entire capture corpus. Builds on: Verbs and Adverbs: Multidimensional Motion Interpolation
- Correctness
- Demonstrated on synthesizing locomotion along arbitrary user paths; reader caveat is that output quality is bounded by the captured corpus and by the transition metric and blending, so coverage and naturalness depend on the data and graph connectivity.
- Clarity
- Accessible; a first pass conveys the graph idea, and a second pass clarifies the distance metric, pruning, and search.
- How to read it
- Focus on graph construction (transition detection plus connectivity pruning) and the search formulation; a second pass on the distance metric and branch-and-bound is worth it if you build a synthesis system.
Motion Synthesis
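The point-cloud distance used to detect candidate transitions can be sketched as a squared distance taken after the best rigid alignment on the ground plane. This is a simplified sketch: uniform point weights, a y-up convention, and the function name `window_distance` are assumptions, and the blending and pruning stages are not shown.

```python
import numpy as np


def window_distance(a, b):
    """Squared distance between two (n, 3) point clouds, y up, same point order.

    b is first aligned to a by the optimal rotation about the vertical axis
    and translation in the ground (x, z) plane.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Centering both clouds in the ground plane removes the optimal translation.
    a_xz = a[:, [0, 2]] - a[:, [0, 2]].mean(axis=0)
    b_xz = b[:, [0, 2]] - b[:, [0, 2]].mean(axis=0)
    # Closed-form planar rotation angle that best maps b onto a.
    s = np.sum(a_xz[:, 1] * b_xz[:, 0] - a_xz[:, 0] * b_xz[:, 1])
    c = np.sum(a_xz * b_xz)
    theta = np.arctan2(s, c)
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta), np.cos(theta)]])
    planar = np.sum((a_xz - b_xz @ rot.T) ** 2)
    vertical = np.sum((a[:, 1] - b[:, 1]) ** 2)
    return planar + vertical


# Demo: the same pose window, turned and moved on the ground plane.
rng = np.random.default_rng(1)
cloud_b = rng.normal(size=(20, 3))
turn = np.array([[np.cos(0.7), -np.sin(0.7)], [np.sin(0.7), np.cos(0.7)]])
cloud_a = cloud_b.copy()
cloud_a[:, [0, 2]] = cloud_b[:, [0, 2]] @ turn.T + np.array([2.0, -1.0])
same_pose = window_distance(cloud_a, cloud_b)

raised = cloud_a.copy()
raised[:, 1] += 0.5  # a vertical offset cannot be aligned away
different_pose = window_distance(raised, cloud_b)
```

Because heading and ground position are factored out, two windows showing the same pose in different places score near zero, while a genuine pose difference does not.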
- Robust Treatment of Collisions, Contact and Friction for a Skinned Cloth Simulation SIGGRAPH Academic 1007 cites
Robust collision handling for cloth simulation using impulse-based contact resolution and continuous collision detection for stable production use.
Abstract
Presents an algorithm to efficiently and robustly process collisions, contact and friction in cloth simulation. It works with any technique for the internal cloth dynamics and models true cloth thickness, combining fast repulsion forces with a robust geometric collision and impact-zone scheme. The simulation data can be post-processed with a collision-aware subdivision scheme to produce smooth, interference-free results for rendering.
Related Robust Treatment of Collisions, Contact and Friction for Cloth Animation · A Safe and Fast Repulsion Method for GPU-based Cloth Self Collisions · Dynamic Deformables: Implementation and Production Practicalities · CAMA: Contact-Aware Matrix Assembly with Unified Collision Handling for GPU-based Cloth Simulation
How to read this
- Category
- Method: collision, contact and friction handling for cloth
- Contributions
- A collision pipeline that works with any internal cloth dynamics and models true cloth thickness
- A combination of fast repulsion forces with a robust geometric collision and impact-zone scheme for stable resolution
- A collision-aware subdivision post-process that yields smooth, interference-free results for rendering
- Context
- Builds on Baraff and Witkin's Large Steps in Cloth Simulation, adding a robust collision/contact/friction layer that is independent of the underlying integrator. Builds on: Large Steps in Cloth Simulation
- Correctness
- Aimed at robust, production-style cloth and validated by interference-free simulated results; reader caveat is that robustness rests on the repulsion-plus-impact-zone strategy and continuous collision detection, whose cost and tuning matter for dense self-contact.
- Clarity
- Moderately accessible; a first pass conveys the repulsion-plus-impact-zone strategy, and a second pass is needed for the geometric collision and friction details.
- How to read it
- Treat internal dynamics as a black box and focus on the collision/contact/friction layer and impact zones; a second pass on the geometric scheme pays off if you implement robust cloth contact.
CFX
- Robust Treatment of Collisions, Contact and Friction for Cloth Animation SIGGRAPH Academic 1007 cites
Foundational algorithm for cloth collision, contact, and Coulomb friction that introduced velocity-filter-style resolution for simultaneous impacts.
Abstract
Presents a robust algorithm for handling collisions, contact and friction in cloth simulation that combines geometric collision detection with repulsion forces. The method models cloth thickness and prevents self-intersection while producing realistic folds and wrinkles through proper friction modeling.
Related Robust Treatment of Collisions, Contact and Friction for a Skinned Cloth Simulation · A Safe and Fast Repulsion Method for GPU-based Cloth Self Collisions · Dynamic Deformables: Implementation and Production Practicalities · Efficient and Stable Approach to Elasticity and Collisions for Hair Animation
How to read this
- Category
- Method: cloth collision, contact and friction
- Contributions
- A robust algorithm combining geometric collision detection with repulsion forces for cloth
- Modeling of cloth thickness and prevention of self-intersection
- Coulomb-style friction modeling that produces realistic folds and wrinkles
- Context
- Builds on Baraff and Witkin's Large Steps in Cloth Simulation, supplying the collision, contact, and friction treatment that the dynamics solver needs for believable folding. Builds on: Large Steps in Cloth Simulation
- Correctness
- Validated by realistic, self-intersection-free folds and wrinkles; reader caveat is that the results depend on the repulsion-plus-geometric-detection combination and the friction model, with the usual cost and tuning concerns under heavy self-contact.
- Clarity
- Accessible; a first pass conveys the repulsion-plus-collision idea, and a second pass clarifies the friction and self-intersection handling.
- How to read it
- Focus on how repulsion forces and geometric detection cooperate and how friction yields folds; a second pass is worth it for the contact-resolution mechanics. Note this overlaps closely with the companion robust-cloth entry.
CFX
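Coulomb-style friction at a contact can be sketched as a velocity filter: cancel the approaching normal velocity, then shrink the tangential velocity in proportion to that normal change. The sketch assumes a single point against a fixed surface, a fully inelastic response, and a hypothetical `resolve_contact` helper; it illustrates the general idea rather than reproducing the paper's algorithm.

```python
import numpy as np


def resolve_contact(v_rel, normal, mu=0.3):
    """Return the relative velocity after an inelastic contact with friction.

    v_rel: velocity of the cloth point relative to the surface.
    normal: surface normal pointing away from the surface.
    mu: Coulomb friction coefficient (assumed value).
    """
    v_rel = np.asarray(v_rel, dtype=float)
    n = np.asarray(normal, dtype=float)
    n = n / np.linalg.norm(n)
    vn = float(np.dot(v_rel, n))
    if vn >= 0.0:
        return v_rel.copy()  # already separating: nothing to resolve
    dvn = -vn                # normal velocity change that stops the approach
    vt = v_rel - vn * n      # tangential part of the relative velocity
    vt_len = np.linalg.norm(vt)
    # Friction removes up to mu * dvn of tangential speed and never reverses it.
    scale = max(0.0, 1.0 - mu * dvn / vt_len) if vt_len > 0.0 else 0.0
    return scale * vt


up = np.array([0.0, 1.0, 0.0])
sliding = resolve_contact([1.0, -1.0, 0.0], up)   # keeps part of its slide
sticking = resolve_contact([0.1, -1.0, 0.0], up)  # friction stops it entirely
leaving = resolve_contact([0.5, 2.0, 0.0], up)    # untouched
```

The clamp at zero is what separates sliding from sticking: a slow tangential motion under a hard impact is stopped outright instead of being pushed backwards.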
2000
2-
Light stage system that captures the full reflectance field of a human face, enabling relighting under arbitrary illumination for photoreal CG faces.
Abstract
This paper presents a method to acquire the reflectance field of a human face using a light stage, capturing face images from multiple viewpoints under dense sampling of incident illumination directions. Reflectance functions are constructed for each pixel, enabling photorealistic re-rendering of the face under arbitrary novel lighting. The authors develop techniques to extrapolate the reflectance field to novel viewpoints using a skin reflectance model that separates specular and subsurface components, accounting for effects like shadows and interreflections.
Related The Digital Emily Project: Achieving a Photorealistic Digital Actor · Single-Shot High-Quality Facial Geometry and Skin Appearance Capture · Creating an Actor-Specific Facial Rig from Performance Capture · Displaced Dynamic Expression Regression for Real-Time Facial Tracking and Animation
How to read this
- Category
- Capture system: face reflectance acquisition for relighting
- Contributions
- A light stage that captures a face under dense sampling of incident illumination directions from multiple viewpoints
- Per-pixel reflectance functions that allow photorealistic re-rendering under arbitrary novel lighting
- A skin reflectance model separating specular and subsurface components to extrapolate the field to novel viewpoints
- Context
- Foundational photoreal-face capture work; relates to image-based rendering and reflectance modeling, treating a face as a measured reflectance field rather than a hand-built shading model.
- Correctness
- Demonstrated on captured human faces with the stated specular/subsurface separation; reader caveat is that viewpoint extrapolation and effects like shadows and interreflections rely on the chosen reflectance model, so accuracy away from sampled views and lighting depends on how well that model holds.
- Clarity
- Accessible at a conceptual level; a first pass conveys the light-stage idea, and a second pass is needed for the reflectance-function construction and the skin model.
- How to read it
- Focus first on what is measured (reflectance field) versus what is modeled (specular/subsurface split); a second pass pays off if you care about how relighting and novel-view rendering are actually computed.
Facial
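Because light transport is linear, relighting from per-pixel reflectance functions reduces to a weighted sum of the images captured under each light direction. The sketch below assumes the captures are stacked in one array and that the novel lighting has already been resampled to the same directions; the array layout and the `relight` name are assumptions, and solid-angle weighting is left out.

```python
import numpy as np


def relight(basis_images, light_rgb):
    """Combine per-direction captures into one relit image.

    basis_images: (n_lights, H, W, 3), the face lit from each sampled direction.
    light_rgb:    (n_lights, 3), RGB intensity of the new lighting per direction.
    """
    return np.einsum("lhwc,lc->hwc", basis_images, light_rgb)


# Tiny synthetic stand-in for captured data: 4 directions, 2x2 pixels.
rng = np.random.default_rng(0)
basis = rng.uniform(size=(4, 2, 2, 3))

single = np.zeros((4, 3))
single[2] = 1.0  # only direction 2 switched on, white
only_light_2 = relight(basis, single)

uniform_white = relight(basis, np.ones((4, 3)))
```

Each pixel's stack of values across the first axis plays the role of its reflectance function, so changing `light_rgb` re-renders the face without any new capture.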
- Pose Space Deformation: A Unified Approach to Shape Interpolation and Skeleton-Driven Deformation SIGGRAPH Industrial
Corrective shapes interpolated in pose space on top of skeletal skinning, the backbone idea behind modern corrective workflows.
Abstract
Pose space deformation is a unified, purely kinematic approach to surface deformation that generalizes and improves upon both shape interpolation and skeleton subspace deformation (SSD). The key observation is that several types of deformation can be represented uniformly as mappings from a pose space, defined by an underlying skeleton or a more abstract set of parameters, to displacements in the object local coordinate frames. The deformation is treated as a scattered data interpolation problem solved with Gaussian radial basis functions, allowing artists to directly sculpt desired shapes at arbitrary points in pose space with smooth interpolation between them. The approach addresses drawbacks of shape interpolation and the characteristic collapsing joint defects of SSD, while retaining real-time synthesis performance suitable for facial and body deformation in entertainment, gaming, and telepresence applications.
Related DreamWorks Animation Facial Motion and Deformation System · Direct Manipulation Blendshapes · Inverse Kinematics for Reduced Deformable Models · A Blendshape Model that Incorporates Physical Interaction
How to read this
- Category
- Method: a corrective deformation technique (pose space deformation)
- Contributions
- A unified kinematic framework treating both shape interpolation and skeleton subspace deformation as mappings from a pose space to local-frame displacements
- Scattered-data interpolation via Gaussian radial basis functions so artists can sculpt target shapes directly at arbitrary poses
- Removes the joint-collapse defects of SSD while keeping real-time synthesis suitable for facial and body deformation
- Context
- Builds on Sederberg and Parry's free-form deformation and Magnenat-Thalmann et al.'s joint-dependent local deformations, generalizing shape interpolation and skeleton subspace deformation into one pose-space formulation. Builds on: Free-Form Deformation of Solid Geometric Models · Joint-Dependent Local Deformations for Hand Animation and Object Grasping
- Correctness
- Purely kinematic and artist-driven, validated on facial and body deformation for entertainment and telepresence; reader caveat is that quality depends on the sculpted examples and RBF interpolation, and being non-physical it does not model dynamics or contact.
- Clarity
- Accessible; a first pass conveys the pose-space idea clearly, and a second pass is needed for the radial-basis-function formulation.
- How to read it
- Focus on the pose-space-to-displacement mapping and why it cures SSD joint collapse; a second pass on the RBF scattered-data solve is worth it if you implement correctives.
Skinning
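The scattered-data interpolation at the heart of pose space deformation can be sketched with Gaussian radial basis functions: solve for weights that reproduce the sculpted displacements at the example poses, then evaluate at any new pose. The one-dimensional pose space, the kernel width `sigma`, the example values and the function names are assumptions for illustration only.

```python
import numpy as np


def gaussian_kernel(a, b, sigma):
    """Gaussian RBF values between pose sets a (n, d) and b (m, d)."""
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def fit_psd(example_poses, example_deltas, sigma=0.5):
    """Solve for RBF weights that reproduce the sculpted displacements."""
    phi = gaussian_kernel(example_poses, example_poses, sigma)
    return np.linalg.solve(phi, example_deltas)


def eval_psd(pose, example_poses, weights, sigma=0.5):
    """Interpolated displacement vector at a new pose of shape (d,)."""
    phi = gaussian_kernel(pose[None, :], example_poses, sigma)
    return (phi @ weights)[0]


# Three sculpted poses of one joint angle (radians), two vertices (xyz each).
poses = np.array([[0.0], [0.8], [1.6]])
deltas = np.array([
    [0.00, 0.00, 0.0, 0.00, 0.00, 0.0],
    [0.00, 0.10, 0.0, 0.00, 0.05, 0.0],
    [0.00, 0.30, 0.0, 0.02, 0.20, 0.0],
])
weights = fit_psd(poses, deltas)
at_example = eval_psd(poses[1], poses, weights)
in_between = eval_psd(np.array([1.2]), poses, weights)
```

The interpolated displacement would then be added in each vertex's local frame before skinning, so the sculpted shapes are hit exactly at their poses and blended smoothly elsewhere.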
1999
2-
The 3D morphable face model: a statistical face space fit to images, ancestor of every data-driven face model since.
Abstract
This paper introduces a technique for modeling textured 3D faces from a dataset of prototypical laser scans, deriving a morphable face model by transforming the shape and texture of example faces into a vector space representation. New faces and expressions are generated as linear combinations of the prototypes, with shape and texture constraints derived from the statistics of the example faces regulating the naturalness of results. An analysis-by-synthesis matching algorithm reconstructs 3D shape and texture from one or more photographs by optimizing model coefficients and rendering parameters, and a bootstrapping optic flow procedure establishes dense one-to-one correspondence across the example faces. The system also enables manipulation of complex facial attributes such as gender, fullness, and distinctiveness.
Related 3D Morphable Face Models: Past, Present and Future · Practice and Theory of Blendshape Facial Models · Learning an Animatable Detailed 3D Face Model from In-The-Wild Images · FaceWarehouse: A 3D Facial Expression Database for Visual Computing
How to read this
- Category
- Method: a statistical (morphable) 3D face model
- Contributions
- A morphable face model: shape and texture of laser-scanned prototype faces cast into a vector space, with new faces as regulated linear combinations
- An analysis-by-synthesis algorithm that reconstructs 3D shape and texture from one or more photographs by optimizing model and rendering parameters
- A bootstrapping optic-flow procedure for dense correspondence, plus control of attributes like gender, fullness, and distinctiveness
- Context
- A foundational statistical-shape-model approach for faces, ancestor of essentially all later data-driven 3D face models, drawing on PCA-style example-based modeling rather than a single prior graphics paper.
- Correctness
- The face space is only as expressive as its laser-scan dataset and the assumption of dense correspondence and linear shape/texture combination; analysis-by-synthesis fitting is an optimization that can depend on initialization and imaging conditions.
- Clarity
- Dense, with statistics and optimization; a first pass conveys the model concept, but a careful second (and third) pass is needed for the correspondence and fitting machinery.
- How to read it
- First pass for the linear face-space idea and analysis-by-synthesis framing; second/third pass on the correspondence bootstrapping and the fitting optimization if you work with face models.
Facial
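The linear face space can be sketched as a mean shape plus principal components, with coefficients expressed in standard deviations so that small values stay close to the examples. The sketch assumes the example faces are already in dense correspondence and flattened to vectors; random data stands in for the scans, and the function names are hypothetical.

```python
import numpy as np


def build_model(faces):
    """faces: (n_faces, n_coords) example shapes in dense correspondence."""
    mean = faces.mean(axis=0)
    _, s, vt = np.linalg.svd(faces - mean, full_matrices=False)
    keep = len(faces) - 1  # centered data has at most n - 1 nonzero modes
    sigma = s[:keep] / np.sqrt(len(faces) - 1)  # std deviation along each mode
    return mean, vt[:keep], sigma


def synthesize(mean, basis, sigma, alpha):
    """New face from coefficients alpha, given in units of standard deviation."""
    k = len(alpha)
    return mean + (alpha * sigma[:k]) @ basis[:k]


def project(mean, basis, sigma, face):
    """Coefficients (in standard deviations) that best explain a face."""
    return ((face - mean) @ basis.T) / sigma


rng = np.random.default_rng(0)
faces = rng.normal(size=(5, 12))  # stand-in for flattened scan vertices
mean, basis, sigma = build_model(faces)

average_face = synthesize(mean, basis, sigma, np.zeros(4))
alpha0 = project(mean, basis, sigma, faces[0])
rebuilt = synthesize(mean, basis, sigma, alpha0)
```

Fitting to a photograph, as in the paper's analysis-by-synthesis step, would optimize `alpha` together with rendering parameters; that part is outside this sketch.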
- Physically Based Motion Transformation
Edits captured motion through a spacetime optimization over a simplified physical character model, so the transformed motion stays physically plausible while meeting new animator constraints.
Abstract
Presents physically based motion transformation: edits to a captured or animated motion are propagated through a spacetime optimization over a simplified physical model of the character, so the modified motion stays physically plausible, respecting momentum, mass and forces, while satisfying the animator's new constraints. It enables retargeting and stylistic edits that preserve dynamic realism.
Related Dog Code: Human to Quadruped Embodiment Using Shared Codebooks · Optimal and Interactive Keyframe Selection for Motion Capture · Sketch-based Motion Editing for Articulated Characters · Automated Extraction and Parameterization of Motions in Large Data Sets
How to read this
- Category
- Method: physically based motion editing
- Contributions
- Physically based motion transformation: animator edits are propagated through a spacetime optimization over a simplified physical model of the character
- Edited motion stays physically plausible (respecting momentum, mass, and forces) while satisfying new constraints
- Enables retargeting and stylistic edits that preserve dynamic realism
- Context
- Builds on Witkin and Kass's Spacetime Constraints and on the same authors' Motion Warping, adding a physics layer so edits remain dynamically valid rather than purely signal-based. Builds on: Spacetime Constraints · Motion Warping
- Correctness
- Plausibility rests on a simplified physical character model, so fidelity is bounded by how well that simplification captures the real dynamics, and the spacetime optimization can be expensive and sensitive to setup.
- Clarity
- Conceptually clear but optimization-heavy; a first pass conveys why physics-aware editing matters, a second pass pays off for the spacetime formulation.
- How to read it
- First pass for the contrast with purely kinematic warping/retargeting; second pass on the simplified physical model and the spacetime optimization if you need physically valid edits.
Motion Synthesis / Retargeting
1998
5 entries
-
Implicit integration makes cloth simulation stable at large time steps, the foundation of production cloth (and of Maya nCloth's lineage).
abstract
The bottleneck in most cloth simulation systems is that time steps must be small to avoid numerical instability. This paper describes a cloth simulation system that can stably take large time steps by coupling a new technique for enforcing constraints on individual cloth particles with an implicit integration method. The simulator models cloth as a triangular mesh, with internal forces derived from a simple continuum formulation supporting anisotropic stretch or compression and a unified treatment of damping forces. The implicit method generates a large unbanded sparse linear system at each step, solved with a modified conjugate gradient method that simultaneously enforces particle constraints exactly, yielding a system significantly faster than previous cloth simulators.
Related Simulation of Clothing with Folds and Wrinkles · Untangling Cloth · Small Steps in Physics Simulation · Fast Cloth Simulation on Moving Humanoids
how to read this
- Category
- Method: a cloth simulation algorithm
- Contributions
-
- An implicit integration method letting cloth simulation take large, stable time steps
- A constraint technique enforced exactly on individual particles, coupled into the implicit solve
- A continuum-style triangle-mesh force model with anisotropic stretch/compression and unified damping, solved by a modified conjugate gradient method
- Context
- Builds on the deformable-models lineage (Terzopoulos et al.'s Elastically Deformable Models) and became a foundation for production cloth solvers. Builds on: Elastically Deformable Models
- Correctness
- The key claim is stability at large time steps via implicit integration; the trade-off, common to implicit methods, is numerical damping and per-step cost from solving a large sparse system, which a reader should weigh against the step-size gains.
- Clarity
- Mathematically dense but well organized; a first pass conveys why implicit stepping helps, but real understanding needs a second and likely a third pass.
- How to read it
- First pass for the stability argument and the force model overview; budget a careful second/third pass on the implicit formulation and the constrained conjugate-gradient solver if implementing.
CFX
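The stability argument can be seen on a single stiff spring, which is a much smaller problem than the paper's cloth mesh and constrained conjugate-gradient solve. The sketch below is only an illustration: the function names, the spring constants, and the step size are assumptions chosen so that forward Euler diverges while backward Euler stays bounded.

```python
def explicit_step(x, v, k, m, h):
    """Forward Euler: the spring force is evaluated at the start of the step."""
    a = -k * x / m
    return x + h * v, v + h * a


def implicit_step(x, v, k, m, h):
    """Backward Euler: the spring force is evaluated at the end of the step.

    Substituting x_new = x + h * v_new into m * (v_new - v) = -h * k * x_new
    gives the linear equation (m + h*h*k) * v_new = m*v - h*k*x.
    """
    v_new = (m * v - h * k * x) / (m + h * h * k)
    return x + h * v_new, v_new


def peak_displacement(step, k=1.0e4, m=1.0, h=0.1, steps=50):
    """Largest |x| reached from x=1, v=0 (hypothetical stiff-spring setup)."""
    x, v = 1.0, 0.0
    peak = abs(x)
    for _ in range(steps):
        x, v = step(x, v, k, m, h)
        peak = max(peak, abs(x))
    return peak


explicit_peak = peak_displacement(explicit_step)  # grows without bound
implicit_peak = peak_displacement(implicit_step)  # decays toward rest
print(explicit_peak > 1.0e6, implicit_peak <= 1.0)
```

The implicit step never overshoots, but it also drains energy from the spring at this step size, which is the numerical damping noted under Correctness. In a full cloth solver the scalar division becomes a large sparse linear solve per step.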
-
Spacetime-constraints retargeting that adapts a motion to a character with the same structure but different segment lengths, maintaining identified constraints while preserving the original motion's frequency characteristics.
abstract
This paper presents a technique for retargeting motion, the problem of adapting an animated motion from one articulated character to another character with identical structure but different segment lengths. Important features of the original motion, such as feet touching the floor or hands grabbing an object, are identified as constraints that must be maintained. A spacetime constraints solver computes an adapted motion that re-establishes these constraints while preserving the frequency characteristics of the original signal, representing the changes as a motion displacement curve built from frequency-limited B-splines. The approach is demonstrated on motion capture data across characters of differing proportions, including walking, ladder climbing, and swing dancing examples.
Related Motion Warping · Sketch-based Motion Editing for Articulated Characters · Normalized Euclidean Distance Matrices for Human Motion Retargeting · Dog Code: Human to Quadruped Embodiment Using Shared Codebooks
how to read this
- Category
- Method: a motion retargeting technique
- Contributions
-
- Formulates retargeting as adapting motion between same-structure characters with different segment lengths while maintaining identified constraints (e.g. feet on floor, hands grabbing)
- A spacetime-constraints solver that re-establishes constraints while preserving the original signal's frequency characteristics
- Represents edits as a motion displacement curve built from frequency-limited B-splines, shown on walking, ladder climbing, and swing dancing
- Context
- Builds directly on Witkin and Popovic's Motion Warping and on Witkin and Kass's Spacetime Constraints, extending curve editing into constraint-preserving retargeting. Builds on: Motion Warping · Spacetime Constraints
- Correctness
- Assumes the source and target share identical skeletal structure differing only in proportions; it preserves kinematic constraints and signal frequency rather than enforcing physics, so dynamic realism is inherited from the source, not guaranteed.
- Clarity
- Clearly written but constraint-solver heavy; a first pass conveys the goal and approach, a second pass pays off for the optimization formulation.
- How to read it
- First pass for the constraints-plus-displacement-curve framing; second pass to follow the spacetime solve and the B-spline displacement representation if implementing retargeting.
Retargeting
-
Introduced subdivision surfaces into Pixar's production pipeline for Geri's Game, enabling smooth deformable character meshes from arbitrary topology.
abstract
This paper describes the use of subdivision surfaces in character animation production, specifically Catmull-Clark surfaces adapted for high-end CG production. The authors develop techniques for semi-sharp creases with variable sharpness, methods for cloth simulation including collision detection, and approaches for defining smooth scalar fields on subdivision surfaces for procedural shading.
Related Recursively Generated B-Spline Surfaces on Arbitrary Topological Meshes · Robust Treatment of Collisions, Contact and Friction for Cloth Animation · Automatic Rigging and Animation of 3D Characters · Subspace Neural Physics: Fast Data-Driven Interactive Simulation
how to read this
- Category
- Method / production technique: subdivision surfaces for character meshes
- Contributions
-
- Brings Catmull-Clark subdivision surfaces into high-end character-animation production for smooth deformable meshes from arbitrary topology
- Semi-sharp creases with variable sharpness
- Cloth simulation with collision detection and smooth scalar fields on subdivision surfaces for procedural shading
- Context
- Extends classical Catmull-Clark subdivision into a production pipeline (the Geri's Game work), bridging subdivision-surface theory and practical character modeling. Builds on: Recursively Generated B-Spline Surfaces on Arbitrary Topological Meshes
- Correctness
- Validated by use in actual film production rather than a formal benchmark; the semi-sharp crease and shading techniques are engineering solutions tuned for that pipeline, so generality beyond it should not be assumed.
- Clarity
- Quite accessible for its topic and grounded in a concrete production; a first pass conveys the techniques, a second pass clarifies the crease and scalar-field mechanics.
- How to read it
- First pass for why subdivision surfaces suit characters and what semi-sharp creases buy you; second pass on the crease rules and scalar-field shading if you work on modeling or shading tools.
Rigging / Skinning
-
Radial basis function interpolation framework for blending motion clips along semantic dimensions like speed and emotion for interactive characters.
abstract
This paper presents a technique for interpolating between example motions of complex linked figures, derived from motion capture or traditional animation tools, to create real-time controllable character animation. Parameterized motions called verbs are controlled by parameters called adverbs, and a combination of radial basis functions and low order polynomials builds the interpolation space between example motions. Inverse kinematic constraints augment the interpolations to prevent artifacts such as feet slipping on the floor during a walk cycle. Verbs are assembled into a verb graph with smooth transitions, allowing an animated figure to exhibit a repertoire of expressive behaviors driven interactively or programmatically at runtime.
Related Geostatistical Motion Interpolation · Automated Extraction and Parameterization of Motions in Large Data Sets · Authoring Motion Cycles · Motion Warping
how to read this
- Category
- Method: multidimensional motion interpolation
- Contributions
-
- Parameterized motions (verbs) controlled by semantic parameters (adverbs), interpolated from example motions
- An interpolation space built from radial basis functions plus low-order polynomials, augmented with inverse-kinematic constraints to avoid artifacts like foot slipping
- A verb graph with smooth transitions for an interactive, expressive runtime repertoire
- Context
- Connects to the interactive-actor lineage (Perlin and Goldberg's Improv), turning example motions into a continuously controllable, runtime-driven blend space. Builds on: Improv: A System for Scripting Interactive Actors in Virtual Worlds
- Correctness
- Quality depends on having representative example motions spanning the adverb axes; radial-basis interpolation can extrapolate poorly outside the examples, and IK is used to patch kinematic artifacts rather than to enforce dynamics.
- Clarity
- Accessible with clear terminology; a first pass conveys the verb/adverb framing, a second pass pays off for the RBF interpolation construction.
- How to read it
- First pass for the verb/adverb/verb-graph concepts; second pass to understand the radial-basis-plus-polynomial interpolation and the IK constraints if building a blend system.
Motion Synthesis
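The radial-basis-plus-polynomial construction can be sketched for one adverb and one scalar per example (say, a joint angle at a fixed point in the cycle). This is a minimal illustration under assumptions: the Gaussian kernel, its `width`, the helper names, and the sample numbers are all hypothetical and are not taken from the paper. A least-squares line captures the overall trend, and radial basis functions centred on the examples absorb the residual so that every example is reproduced exactly.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_blend(adverbs, values, width=1.0):
    """Return a function mapping an adverb value to an interpolated scalar."""
    n = len(adverbs)
    mean_p = sum(adverbs) / n
    mean_y = sum(values) / n
    # Low-order polynomial part: a least-squares line through the examples.
    var = sum((p - mean_p) ** 2 for p in adverbs)
    slope = sum((p - mean_p) * (y - mean_y) for p, y in zip(adverbs, values)) / var
    intercept = mean_y - slope * mean_p
    residual = [y - (intercept + slope * p) for p, y in zip(adverbs, values)]

    def kernel(d):
        # Gaussian radial basis (an assumption made for this sketch).
        return math.exp(-((d / width) ** 2))

    # Radial part: one basis function per example, fitted to the residual.
    phi = [[kernel(abs(pi - pj)) for pj in adverbs] for pi in adverbs]
    weights = solve(phi, residual)

    def evaluate(p):
        radial = sum(w * kernel(abs(p - pi)) for w, pi in zip(weights, adverbs))
        return intercept + slope * p + radial

    return evaluate


adverbs = [0.0, 1.0, 2.0]    # e.g. a hypothetical "speed" axis
values = [10.0, 25.0, 30.0]  # e.g. a joint angle from three example clips
blend = fit_blend(adverbs, values)
print(blend(0.5), blend(1.0))
```

Far from the examples the Gaussian terms vanish and only the fitted line remains, which is one way to see why extrapolation outside the sampled adverb range is unreliable.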
-
Deforms a surface by associating it with editable curves (wires); moving a wire pulls the nearby surface, a deformation paradigm widely used for facial and character rigs.
abstract
Introduces Wires, a curve-based geometric deformation technique inspired by the armatures sculptors use. A wire is a curve associated with a region of the surface together with parameters controlling its influence; deforming the wire deforms the surrounding surface, and multiple wires combine to shape complex features. Wires give an intuitive, reusable control structure for free-form surface deformation and have been used extensively for facial and character setup.
Related Joint-Dependent Local Deformations for Hand Animation and Object Grasping · Sculpt Processing for Character Rigging · Mean Value Coordinates for Closed Triangular Meshes · Animation Setup Transfer for 3D Characters
how to read this
- Category
- Method: curve-based geometric deformation (Wires)
- Contributions
-
- Associates editable curves (wires) with surface regions plus influence parameters
- Deforming a wire deforms the surrounding surface, and multiple wires combine
- Gives a reusable, sculptor-armature-like control structure for free-form deformation
- Context
- Builds on the space-deformation idea of Free-Form Deformation, replacing the lattice with intuitive curves widely used in facial and character setup. Builds on: Free-Form Deformation of Solid Geometric Models
- Correctness
- Sound and influential; results depend on wire placement and influence falloff, which the artist tunes.
- Clarity
- Very readable; the armature analogy makes the technique immediately graspable.
- How to read it
- First pass: the wire-plus-influence-region idea and the armature analogy. Second pass: how multiple wires blend if you will build rigs with it.
Rigging / Skinning
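The wire-plus-influence idea can be sketched in 2D with polylines. This is a deliberately reduced illustration, not the paper's full operator: it applies only a translation toward the moved wire, it assumes the reference and deformed wires have the same number of points, and the falloff function, `radius` parameter, and all names are assumptions made for the example.

```python
import math


def closest_on_polyline(p, pts):
    """Return (segment index, t in [0, 1], distance) of the closest polyline point."""
    best = None
    for i in range(len(pts) - 1):
        (ax, ay), (bx, by) = pts[i], pts[i + 1]
        dx, dy = bx - ax, by - ay
        length2 = dx * dx + dy * dy
        t = 0.0
        if length2 > 0.0:
            t = ((p[0] - ax) * dx + (p[1] - ay) * dy) / length2
            t = max(0.0, min(1.0, t))
        d = math.hypot(p[0] - (ax + t * dx), p[1] - (ay + t * dy))
        if best is None or d < best[2]:
            best = (i, t, d)
    return best


def point_at(pts, i, t):
    """Point at parameter t on segment i of a polyline."""
    (ax, ay), (bx, by) = pts[i], pts[i + 1]
    return ax + t * (bx - ax), ay + t * (by - ay)


def falloff(x):
    """Smooth weight: 1 on the wire, 0 at and beyond the influence radius."""
    return (x * x - 1.0) ** 2 if x < 1.0 else 0.0


def wire_deform(p, reference, wire, radius):
    """Pull p along with the wire, weighted by distance to the reference curve."""
    i, t, d = closest_on_polyline(p, reference)
    weight = falloff(d / radius)
    rx, ry = point_at(reference, i, t)
    wx, wy = point_at(wire, i, t)
    return p[0] + weight * (wx - rx), p[1] + weight * (wy - ry)


reference = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]  # wire at bind time
wire = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]       # wire after the artist edits it
print(wire_deform((1.0, 0.0), reference, wire, 1.0))  # on the wire: follows fully
print(wire_deform((1.0, 0.5), reference, wire, 1.0))  # nearby: follows partially
print(wire_deform((1.0, 2.0), reference, wire, 1.0))  # outside the radius: unchanged
```

Placement and falloff radius are exactly the artist-tuned choices mentioned under Correctness; combining several wires would need a blending rule on top of this single-wire sketch.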
1997
2 entries
-
Builds animals from anatomical components (bones, muscles, and generalized tissue) and grows a deformable skin over them, an early template for muscle-driven character deformation.
abstract
Presents a system for constructing and animating animals based on their anatomy. Bones, muscles, and generalized soft tissues are modeled as deformable primitives, with muscles as deformed cylinders and ellipsoids attached to the skeleton, and an elastic skin is generated over the underlying components and deforms as they move. The approach ties surface deformation to an anatomical substrate rather than to ad hoc skinning weights, an early step toward muscle- and anatomy-driven character deformation.
Related Anatomy Transfer · Data-driven Modeling of Skin and Muscle Deformation · Capture and Statistical Modeling of Arm-Muscle Deformations · Layered Construction for Deformable Animated Characters
how to read this
- Category
- Method: anatomy-based animal modeling and deformation
- Contributions
-
- Models bones, muscles, and soft tissue as deformable anatomical primitives
- Grows an elastic skin over the components that deforms as they move
- Ties surface deformation to anatomy rather than ad hoc skinning weights
- Context
- Builds on layered character construction, pushing the skeleton-muscle-skin idea toward an anatomical substrate for deformation. Builds on: Layered Construction for Deformable Animated Characters
- Correctness
- A modeling framework more than a validated biomechanical model; muscles are approximate primitives, but the anatomy-drives-skin principle is sound.
- Clarity
- Clear and example-rich on animals; the component model is easy to follow.
- How to read it
- Read for the bones-muscles-skin layering and how the skin is grown and deformed; one careful pass.
Muscles / Skinning
-
Models human muscles with anatomically motivated primitives that bulge and change shape with joint angle, driving realistic skin deformation over the musculature.
abstract
Presents anatomy-based muscle models for the human body aimed at realistic surface form. Muscles are represented with anatomically motivated primitives, including ellipsoidal and multi-belly muscle models, whose shape changes as joints articulate while preserving volume and producing bulging. The muscle layer drives the deformation of an overlying skin, giving figures realistic surface form that responds to pose, a foundation for muscle-based character deformation.
Related Active Volumetric Musculoskeletal Systems · Anatomy Transfer · Implicit Skinning: Real-Time Skin Deformation with Contact Modeling · Anatomically Based Modeling
how to read this
- Category
- Method: anatomy-based human muscle modeling
- Contributions
-
- Anatomically motivated muscle primitives, including ellipsoidal and multi-belly muscles
- Volume-preserving bulging as joints articulate
- Muscle layer drives realistic skin deformation
- Context
- Companion in spirit to anatomically based modeling and a descendant of layered character construction, focused on the human musculature. Builds on: Layered Construction for Deformable Animated Characters
- Correctness
- Anatomy-inspired rather than physically simulated; volume preservation and bulging are modeled geometrically, which is enough for convincing form.
- Clarity
- Well organized around muscle types, with clear figures of bulging behavior.
- How to read it
- First pass: the muscle primitives and how they change with joint angle. Second pass: how the muscle layer drives the skin.
Muscles
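Geometric volume preservation for an ellipsoidal muscle belly can be sketched in a few lines. The assumptions here are that the belly is a single ellipsoid, that the long semi-axis follows the muscle length set by the joint angle, and that the width-to-depth ratio of the cross-section stays fixed; the function names and sample dimensions are hypothetical.

```python
import math


def ellipsoid_volume(a, b, c):
    """Volume of an ellipsoid with semi-axes a, b, c."""
    return 4.0 / 3.0 * math.pi * a * b * c


def bulge(a_rest, b_rest, c_rest, a_new):
    """Cross-section semi-axes after the long semi-axis changes to a_new.

    Volume and the b:c ratio are held constant, so shortening the muscle
    makes the belly thicker.
    """
    volume = ellipsoid_volume(a_rest, b_rest, c_rest)
    ratio = b_rest / c_rest
    # volume = 4/3 * pi * a_new * (ratio * c_new) * c_new, solved for c_new.
    c_new = math.sqrt(3.0 * volume / (4.0 * math.pi * a_new * ratio))
    return ratio * c_new, c_new


b_new, c_new = bulge(4.0, 1.0, 0.5, 3.0)  # muscle shortens from 4.0 to 3.0
print(b_new, c_new)
```

This is purely geometric, matching the Correctness note: nothing is simulated, and the bulge is an algebraic consequence of holding volume fixed.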
1996
1 entry
-
Scripting system for interactive character behaviors combining procedural motion layering with behavioral state machines for real-time actors.
abstract
Improv is a system for authoring believable interactive characters through two subsystems: an Animation Engine using procedural techniques and action layering with inverse kinematics, and a Behavior Engine with layered scripts and probabilistic decision rules. The system enables non-programmers to create real-time interactive actors that respond to users and each other while maintaining consistent personalities.
Related Dog Code: Human to Quadruped Embodiment Using Shared Codebooks · Character Motion Synthesis by Topology Coordinates · Automated Extraction and Parameterization of Motions in Large Data Sets · Factorized Motion Diffusion for Precise and Character-Agnostic Motion Inbetweening
how to read this
- Category
- System: authoring interactive animated actors
- Contributions
-
- An Animation Engine using procedural motion, action layering, and inverse kinematics
- A Behavior Engine with layered scripts and probabilistic decision rules for personality-consistent behavior
- An authoring approach letting non-programmers script real-time actors that respond to users and each other
- Context
- Sits in the procedural-animation and interactive-character lineage and is referenced as a basis for later parameterized motion work such as Verbs and Adverbs.
- Correctness
- Presented as a working authoring system; its strength is believable real-time interaction rather than physical accuracy, and the quality of results depends on author-supplied scripts and rules.
- Clarity
- Accessible and idea-driven; a first pass conveys the two-engine architecture, a second pass clarifies the scripting and layering details.
- How to read it
- First pass for the split between motion generation and behavior scripting; second pass if you care about how layers and probabilistic rules combine to give consistent personality.
Motion Synthesis
1995
2 entries
-
Physics-based simulation of human athletic behaviors (running, bicycling, and vaulting) driven by control algorithms that maintain balance.
abstract
This paper describes algorithms for the animation of male and female models performing three dynamic athletic behaviors: running, bicycling, and vaulting. We animate these behaviors using control algorithms that cause a physically realistic model to perform the desired maneuver. For example, control algorithms allow the simulated humans to maintain balance while moving their arms, to run or bicycle at a variety of speeds, and to perform two vaults. For each simulation, we compare the computed motion to that of humans performing similar maneuvers. We perform the comparison both qualitatively through real and simulated video images and quantitatively through simulated and biomechanical data.
Related SIMBICON: Simple Biped Locomotion Control · Dog Code: Human to Quadruped Embodiment Using Shared Codebooks · Improv: A System for Scripting Interactive Actors in Virtual Worlds · Character Motion Synthesis by Topology Coordinates
how to read this
- Category
- Method: control algorithms for physics-based human motion
- Contributions
-
- Control algorithms that drive a physically realistic human model through running, bicycling, and vaulting
- Reactive balance and limb control that hold balance across a range of speeds and maneuvers
- Validation comparing simulated motion to real humans both qualitatively (video) and quantitatively (biomechanical data)
- Context
- A foundational work in physics-based character animation for athletic, dynamic motion, predating and motivating later data-driven and learned-control approaches.
- Correctness
- Results are demonstrated on three specific athletic behaviors with hand-built controllers and compared to human data; controllers are behavior-specific, so a reader should not assume the approach generalizes automatically to arbitrary motions.
- Clarity
- Readable, with a biomechanics flavor; a first pass conveys the approach, and a second pass pays off for the per-behavior control structure.
- How to read it
- First pass for the control-algorithm framing and the validation method; second pass if you want the specifics of balance control and how each athletic behavior is controlled.
Motion Synthesis
-
Motion editing technique that warps the parameter curves of captured or keyframed animation to meet keyframe-like constraints while preserving the fine detail of the original motion.
abstract
This paper introduces motion warping, a technique for editing captured or keyframed animation by warping the motion parameter curves. The animator interactively defines keyframe-like constraints that derive a smooth deformation preserving the fine high-frequency structure of the original motion, while time warp constraints retime the motion via an interpolating Cardinal spline. Motion clips are combined by overlapping and blending the parameter curves with a slow-in/slow-out weight function. The authors show that whole families of realistic motions, such as varied human walks and tennis swings, can be created from a single captured sequence using only a few warping keyframes.
Related Retargeting Motion to New Characters · Optimal and Interactive Keyframe Selection for Motion Capture · Automated Extraction and Parameterization of Motions in Large Data Sets · Motion Retargeting for Crowd Simulation
how to read this
- Category
- Method: a motion-editing technique
- Contributions
-
- Motion warping: editing captured or keyframed animation by warping its parameter curves to meet keyframe-like constraints while preserving fine high-frequency detail
- Time-warp constraints that retime motion via an interpolating Cardinal spline
- Blending of overlapping clips with a slow-in/slow-out weight function to synthesize motion families from a single sequence
- Context
- Builds on keyframe-animation lineage (Burtnyk and Wein's interactive skeleton key-frame techniques) and underpins later constraint-based editing such as Gleicher's retargeting. Builds on: Interactive Skeleton Techniques for Enhancing Motion Dynamics in Key Frame Animation
- Correctness
- Demonstrated on motions like human walks and tennis swings; it is a signal-deformation method with no notion of physics, so warped results preserve style but are not guaranteed to remain physically plausible.
- Clarity
- Accessible; a first pass conveys the idea, and a second pass is worthwhile for the curve-warping and time-warp formulation.
- How to read it
- First pass for the warp-the-curves intuition and the constraint types; second pass to understand the deformation and retiming math if you plan to implement editing.
Retargeting
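The warp-the-curves intuition can be sketched on one sampled parameter curve. This is a simplified, offset-only illustration: the displacement between constraint frames is eased with a smoothstep, which is an assumption standing in for the paper's spline interpolation, and the function names and sample curve are hypothetical. Because the smooth displacement is added on top of the original samples, the high-frequency detail rides along unchanged.

```python
import math


def smoothstep(u):
    """Ease from 0 to 1 with zero slope at both ends."""
    return u * u * (3.0 - 2.0 * u)


def offset_at(keys, frame):
    """Interpolate the displacement between neighbouring constraint frames."""
    if frame <= keys[0][0]:
        return keys[0][1]
    if frame >= keys[-1][0]:
        return keys[-1][1]
    for (f0, b0), (f1, b1) in zip(keys, keys[1:]):
        if f0 <= frame <= f1:
            return b0 + smoothstep((frame - f0) / (f1 - f0)) * (b1 - b0)


def warp(curve, constraints):
    """Warp a sampled curve so it passes through {frame: target value}."""
    # Each constraint becomes a displacement key: how far the curve must move.
    keys = sorted((f, target - curve[f]) for f, target in constraints.items())
    return [value + offset_at(keys, f) for f, value in enumerate(curve)]


# A slow swing plus fine jitter, standing in for one captured joint angle.
curve = [math.sin(0.3 * f) + 0.05 * math.sin(5.0 * f) for f in range(21)]
constraints = {0: curve[0], 10: curve[10] + 1.0, 20: curve[20]}
warped = warp(curve, constraints)
print(warped[10] - curve[10])
```

Nothing in this construction knows about forces or balance, which is why the Correctness note warns that warped motion keeps its style but is not guaranteed to stay physically plausible.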
1994
1 entry
-
Genetic algorithm evolution of virtual creature morphologies and neural locomotion controllers, pioneering learned physics-based animation.
abstract
Describes a system for automatically generating 3D virtual creatures by coevolving both morphology and neural control systems using genetic algorithms. Creatures develop within simulated physical worlds, with fitness evaluated for behaviors like swimming, walking, and light-following, discovering diverse and often unexpected locomotion strategies without manual design.
Related Physics-Based Motion Retargeting from Sparse Inputs · DeepPhase: Periodic Autoencoders for Learning Motion Phase Manifolds · ReGAIL: Toward Agile Character Control From a Single Reference Motion · Character Controllers Using Motion VAEs
how to read this
- Category
- Method: evolutionary co-design of creature morphology and control
- Contributions
-
- A genetic-algorithm system that co-evolves 3D creature morphology and neural control together
- Fitness-driven discovery of locomotion behaviors (swimming, walking, light-following) in simulated physics, with no hand-authored bodies or controllers
- Context
- An early landmark in learned, physics-based procedural animation, joining artificial-life and evolutionary-computation ideas to character generation rather than building on a specific prior graphics paper.
- Correctness
- Behaviors are demonstrated qualitatively inside a simulated physical world, so plausibility is judged by the simulator and the chosen fitness functions rather than against real organisms; reproducibility depends heavily on those simulation and fitness choices.
- Clarity
- Conceptually accessible and famous for its visuals; a first pass conveys the idea, a second pass pays off for the genotype encoding and the evolution loop.
- How to read it
- First pass for the co-evolution concept and the genotype-to-phenotype graph idea; do a second pass only if you care about how morphology and the neural controller are jointly encoded and mutated.
Motion Synthesis
1992
1 entry
-
Lets the user drag points on the model directly and solves for the lattice control points that achieve it, making FFD usable without manipulating the cage by hand.
abstract
Makes free-form deformation directly manipulable. Instead of moving lattice control points and watching the embedded model follow, the user picks and drags points on the object itself, and the system solves, via a least-squares pseudo-inverse, for the control-point movements that best produce the requested surface motion. This turns FFD from an indirect lattice-first tool into one where the artist edits the shape directly, a key step toward practical deformation interfaces.
Related Laplacian Surface Editing · Extended Free-Form Deformation: A Sculpturing Tool for 3D Geometric Modeling · Free-Form Deformation of Solid Geometric Models · Wires: A Geometric Deformation Technique
how to read this
- Category
- Method: direct-manipulation interface for free-form deformation
- Contributions
-
- Lets the user drag points on the model rather than lattice control points
- Solves for the control-point motion via a least-squares pseudo-inverse
- Turns FFD into a direct shape-editing tool
- Context
- Builds on Free-Form Deformation and Extended FFD, fixing their main usability problem: editing the cage instead of the shape. Builds on: Free-Form Deformation of Solid Geometric Models · Extended Free-Form Deformation: A Sculpturing Tool for 3D Geometric Modeling
- Correctness
- Correct and practical; a drag is typically under-determined, since many control-point motions produce the same surface motion, and the pseudo-inverse picks the least-norm one, so results depend on sensible point choices.
- Clarity
- Readable; the manipulation idea is intuitive and the math is compact.
- How to read it
- First pass: the drag-the-surface idea and why it matters. Second pass: the least-squares solve mapping surface motion back to control points.
Skinning
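The solve that maps a dragged point back to control points can be sketched for a single point on a one-parameter Bernstein deformation. This is an illustration under assumptions: a 1D row of control points stands in for the 3D lattice, the Bernstein basis is chosen for the example, and the function names are hypothetical. With one dragged point the basis weights form a single row, and the pseudo-inverse of a row vector is its transpose divided by its squared length.

```python
from math import comb


def bernstein_weights(u, degree):
    """Bernstein basis values at parameter u in [0, 1]."""
    return [comb(degree, i) * u**i * (1.0 - u) ** (degree - i)
            for i in range(degree + 1)]


def evaluate(control, u):
    """Deformed position of the point embedded at parameter u."""
    w = bernstein_weights(u, len(control) - 1)
    return tuple(sum(wi * p[k] for wi, p in zip(w, control)) for k in range(2))


def drag(control, u, delta):
    """Move the control points so the point at u shifts by exactly delta.

    The least-norm solution spreads the motion over the control points in
    proportion to their basis weights: dP_i = w_i * delta / sum(w_j ** 2).
    """
    w = bernstein_weights(u, len(control) - 1)
    norm2 = sum(wi * wi for wi in w)
    return [tuple(p[k] + wi * delta[k] / norm2 for k in range(2))
            for wi, p in zip(w, control)]


control = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
moved = drag(control, 0.5, (0.0, 1.0))  # drag the midpoint up by one unit
print(evaluate(control, 0.5), evaluate(moved, 0.5))
```

With several dragged points the row becomes a matrix and the same idea needs a full least-squares pseudo-inverse rather than this closed form.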
1990
2 entries
-
Extends FFD to non-parallelepiped and arbitrarily shaped lattices, so the deformation tool can match the feature being sculpted rather than a rigid box.
abstract
Extends free-form deformation (FFD) beyond the parallelepiped control lattice of the original method. Extended free-form deformation (EFFD) lets the lattice take non-parallelepiped and arbitrary, even cylindrical, shapes, and supports merging lattices and forming arbitrarily shaped bumps. This yields a more flexible sculpting tool that can be shaped to the feature being deformed instead of forcing every deformation through a rectangular box.
Related Direct Manipulation of Free-Form Deformations · Free-Form Deformation of Solid Geometric Models · Global and Local Deformations of Solid Primitives · Wires: A Geometric Deformation Technique
how to read this
- Category
- Method: an extension of free-form deformation (EFFD)
- Contributions
-
- Allows non-parallelepiped and arbitrarily shaped control lattices
- Supports merging lattices and forming arbitrarily shaped bumps
- Makes the FFD tool conform to the feature being sculpted
- Context
- Builds on Free-Form Deformation, loosening its rigid rectangular lattice into shapes that fit the deformation at hand. Builds on: Free-Form Deformation of Solid Geometric Models
- Correctness
- Sound extension; the added flexibility comes with more lattice-setup complexity.
- Clarity
- Clear with good examples; the value over plain FFD is immediately visible.
- How to read it
- Read it right after FFD to see how the lattice constraint was relaxed; focus on the arbitrary-lattice examples.
Skinning
-
Drives a digital face directly from tracked human performance, the founding paper of performance capture.
abstract
As computer graphics technique rises to the challenge of rendering lifelike performers, more lifelike performance is required. The techniques used to animate robots, arthropods, and suits of armor, have been extended to flexible surfaces of fur and flesh. Physical models of muscle and skin have been devised. But more complex databases and sophisticated physical modeling do not directly address the performance problem. The gestures and expressions of a human actor are not the solution to a dynamic system. This paper describes a means of acquiring the expressions of real faces, and applying them to computer-generated faces. Such an "electronic mask" offers a means for the traditional talents of actors to be flexibly incorporated in digital animations. Efforts in a similar spirit have resulted in servo-controlled "animatrons," high-technology puppets, and CG puppetry [1]. The manner in which the skills of actors and puppeteers as well as animators are accommodated in such systems may point the way for a more general incorporation of human nuance into our emerging computer media. The ensuing description is divided into two major subjects: the construction of a highly resolved human head model with photographic texture mapping, and the concept demonstration of a system to animate this model by tracking and applying the expressions of a human performer.
Related Facial Retargeting with Automatic Range of Motion Alignment · Transferring the Rig and Animations from a Character to Different Face Models · High Fidelity Facial Animation Capture and Retargeting with Contours · A Facial Motion Retargeting Pipeline for Appearance Agnostic 3D Characters
how to read this
- Category
- Method: performance-driven facial animation (founding performance-capture paper)
- Contributions
-
- A means of acquiring the expressions of real human faces and applying them to computer-generated faces
- An 'electronic mask' that lets an actor's gestures and expressions drive a digital face
- Construction of a highly resolved human head model to receive the captured performance
- Context
- Builds on Parke's parametric face model (A Parametric Model for Human Faces, 1974) and shifts from synthesizing expressions to capturing them from a live performer. Builds on: A Parametric Model for Human Faces
- Correctness
- Frames performance, not dynamic simulation, as the source of lifelike expression and demonstrates transfer of tracked faces onto CG heads; a reader should note it is an early proof of concept whose fidelity is bounded by the era's tracking and capture setup.
- Clarity
- Very readable and motivational; a first pass conveys the founding idea, a second pass repays the acquisition and head-construction details.
- How to read it
- Read as the origin of performance capture: focus on the acquisition-and-retargeting pipeline and the 'electronic mask' framing; a second pass on the capture and head-model construction pays off if you work in facial capture.
Facial / Retargeting
1989
1 entry
-
Layered character model with skeleton, muscle, and skin layers using proximity-based deformation for realistic anatomical character animation.
abstract
A methodology is proposed for creating and animating computer generated characters which combines recent research advances in robotics, physically based modeling and geometric modeling. The control points of geometric modeling deformations are constrained by an underlying articulated robotics skeleton. These deformations are tailored by the animator and act as a muscle layer to provide automatic squash and stretch behavior of the surface geometry. A hierarchy of composite deformations provides the animator with a multi-layered approach to defining both local and global transition of the character's shape. The muscle deformations determine the resulting geometric surface of the character. This approach provides independent representation of articulation from surface geometry, supports higher level motion control based on various computational models, as well as a consistent, uniform character representation which can be tuned and tweaked by the animator to meet very precise expressive qualities. A prototype system (Critter) currently under development demonstrates research results towards layered construction of deformable animated characters.
Related Real-Time Skeletal Skinning with Optimized Centers of Rotation · Anatomically Based Modeling · Data-driven Modeling of Skin and Muscle Deformation · Capture and Statistical Modeling of Arm-Muscle Deformations
How to read this
- Category
- Method: a layered (skeleton/muscle/skin) deformable character model
- Contributions
- A layered character methodology combining articulated robotics skeletons, physically based modeling, and geometric modeling
- Deformation control points constrained by the skeleton, with a muscle layer that gives automatic squash-and-stretch
- A hierarchy of composite deformations for local and global shape change, keeping articulation independent from surface geometry (prototype system Critter)
- Context
- Builds on Magnenat-Thalmann et al.'s joint-dependent local deformations (1988) and combines them with robotics and physically based modeling for full-character animation. Builds on: Joint-Dependent Local Deformations for Hand Animation and Object Grasping
- Correctness
- Presented via a prototype (Critter) under development; a reader should treat it as a methodology and early system rather than a fully evaluated production tool, with results tuned by the animator.
- Clarity
- Accessible at the architecture level; a first pass conveys the layered scheme, a second pass repays the deformation hierarchy.
- How to read it
- Read for the enduring skeleton/muscle/skin layering idea: focus on how the muscle layer drives squash-and-stretch and how articulation is decoupled from surface geometry; a first pass is usually enough.
Skinning / Muscles
1988
2-
Introduced joint-dependent local deformations (JLD) for hand animation, an early precursor to pose-space deformation and corrective skinning.
Abstract
This paper presents algorithms for animating synthetic actor hands with realistic deformations including joint rounding and muscle inflation. The approach maps surfaces onto a skeleton using Joint-dependent Local Deformation (JLD) operators, which are specific local deformation operators based on joint properties. The method handles both finger and palm deformation through a sophisticated coordinate basis calculation system that separates the topology of surfaces from the skeleton, enabling automatic continuity between different surface regions during animation.
Related Wires: A Geometric Deformation Technique · Inverse Kinematics for Reduced Deformable Models · Rig-Space Physics · Generating Upper-Body Motion for Real-Time Characters Making their Way through Dynamic Environments
How to read this
- Category
- Method: joint-dependent local deformations for hand animation
- Contributions
- Joint-dependent Local Deformation (JLD) operators that produce realistic effects like joint rounding and muscle inflation
- A mapping of surfaces onto a skeleton that separates surface topology from the skeleton itself
- A coordinate-basis calculation that handles finger and palm deformation and enforces continuity between surface regions
- Context
- Work on articulated hand deformation that is an early precursor to pose-space deformation and corrective skinning. Builds on: Interactive Skeleton Techniques for Enhancing Motion Dynamics in Key Frame Animation · A System for Computer Generated Movies
- Correctness
- Demonstrated on synthetic actor hands and grasping; a reader should note the operators are hand- and joint-specific and tuned for that domain rather than a fully general skinning solution.
- Clarity
- Moderately technical; a first pass conveys the JLD idea, a second pass is needed for the coordinate-basis machinery.
- How to read it
- Read for the lineage toward corrective and pose-space skinning: focus on what JLD operators do at the joints and how surface topology is decoupled from the skeleton; a second pass pays off if you care about the deformation math.
Skinning / Rigging
-
Spacetime optimization formulation for physics-based character animation, optimizing motions over time to satisfy user-specified constraints.
Abstract
Spacetime constraints are a method for creating character animation in which the animator specifies what the character must do, how the motion should be performed, the character's physical structure, and the physical resources available to accomplish the motion. Together with Newton's laws these requirements form a constrained optimization problem whose solution is a physically valid motion that satisfies the constraints while optimizing the given criteria. The functions for position and force are discretized and solved over the entire time interval at once using a variant of Sequential Quadratic Programming with sparse matrix techniques, supported by an object-oriented symbolic algebra system that automates the difficult task of setting up the equations. The authors demonstrate the method with a Luxo lamp performing jumps and ski jumps, showing that traditional animation principles such as anticipation, squash-and-stretch, follow-through, and timing emerge automatically from minimal kinematic constraints.
Related Physically Based Motion Transformation · Retargeting Motion to New Characters · Physics-based Motion Capture Imitation with Deep Reinforcement Learning · Robust Motion In-Betweening
How to read this
- Category
- Method: a spacetime optimization formulation for physics-based animation
- Contributions
- Spacetime constraints: posing animation as a constrained optimization over the whole time interval given goals, style, structure, and physical resources
- A discretized position-and-force solve using sparse Sequential Quadratic Programming, with a symbolic algebra system to set up the equations
- A demonstration (a Luxo lamp jumping and ski-jumping) showing anticipation, squash-and-stretch, follow-through, and timing emerge from minimal constraints
- Context
- Standalone foundational work (no prior context given) that links Newtonian physics and optimization, seeding later trajectory-optimization and physics-based motion synthesis.
- Correctness
- Solutions are physically valid motions that satisfy the constraints while optimizing a criterion; a reader should note the approach is computationally heavy, sensitive to the optimization setup, and shown on a simple character rather than full humanoids.
- Clarity
- Conceptually striking but technically demanding; a first pass conveys the idea, deeper passes are needed for the optimization and equation setup.
- How to read it
- Read for the elegant premise that animation principles fall out of optimization: focus on the constraint formulation and the Luxo result; budget a second or third pass for the SQP and symbolic-setup details if you do trajectory optimization.
Motion Synthesis
1987
2-
Parameterized face muscle model using linear and sphincter muscle types to animate facial expressions, influencing decades of facial animation research.
Abstract
Develops a parameterized muscle model for three-dimensional facial animation that controls facial expressions through muscle vectors rather than hard-wired actions. Uses action units from the Facial Action Coding System to define muscle operations with parameters for zone of influence, falloff radius, and spring constants. The model separates linear muscles that pull from sphincter muscles that squeeze, enabling animation of diverse facial types without topology-specific constraints.
Related Animating Facial Expressions · Art-Directed Muscle Simulation for High-End Facial Animation · Animatomy: An Animator-Centric, Anatomically Inspired System for 3D Facial Modeling, Animation and Transfer · Neural Face Rigging for Animating and Retargeting Facial Meshes in the Wild
How to read this
- Category
- Method: a parameterized muscle model for facial animation
- Contributions
- A parameterized muscle model that controls expressions through muscle vectors rather than hard-wired actions
- Two muscle types, linear muscles that pull and sphincter muscles that squeeze, defined via FACS action units
- Parameters for zone of influence, falloff radius, and spring constants that generalize across facial types without topology-specific constraints
- Context
- Builds directly on Platt and Badler's muscle-based facial animation (Animating Facial Expressions, 1981) and continues the FACS-driven line of facial deformation. Builds on: Animating Facial Expressions
- Correctness
- Demonstrated to animate diverse face shapes by abstracting muscles as parameterized vectors; a reader should note it is a procedural, geometry-driven approximation of muscle action rather than a physically simulated tissue model.
- Clarity
- Accessible and well structured; a first pass conveys the linear/sphincter distinction, a second pass repays the parameter definitions.
- How to read it
- Read for the durable linear-vs-sphincter muscle abstraction: focus on the muscle-vector parameters (influence zone, falloff, spring constant); a second pass pays off if you want to reimplement the deformation.
Facial / Muscles
-
The paper that brought physically based elastic simulation to graphics, ancestor of later cloth and flesh solvers.
Abstract
The theory of elasticity describes deformable materials such as rubber, cloth, paper, and flexible metals. We employ elasticity theory to construct differential equations that model the behavior of non-rigid curves, surfaces, and solids as a function of time. Elastically deformable models are active: they respond in a natural way to applied forces, constraints, ambient media, and impenetrable obstacles. The models are fundamentally dynamic and realistic animation is created by numerically solving their underlying differential equations. Thus, the description of shape and the description of motion are unified.
Related FEM Simulation of 3D Deformable Solids: A Practitioner's Guide to Theory, Discretization and Model Reduction · Loki: A Unified Multiphysics Simulation Framework for Production · The Synthesis of Cloth Objects · Fast Contact Determination for Intersecting Deformable Solids
How to read this
- Category
- Method: physically based elastic deformation for graphics
- Contributions
- Differential equations from elasticity theory that model non-rigid curves, surfaces, and solids as a function of time
- Active deformable models that respond naturally to forces, constraints, ambient media, and impenetrable obstacles
- A unified, dynamic formulation where shape and motion are described together and animated by numerically solving the equations
- Context
- Landmark work that introduces continuum elasticity to graphics and is the ancestor of later cloth and flesh solvers. Builds on: The Synthesis of Cloth Objects
- Correctness
- Grounded in classical elasticity theory and demonstrated on deformable curves, surfaces, and solids; a reader should keep in mind the results depend on numerical integration choices and the era's compute, so stability and cost are practical concerns.
- Clarity
- Conceptually accessible but mathematically heavy; a first pass conveys the active-model idea, deeper passes are needed for the PDE and numerics.
- How to read it
- Read as the root of physics-based deformation: focus on the unification of shape and motion and on what makes the models active; budget a second and third pass for the elasticity equations and the numerical solution if you work in simulation.
CFX / Muscles
1986
2-
Introduces FFD lattices: deforming models by warping the space around them, a foundation for countless rig deformers.
Abstract
Presents free-form deformation (FFD), a technique for deforming solid geometric models using trivariate tensor product Bernstein polynomials. The method enables intuitive sculpting-like manipulation of objects embedded in a flexible lattice of control points, supporting local or global deformations while maintaining derivative continuity and optionally preserving volume.
Related Extended Free-Form Deformation: A Sculpturing Tool for 3D Geometric Modeling · Direct Manipulation of Free-Form Deformations · Implicit Skinning: Real-Time Skin Deformation with Contact Modeling · Wires: A Geometric Deformation Technique
How to read this
- Category
- Method: a space-warping deformation technique (free-form deformation)
- Contributions
- Free-form deformation (FFD) that deforms solid models by warping the space around them via a lattice of control points
- Use of trivariate tensor-product Bernstein polynomials for intuitive, sculpting-like manipulation
- Support for local or global deformations with derivative continuity and optional volume preservation
- Context
- Builds on the object-deformation operators of Barr and the lattice-embedding idea of Bezier, generalizing both into arbitrary free-form warps of the embedding space; it in turn underpins countless later rig deformers and free-form modeling tools. Builds on: Global and Local Deformations of Solid Primitives · General Distortion of an Ensemble of Biparametric Surfaces
- Correctness
- The method is general and mathematically grounded in Bernstein polynomial lattices; a reader should note the deformation quality depends on lattice resolution and placement, and that it warps embedding space rather than respecting surface or anatomical structure.
- Clarity
- Accessible at the concept level but math-dense in the formulation; a first pass conveys the lattice idea, a second pass is needed for the polynomial machinery.
- How to read it
- Read for the core idea that you deform the space, not the mesh: focus on the lattice and control-point setup; a second pass on the Bernstein-polynomial trivariate formulation pays off if you will implement it.
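The lattice evaluation described in this entry can be sketched in a few lines. This is a minimal NumPy illustration under stated assumptions, not the paper's implementation: the names `rest_lattice` and `ffd`, the axis-aligned lattice box given by `origin` and `size`, and the array layout are all hypothetical choices made for the example.

```python
from math import comb

import numpy as np


def bernstein(n, i, t):
    """Bernstein basis polynomial B_{i,n}(t) for t in [0, 1]."""
    return comb(n, i) * t**i * (1.0 - t) ** (n - i)


def rest_lattice(origin, size, degree):
    """Undeformed control points on a regular grid; shape (l+1, m+1, n+1, 3)."""
    l, m, n = degree
    grid = np.stack(
        np.meshgrid(
            np.linspace(0.0, 1.0, l + 1),
            np.linspace(0.0, 1.0, m + 1),
            np.linspace(0.0, 1.0, n + 1),
            indexing="ij",
        ),
        axis=-1,
    )
    return np.asarray(origin, dtype=float) + grid * np.asarray(size, dtype=float)


def ffd(points, lattice, origin, size):
    """Map points through a trivariate Bernstein lattice (assumed box: origin, size)."""
    points = np.asarray(points, dtype=float)
    l, m, n = (d - 1 for d in lattice.shape[:3])
    # Local lattice coordinates (s, t, u) of every point, each in [0, 1].
    stu = (points - np.asarray(origin, dtype=float)) / np.asarray(size, dtype=float)
    out = np.zeros_like(points)
    for i in range(l + 1):
        bi = bernstein(l, i, stu[:, 0])
        for j in range(m + 1):
            bj = bernstein(m, j, stu[:, 1])
            for k in range(n + 1):
                bk = bernstein(n, k, stu[:, 2])
                # Tensor-product weight times the (possibly moved) control point.
                out += (bi * bj * bk)[:, None] * lattice[i, j, k]
    return out


# Usage: an undeformed lattice leaves points where they are; moving control
# points warps the embedded points with them.
origin, size = np.zeros(3), np.ones(3)
pts = np.array([[0.2, 0.5, 0.9], [0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
lattice = rest_lattice(origin, size, degree=(2, 3, 1))
same = ffd(pts, lattice, origin, size)
shifted = ffd(pts, lattice + np.array([0.0, 0.0, 1.0]), origin, size)
```

Because the Bernstein basis sums to one and reproduces linear functions, the rest lattice returns the input points unchanged, which is a convenient sanity check before moving any control points.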
Skinning
-
One of the first computer graphics cloth methods: drape a fabric hung from constraint points by approximating it with catenary curves, then relax it under gravity.
Abstract
Presents one of the first techniques for representing cloth in computer graphics. A piece of cloth is defined by a set of constraint points from which it hangs, and the surface within the convex hull of those points is approximated with catenary curves between the points. The drape is then refined by an iterative relaxation process that approximates the fabric settling under gravity, producing convincing folds and hanging cloth. It established cloth as a surface-deformation problem in graphics.
Related Elastically Deformable Models · Abstracting Rigging Concepts for a Future Proof Framework Design · A.C.M.E. Multilimb System · Premo: Powerful Character Rigging, Fast Animation
How to read this
- Category
- Foundational method: one of the first computer-graphics cloth models
- Contributions
- Represents a hanging cloth by the constraint points it hangs from
- Approximates the surface with catenary curves between points, then relaxes it under gravity
- Established cloth as a surface-deformation problem in graphics
- Context
- An early CFX root: the geometric cloth model that Terzopoulos's physically based deformable models and later cloth solvers moved beyond.
- Correctness
- A geometric approximation, not a physical simulation; folds come from catenaries and relaxation rather than true dynamics, so judge it as a first step that physically-based cloth superseded.
- Clarity
- Readable and concrete; the catenary idea is easy to follow.
- How to read it
- One pass for the catenary-plus-relaxation idea and why cloth became a graphics problem, then jump to Terzopoulos 1987 and Baraff 1998 for the physical approach.
CFX
1984
1-
Deforms solid primitives with global operators (twist, bend, taper, scale) applied as a position-dependent transform, and derives how surface normals carry through via the inverse-transpose of the deformation's Jacobian.
Abstract
Introduces a class of hierarchical deformation operators that bend, twist, taper, and scale solid primitives by applying a transformation that varies with position over the object, turning simple primitives into a wide range of new shapes. The central result is that the surface normal of an arbitrarily deformed smooth surface can be computed directly from the undeformed normal and the deformation, using the inverse transpose of the deformation's Jacobian matrix, so deformed surfaces shade correctly without re-deriving normals by hand. The operators apply globally or locally along an axis and compose hierarchically into more complex shapes.
Related Extended Free-Form Deformation: A Sculpturing Tool for 3D Geometric Modeling · Super-Helices for Predicting the Dynamics of Natural Hair · A Mass Spring Model for Hair Simulation · Differentiable Simulation of Inertial Musculotendons
How to read this
- Category
- Foundational method: global and local deformation operators for solid primitives (Barr deformations)
- Contributions
- A class of operators (twist, bend, taper, scale) that deform a primitive via a transformation that varies with position over the object
- A direct rule for carrying surface normals through an arbitrary deformation using the inverse-transpose of the deformation's Jacobian
- Hierarchical composition of global and axis-local deformations into more complex shapes
- Context
- A root of the deformation lineage with no prior archive context; it is the named precursor of free-form deformation, which generalized its object-transform operators by warping the embedding space instead.
- Correctness
- Mathematically grounded and still standard; the main limitation is that the deformations are a fixed analytic menu applied to the object rather than arbitrary free-form warps of space.
- Clarity
- Clearly written and example driven; the operator descriptions are intuitive, while the normal-transformation derivation rewards a careful second pass.
- How to read it
- First pass: the four operators and what each does to a primitive. Second pass: the Jacobian inverse-transpose result for normals, the part that makes deformed surfaces shade correctly.
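The normal rule in this entry can be checked on one operator. The sketch below is an illustration under assumptions, not the paper's code: it uses a twist about the z axis with a linear twist rate `k` (angle = k·z), and the function names are hypothetical. The deformed normal is taken as the inverse transpose of the Jacobian applied to the undeformed normal, then renormalized.

```python
import numpy as np


def twist(p, k):
    """Twist about z: rotate (x, y) by an angle k * z; z is unchanged."""
    x, y, z = p
    c, s = np.cos(k * z), np.sin(k * z)
    return np.array([x * c - y * s, x * s + y * c, z])


def twist_jacobian(p, k):
    """Analytic Jacobian of twist() with respect to (x, y, z)."""
    x, y, z = p
    c, s = np.cos(k * z), np.sin(k * z)
    return np.array(
        [
            [c, -s, -k * (x * s + y * c)],
            [s, c, k * (x * c - y * s)],
            [0.0, 0.0, 1.0],
        ]
    )


def deformed_normal(p, n, k):
    """Carry a surface normal through the twist via the inverse-transpose Jacobian."""
    J = twist_jacobian(p, k)
    m = np.linalg.inv(J).T @ np.asarray(n, dtype=float)
    return m / np.linalg.norm(m)


# Usage: a point on the side of a unit cylinder around z, with a radial normal.
k = 0.8
p = np.array([1.0, 0.0, 0.5])
n = np.array([1.0, 0.0, 0.0])
n_def = deformed_normal(p, n, k)

# Tangents are carried by J itself; the deformed normal stays perpendicular to them.
J = twist_jacobian(p, k)
t1 = J @ np.array([0.0, 1.0, 0.0])
t2 = J @ np.array([0.0, 0.0, 1.0])
```

Twisting a cylinder about its own axis leaves it a cylinder, so the deformed normal here should simply be the radial direction at the rotated point, which makes the result easy to verify by hand.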
Skinning
1981
1-
Early muscle-based facial animation system using pseudo-muscles anchored to a facial mesh to drive expression deformations.
Abstract
System for animating facial expressions using muscle-based simulation and the Facial Action Coding System (FACS). Faces are modeled as tension networks of interconnected points representing skin, muscles, and bone. Action units corresponding to minimal muscle contractions drive facial deformations, allowing complex expressions to be synthesized from basic anatomical actions.
Related A Muscle Model for Animating Three-Dimensional Facial Expression · Animatomy: An Animator-Centric, Anatomically Inspired System for 3D Facial Modeling, Animation and Transfer · Art-Directed Muscle Simulation for High-End Facial Animation · Direct Manipulation Blendshapes
How to read this
- Category
- Method: an early muscle-based facial animation system
- Contributions
- A facial model as a tension network of interconnected points representing skin, muscles, and bone
- Action units corresponding to minimal muscle contractions that drive facial deformations
- Synthesis of complex expressions by composing basic anatomical (FACS-based) actions
- Context
- Builds on Parke's parametric face model (A Parametric Model for Human Faces, 1974) and grounds expression control in Ekman's Facial Action Coding System (1978). Builds on: A Parametric Model for Human Faces · Facial Action Coding System
- Correctness
- Assumes facial motion can be approximated by a spring/tension network driven by FACS action units; a reader should treat it as a pioneering pseudo-muscle scheme rather than an anatomically exact or quantitatively validated simulation.
- Clarity
- Moderately accessible if you already know FACS; a first pass conveys the muscle-network idea, a second pass repays attention to the network formulation.
- How to read it
- Read alongside FACS: focus on how action units map to deformations of the tension network; do a second pass if you want the spring/point-network mechanics.
Facial / Muscles
1978
3-
The psychology taxonomy of facial action units that became the de facto vocabulary of every film and game facial rig.
Facial
-
Distorts embedded shapes by trapping them in a triparametric Bernstein lattice and warping that lattice, the embed-and-warp idea later formalized as free-form deformation.
How to read this
- Category
- Foundational method: an early free-form deformation precursor (shape distortion via an embedding lattice)
- Contributions
- Distorts a numerically defined object by embedding it in an auxiliary triparametric (3D) lattice of control points
- Warps that lattice with Bernstein polynomials and maps the embedded surfaces through the distorted parametric volume
- Anticipates the embed-and-warp paradigm that free-form deformation formalized eight years later
- Context
- A root of the deformation lineage with no prior archive context; it seeds the embed-and-warp idea that free-form deformation later generalized and that countless rig deformers inherited.
- Correctness
- Sound as a conceptual demonstration from CAD practice; it is a short technical note rather than a rigorous derivation, so treat it as the origin of the idea more than a complete method.
- Clarity
- Brief and high level; the lattice-distortion concept reads quickly, but the note predates the cleaner trivariate Bernstein formalism that free-form deformation would give it.
- How to read it
- Read it for one idea: trap a shape inside a control-point volume and let it follow as the volume bends. A single pass is enough; go to free-form deformation for the worked-out math.
Skinning
-
Defines Catmull-Clark subdivision surfaces: recursively refine an arbitrary-topology polygon mesh toward a smooth limit surface, the substrate later adopted for production character meshes.
Abstract
Introduces a recursive refinement scheme that generates a smooth surface from an arbitrary-topology polygonal mesh, generalizing uniform bicubic B-spline surfaces to meshes that are not regular grids. Each step splits faces and repositions points by simple averaging rules, and repeated subdivision converges to a smooth limit surface that behaves well even at extraordinary vertices. Catmull-Clark subdivision became a standard way to model smooth, deformable surfaces of arbitrary topology.
Related Subdivision Surfaces in Character Animation · Robust Treatment of Collisions, Contact and Friction for Cloth Animation
How to read this
- Category
- Foundational method: Catmull-Clark subdivision surfaces
- Contributions
- Generalizes uniform bicubic B-spline surfaces to meshes of arbitrary topology
- Simple face-split and averaging rules that converge to a smooth limit surface
- Handles extraordinary vertices, enabling smooth surfaces on irregular meshes
- Context
- A root of the smooth-surface substrate lineage; the scheme that Subdivision Surfaces in Character Animation later brought into production.
- Correctness
- Mathematically sound and now ubiquitous; the subtlety is surface behavior at extraordinary vertices, refined by much later analysis.
- Clarity
- Short and elegant; the refinement rules are simple to state, the limit-surface analysis is the deeper part.
- How to read it
- First pass: the split-and-average rules and the idea of a smooth limit from any mesh. Second pass: behavior at extraordinary vertices if you will implement it.
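The split-and-average rules can be sketched as a single refinement step. This is a minimal illustration under assumptions, not a production implementation: it handles only closed meshes (every edge shared by exactly two faces), the function name `catmull_clark` and the data layout are hypothetical, and it uses the commonly stated vertex rule (F + 2R + (n − 3)P) / n, where F is the average of adjacent face points, R the average of incident edge midpoints, and n the vertex valence.

```python
import numpy as np


def catmull_clark(verts, faces):
    """One subdivision step on a closed polygon mesh (each edge has two faces)."""
    verts = np.asarray(verts, dtype=float)
    faces = [tuple(f) for f in faces]
    V, F = len(verts), len(faces)

    # Face point: centroid of each face.
    face_pts = np.array([verts[list(f)].mean(axis=0) for f in faces])

    # Adjacency: which faces touch each edge and each vertex.
    edge_faces = {}
    vert_faces = [[] for _ in range(V)]
    for fi, f in enumerate(faces):
        for a, b in zip(f, f[1:] + f[:1]):
            edge_faces.setdefault(frozenset((a, b)), []).append(fi)
        for v in f:
            vert_faces[v].append(fi)

    # Edge point: average of the two endpoints and the two adjacent face points.
    edge_id, edge_pts = {}, []
    vert_mids = [[] for _ in range(V)]
    for e, (f0, f1) in edge_faces.items():
        a, b = tuple(e)
        edge_id[e] = V + F + len(edge_pts)
        edge_pts.append((verts[a] + verts[b] + face_pts[f0] + face_pts[f1]) / 4.0)
        mid = (verts[a] + verts[b]) / 2.0
        vert_mids[a].append(mid)
        vert_mids[b].append(mid)

    # Original vertices move to (F + 2R + (n - 3) P) / n.
    moved = np.empty_like(verts)
    for v in range(V):
        n = len(vert_faces[v])
        avg_face = face_pts[vert_faces[v]].mean(axis=0)
        avg_mid = np.mean(vert_mids[v], axis=0)
        moved[v] = (avg_face + 2.0 * avg_mid + (n - 3) * verts[v]) / n

    # Each k-sided face splits into k quads around its face point.
    new_verts = np.vstack([moved, face_pts, np.array(edge_pts)])
    new_faces = []
    for fi, f in enumerate(faces):
        k = len(f)
        for i in range(k):
            a, b, prev = f[i], f[(i + 1) % k], f[i - 1]
            new_faces.append(
                (a, edge_id[frozenset((a, b))], V + fi, edge_id[frozenset((prev, a))])
            )
    return new_verts, new_faces


# Usage: one step on a cube with corners at (+-1, +-1, +-1).
cube_verts = [(-1, -1, -1), (1, -1, -1), (1, 1, -1), (-1, 1, -1),
              (-1, -1, 1), (1, -1, 1), (1, 1, 1), (-1, 1, 1)]
cube_faces = [(0, 3, 2, 1), (4, 5, 6, 7), (0, 1, 5, 4),
              (1, 2, 6, 5), (2, 3, 7, 6), (3, 0, 4, 7)]
new_verts, new_faces = catmull_clark(cube_verts, cube_faces)
```

The output lists the moved original vertices first, then the face points, then the edge points, so repeated calls can be chained to approach the limit surface.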
Skinning
1976
1- Interactive Skeleton Techniques for Enhancing Motion Dynamics in Key Frame Animation SIGGRAPH Academic 195 cites
Early interactive skeleton system for keyframe animation using interpolation between poses, foundational for articulated character animation tools.
Abstract
A significant increase in the capability for controlling motion dynamics in key frame animation is achieved through skeleton control. This technique allows an animator to develop a complex motion sequence by animating a stick figure representation of an image. This control sequence is then used to drive an image sequence through the same movement. The simplicity of the stick figure image encourages a high level of interaction during the design stage. Its compatibility with the basic key frame animation technique permits skeleton control to be applied selectively to only those components of a composite image sequence that require enhancement.
Related Optimal and Interactive Keyframe Selection for Motion Capture · Green Coordinates · Learning an Intrinsic Garment Space for Interactive Authoring of Garment Animation · Build Your Own Procedural Grooming Pipeline
How to read this
- Category
- Method: an interactive skeleton control technique for keyframe animation
- Contributions
- Skeleton control that lets an animator drive a complex motion sequence by animating a stick-figure representation
- A simple stick-figure interface that encourages high interaction during the design stage
- Selective application of skeleton control to only the components of a composite sequence that need enhancement
- Context
- Standalone early work (no prior context given) that extends basic keyframe animation and prefigures articulated, skeleton-driven character animation tools.
- Correctness
- Built to be compatible with existing keyframe practice and demonstrated as an interactive design aid; a reader should remember it predates modern skinning and treats motion at the level of a 2D stick figure rather than full 3D rigs.
- Clarity
- Accessible and concept-forward; a single first pass conveys the core idea of skeleton-driven motion.
- How to read it
- Read for the lineage of skeleton-based animation tooling: focus on how the stick figure drives the image sequence and the emphasis on interactivity; a first pass is enough.
Rigging
1974
1-
First parameterized 3D face model enabling controllable facial expression animation through a set of continuous deformation parameters.
Facial
1972
2-
The system behind the Computer Animated Hand, one of the first 3D computer character animations: Catmull digitized a model of his own hand into roughly 350 polygons and animated it, an origin point for articulated 3D characters.
Abstract
A 1972 short film and the system behind it, made by Edwin Catmull and Fred Parke at the University of Utah for a graduate course. Catmull built a model of his own left hand, drew roughly 350 triangles and polygons on its surface, digitized that data into a three-dimensional model, and animated it with a 3D animation program he wrote. It is widely regarded as one of the first examples of 3D computer character animation and the first animated articulated hand, documented in Catmull's 1972 paper A System for Computer Generated Movies.
Related Sketch-based Motion Editing for Articulated Characters · Stretchable and Twistable Bones for Skeletal Shape Deformation · Physics-based Character Skinning using Multi-Domain Subspace Deformations
How to read this
- Category
- Foundational work: one of the first 3D computer character animations (the Computer Animated Hand)
- Contributions
- Digitizes a physical model of Catmull's own left hand into roughly 350 polygons to build a 3D model
- Animates that articulated hand with a purpose-written three-dimensional animation program
- Stands as an origin point for 3D computer character animation and articulated characters
- Context
- A root of the archive timeline with no prior context; the first animated 3D hand, which the later joint-dependent hand-deformation work descends from.
- Correctness
- A pioneering demonstration rather than a method paper; its significance is historical (proof that articulated 3D characters could be modeled and animated at all), so judge it as an origin, not a technique to reuse.
- Clarity
- Best understood through the film and Catmull's 1972 paper A System for Computer Generated Movies; the ideas are simple to follow, the achievement is in having done it first.
- How to read it
- Watch the short and skim the 1972 paper for the workflow: physical model, hand-drawn polygons, digitize, animate. One pass is plenty; read it for where the lineage starts.
Rigging
-
First known computer-generated 3D facial animation, built on a polygonal model of a human face animated by interpolating between keyframe expressions.
Abstract
This foundational paper presents techniques for representing, animating, and acquiring 3D data for realistic computer-generated facial animation. The face is approximated with a polygonal mesh of approximately 250 polygons defined by 400 vertices, laid out to support natural deformation. Animation is achieved through cosine interpolation of vertex positions between keyframe expressions to simulate acceleration and deceleration of facial motion. Three-dimensional coordinates for facial expressions are obtained photogrammetrically from orthogonal photograph pairs, with the approach producing realistic animated sequences of facial performance.
Related Direct Manipulation Blendshapes · Performance-Driven Facial Animation · Realtime Performance-Based Facial Animation · High Fidelity Facial Animation Capture and Retargeting with Contours
How to read this
- Category
- Method: a polygonal facial model and the first CG facial animation
- Contributions
- A polygonal face representation (around 250 polygons, 400 vertices) laid out to support natural deformation
- Keyframe animation via cosine interpolation of vertex positions to simulate acceleration and deceleration
- Photogrammetric acquisition of 3D facial coordinates from orthogonal photograph pairs
- Context
- Foundational, standalone work (no prior context given) that originates 3D facial animation and seeds later parametric and muscle-based face models.
- Correctness
- Demonstrated by producing realistic animated facial sequences from photographed expressions; a reader should note it is an early, low-resolution mesh approach with manual keyframing rather than a validated, general-purpose system.
- Clarity
- Accessible as a historical primer; a first pass conveys the idea, a second pass repays attention to the interpolation and photogrammetric setup.
- How to read it
- Read it as the origin point of the field: focus on the mesh layout, the cosine interpolation choice, and the photo-pair acquisition; one pass suffices unless you want the acquisition details.
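The cosine interpolation mentioned in this entry can be sketched as follows. The exact formula used in the paper is not given here, so this assumes the common ease-in/ease-out form w(t) = (1 − cos(πt)) / 2; the function names and the tiny two-vertex "expressions" are hypothetical.

```python
import numpy as np


def cosine_weight(t):
    """Blend weight that is 0 at t=0, 1 at t=1, with zero slope at both ends."""
    return (1.0 - np.cos(np.pi * t)) / 2.0


def interpolate_expression(key_a, key_b, t):
    """Blend per-vertex positions of two keyframe expressions at time t in [0, 1]."""
    w = cosine_weight(t)
    return (1.0 - w) * np.asarray(key_a, dtype=float) + w * np.asarray(key_b, dtype=float)


# Usage: two toy "expressions" of a two-vertex mesh.
neutral = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
smile = np.array([[0.0, 0.2, 0.0], [1.0, 0.3, 0.1]])
quarter = interpolate_expression(neutral, smile, 0.25)
halfway = interpolate_expression(neutral, smile, 0.5)
```

Compared with linear blending, the weight lags behind t early in the interval and catches up later, which is what produces the acceleration and deceleration the entry describes.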
Facial
Sources
Per-year link lists for every SIGGRAPH, SIGGRAPH Asia and Eurographics technical-papers program.
The Medusa and Anyma facial capture line, rig-space physics, and character simulation.
WDAS production papers on hair, cloth, and skinning from Tangled through Moana 2.
Loki, Animatomy, and the systems behind the Avatar-era face and creature pipelines.
Character articulation, cloth and hair, and the deformation papers behind Presto.
Masquerade and Charlatan: the face pipelines behind Thanos and digital doubles.
Tables of contents for the production and animation-specific venues.
The talks that moved game animation: motion matching, IK rigs, facial pipelines.
Studio breakdowns of creature, face, and CFX pipelines from the Stuttgart conference.
Learned motion matching, motion in-betweening, and physics-based characters.
MetaHuman, Control Rig, the ML Deformer, Chaos Cloth, and Motion Matching talks.
KineFX and APEX rigging, ML skinning, muscle and tissue, and crowd character FX.
Delta Mush, Proximity Wrap, Bifrost rigging, HumanIK, and the Maya ML Deformer.
Preprints for the ML character animation wave.