Listen to a high-level overview of our MaskedManipulator system.
We tackle the challenge of synthesizing versatile, physically simulated human motions for full-body object manipulation. Unlike prior methods focused on detailed motion tracking, trajectory following, or teleoperation, our framework enables users to specify high-level objectives such as target object poses or body poses. To achieve this, we introduce MaskedManipulator, a generative control policy distilled from a tracking controller trained on large-scale human motion capture data. This two-stage learning process allows the system to perform complex interaction behaviors while providing intuitive user control over both character and object motions. MaskedManipulator produces goal-directed manipulation behaviors that expand the scope of interactive animation systems beyond task-specific solutions.
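As a rough illustration of the second stage, below is a minimal sketch of how a masked generative policy might be distilled from a tracking expert in a PyTorch-style setup. The names (tracker, student, env.rollout) and the 0.5 masking rate are hypothetical stand-ins for this sketch, not the paper's released code.

import torch
import torch.nn.functional as F

def distill_step(tracker, student, env, optimizer, horizon=32):
    # One schematic distillation step: roll out the pretrained tracking
    # expert on mocap-derived goals, then train the student to reproduce
    # its actions while seeing only a randomly masked subset of those goals.
    obs, full_goals = env.rollout(tracker, horizon)  # hypothetical helper
    with torch.no_grad():
        expert_actions = tracker.act(obs, full_goals)

    # Randomly hide goal entries (joint / object targets) so the student
    # learns to act under partial or absent constraints.
    keep = torch.rand(full_goals.shape[:-1]) < 0.5   # illustrative rate
    mask = keep.unsqueeze(-1)
    masked_goals = torch.where(mask, full_goals, torch.zeros_like(full_goals))

    student_actions = student.act(obs, masked_goals, keep)
    loss = F.mse_loss(student_actions, expert_actions)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

Repeating this step over many rollouts yields a single policy that can be queried with any subset of goals, from full-body tracking down to no constraints at all.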
The first application we consider is motion tracking. Here, MaskedManipulator is provided a set of target joint positions and/or orientations and must generate a full-body motion that is consistent with these constraints. A common example of such a task is scene retargeting, where the goal is to reproduce a reference motion in a new scene.
Provided motion capture recordings, our method is able to reconstruct them in a physically plausible way.
The second application we consider is sparse tracking. Here, MaskedManipulator is provided a sparse set of target joint and/or object positions and must generate a full-body motion that is consistent with these constraints. A common example of such a task is teleoperation, where the goal is to generate plausible full-body motion inferred from sensors located on a VR headset and hand controllers.
Provided only a subset of the joints, our method is able to reconstruct plausible full-body motions. Here we showcase tracking from head, hand, and object constraints (akin to VR tracking).
MaskedManipulator can be conditioned on future object positions. Here, the goal is to move the object to the target position by the specified time.
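To make this kind of conditioning concrete, here is one illustrative way a sparse, time-stamped goal could be assembled before being passed to the policy. The joint indices, array layout, and field names below are assumptions for the sketch, not the system's actual interface.

import numpy as np

NUM_JOINTS = 24                     # assumed skeleton size
SLOTS = NUM_JOINTS + 1              # one extra slot for the object
goal = np.zeros((SLOTS, 3))         # xyz target per slot
mask = np.zeros(SLOTS, dtype=bool)  # which targets are constrained

HEAD, LEFT_HAND, RIGHT_HAND, OBJECT = 0, 1, 2, NUM_JOINTS  # illustrative indices

# VR-style sparse tracking: only head and hand positions are specified.
for idx, pos in [(HEAD,       [0.0,   0.0,  1.70]),
                 (LEFT_HAND,  [0.30,  0.20, 1.20]),
                 (RIGHT_HAND, [-0.30, 0.20, 1.20])]:
    goal[idx], mask[idx] = pos, True

# Future object goal: the object should reach this position...
goal[OBJECT], mask[OBJECT] = [1.0, 0.5, 0.8], True
target_time = 2.0                   # ...by this time offset, in seconds

# All other slots stay masked out; the policy is free to fill them in.

In this sketch, masking out every slot would correspond to the fully generative case described next.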
The third application we consider is generative behavior. Here, MaskedManipulator is not provided any constraints and must generate a full-body motion that is consistent with the object in front of it.
When no goal is provided, MaskedManipulator produces natural behavior that best matches the object in front of it.
@inproceedings{tessler2025maskedmanipulator,
  author    = {Tessler, Chen and Jiang, Yifeng and Coumans, Erwin and Luo, Zhengyi and Chechik, Gal and Peng, Xue Bin},
  title     = {MaskedManipulator: Versatile Whole-Body Manipulation},
  year      = {2025},
  booktitle = {ACM SIGGRAPH Asia 2025 Conference Proceedings}
}
MaskedManipulator: Versatile Whole-Body Manipulation
Chen Tessler, Yifeng Jiang, Erwin Coumans, Zhengyi Luo, Gal Chechik, and Xue Bin Peng