In this work, we present Conditional Adversarial Latent Models (CALM), an approach for generating diverse and directable behaviors for user-controlled interactive virtual characters.
Using imitation learning, CALM learns a representation of movement that captures the complexity and diversity of human motion, and enables direct control over character movements.
The approach jointly learns a control policy and a motion encoder that reconstructs key characteristics of a given motion without merely replicating it.
The results show that CALM learns a semantic motion representation, enabling control over the generated motions and style-conditioning for higher-level task training.
Once trained, the character can be controlled using intuitive interfaces, akin to those found in video games.
Overview
To achieve zero-shot task solutions, CALM consists of three phases:
(1) A motion encoder and a low-level policy (decoder) are jointly trained to map a motion-capture sequence to the actions that drive the simulated character.
(2) A high-level policy is trained with latent-space conditioning to control the direction in which a motion is performed, while retaining the requested style.
(3) Phases 1 and 2 are combined using a simple finite-state machine to solve tasks without further training and without meticulous reward or termination design.
Phase 1: Low-level Training
Meaningful Motion Representations
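As described in the overview, the first phase jointly trains a motion encoder and a low-level policy (decoder): the encoder maps a motion-capture clip to a latent code, and the policy decodes that latent, together with the current character state, into actions. The sketch below illustrates these two components in PyTorch; the network sizes, the unit-norm latent space, and all names are illustrative assumptions rather than the released architecture, and the adversarial objective used to train them against motion-capture data is omitted for brevity.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MotionEncoder(nn.Module):
    """Maps a short motion clip (clip_len x obs_dim) to a unit-norm latent z."""
    def __init__(self, obs_dim, clip_len, latent_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim * clip_len, 512), nn.ReLU(),
            nn.Linear(512, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),
        )

    def forward(self, clip):
        z = self.net(clip.flatten(start_dim=1))
        return F.normalize(z, dim=-1)  # keep latents on the unit hypersphere

class LowLevelPolicy(nn.Module):
    """Decodes (character state, latent z) into joint actions."""
    def __init__(self, state_dim, latent_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + latent_dim, 1024), nn.ReLU(),
            nn.Linear(1024, 512), nn.ReLU(),
            nn.Linear(512, action_dim),
        )

    def forward(self, state, z):
        return self.net(torch.cat([state, z], dim=-1))

# Usage: encode a reference clip, then condition the policy on its latent.
encoder = MotionEncoder(obs_dim=60, clip_len=30)
policy = LowLevelPolicy(state_dim=120, latent_dim=64, action_dim=28)
clip = torch.randn(1, 30, 60)            # one reference motion clip
z = encoder(clip)                        # latent describing that motion
action = policy(torch.randn(1, 120), z)  # action for the current character state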
Phase 2: Directionality Control
To control motion direction, we train a high-level task-driven policy to select latent variables.
These latents are provided to the low-level policy, which generates the requested motion.
Here, the learned motion representation enables a form of style-conditioning.
To achieve this, the motion encoder is used to obtain the latent representation of the requested motion.
The high-level policy is then given an additional reward proportional to the cosine similarity between the selected latents and the latent representing the requested style, guiding it to adopt the desired behavioral style.
For example, here a directional controller is trained to control both the form of locomotion performed and the direction in which the character performs it: crouch-walk, walk shield-up, and run.
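As a concrete illustration of the style-conditioning reward described above, the sketch below adds the cosine similarity between the latent selected by the high-level policy and the latent encoding of the requested motion to the task reward. The function names and the weighting are assumptions made for illustration, not the paper's exact reward formulation.

import torch
import torch.nn.functional as F

def style_reward(selected_z, style_z):
    """Cosine similarity between the selected latent and the requested style's latent."""
    return F.cosine_similarity(selected_z, style_z, dim=-1)

def total_reward(task_reward, selected_z, style_z, style_weight=0.5):
    # Task term (e.g., moving in the commanded direction) plus a style term that
    # keeps the selected latents close to the latent of the requested motion.
    return task_reward + style_weight * style_reward(selected_z, style_z)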
Phase 3: Inference
Finally, the previously trained models (low-level policy and directional controller) are combined to compose complex movements without additional training.
To do so, the user produces a finite-state machine (FSM) containing standard rules and commands.
These determine which motion to perform, similar to how a user controls a video game character.
For example, they determine whether the character should perform a simple motion, executed directly by the low-level policy, or a directed motion that requires high-level control.
As an example, one may construct an FSM like (a) "crouch-walk towards the target, until distance < 1m", then (b) "kick", and finally (c) "celebrate".
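The sketch below expresses that example FSM in code. The hooks high_level_step, low_level_step, and distance_to_target are hypothetical stand-ins for the trained directional controller, the low-level policy, and a simulator query; they are not part of a released API.

def run_fsm(env, high_level_step, low_level_step, distance_to_target):
    """Crouch-walk toward the target, kick when within 1 m, then celebrate."""
    obs = env.reset()
    state = "approach"
    while True:
        if state == "approach":
            # Directed motion: the directional controller selects latents that
            # steer a crouch-walk toward the target.
            obs = high_level_step(obs, motion="crouch-walk", target="target")
            if distance_to_target(obs) < 1.0:  # rule: switch when closer than 1 m
                state = "kick"
        elif state == "kick":
            # Simple motion: executed directly by the low-level policy.
            obs, done = low_level_step(obs, motion="kick")
            if done:
                state = "celebrate"
        elif state == "celebrate":
            obs, done = low_level_step(obs, motion="celebrate")
            if done:
                break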
Citation
@inproceedings{tessler2023calm,
  author    = {Tessler, Chen and Kasten, Yoni and Guo, Yunrong and Mannor, Shie and Chechik, Gal and Peng, Xue Bin},
  title     = {CALM: Conditional Adversarial Latent Models for Directable Virtual Characters},
  year      = {2023},
  isbn      = {9798400701597},
  publisher = {Association for Computing Machinery},
  address   = {New York, NY, USA},
  url       = {https://doi.org/10.1145/3588432.3591541},
  doi       = {10.1145/3588432.3591541},
  booktitle = {ACM SIGGRAPH 2023 Conference Proceedings},
  keywords  = {reinforcement learning, animated character control, adversarial training, motion capture data},
  location  = {Los Angeles, CA, USA},
  series    = {SIGGRAPH '23}
}
Paper
CALM: Conditional Adversarial Latent Models for Directable Virtual Characters
Chen Tessler, Yoni Kasten, Yunrong Guo, Shie Mannor, Gal Chechik, and Xue Bin Peng