4D Digital Twins: Real-to-Sim-to-Real for Physical AI

CVPR 2026 Workshop

Half-Day Workshop (Date/Time TBD)

About the Workshop

In this workshop, we will explore graphics-in-the-loop, physically grounded 4D reconstruction for physical AI digital twins. We aim to encourage discourse in the research community to tackle the real-to-sim-to-real challenge by bridging graphics, vision, and physical modeling.

This workshop will address three key challenges:

  • Reconstructing sim-ready representations of the dynamic world
  • Robustly training physical AI agents within these environments
  • Deploying learned policies back into the physical world

It aims to foster the exchange of ideas among researchers across the vision, graphics, and robotics communities, tackling the interdisciplinary challenges on the path to embodied physical AI.

Background

Physical AI represents the next frontier of artificial intelligence: embodied agents that learn by interacting with the physical world. Training physical AI foundation models, however, demands photorealistic, physically accurate interaction data at a scale and diversity that is implausible to collect in the real world.

Advances in 4D neural representations, such as neural radiance fields and Gaussian splatting, enable unprecedented visual realism, but these representations lack the physical grounding needed for sim-ready use. While video diffusion models produce visually impressive results, they fundamentally lack physical understanding, exhibiting what researchers term "case-based" generalization rather than true physical dynamics. More critically, they do not provide the rich multimodal feedback (e.g., tactile signals and collision physics) required of physically grounded simulators for embodied AI.

Graphics-in-the-Loop Approach

Physics-based simulation through traditional computer graphics offers a practical alternative, yet persistent sim-to-real gaps limit its effectiveness. Combining traditional computer graphics with 4D neural representations offers new possibilities for interactive, physically grounded environments for embodied AI.

Computer graphics brings decades of expertise in modeling physics through rigorous frameworks (e.g., rigid-body dynamics and collision detection, soft-body deformation, fluid simulation), contributing a level of physical fidelity that purely data-driven models struggle to replicate. These graphics simulators naturally support multimodal sensory feedback, such as appearance changes, deformations, contact forces, and tactile responses, which is critical for manipulation and embodied learning.

Moreover, the explicit geometric and physical representations used in graphics enable the controlled, agent-in-the-loop experimentation essential for reinforcement learning. By grounding neural 4D representations with graphics-inspired physics, we can unite the photorealistic rendering of neural methods with real-world physical consistency, narrowing the sim-to-real gap and enabling more robust policy transfer.

Workshop Topics

  • Real-to-sim-to-real learning with physical grounding
  • Photorealistic and sim-ready 4D neural scene reconstruction
  • Integrating physics-based modeling from traditional computer graphics with 4D neural scene representations
  • Physical AI policy learning, including cross-domain adaptation, transfer learning across environments, and closing the sim-to-real gap
  • Data-efficient and scalable 4D neural representations for embodied AI, from capture to reconstruction to transmission
  • Editable and generalizable 4D neural scenes for simulation and control (e.g., relighting and object decomposition)

We will host invited speakers and solicit poster presentations from accepted papers.

Keynote Speakers

Distinguished researchers in 4D vision, neural rendering, and embodied AI

Katerina Fragkiadaki
Carnegie Mellon University
Hanbyul Joo
Seoul National University
Xiaolong Wang
University of California, San Diego
Gordon Wetzstein
Stanford University

Schedule

Coming soon!

Organizers

Amrita Mazumdar
NVIDIA Research
Tianye Li
NVIDIA Research