The 1st International Workshop on Interactive Physical AI (IPA 2026) at CVPR 2026 will bring together researchers from computer vision, robotics, and multimodal AI. It will provide the first comprehensive forum to address the full scope of interactive physical AI systems, building on prior workshops that have explored subsets of this space. Workshop topics include (but are not limited to):
- Human-AI interaction in physical environments
- Embodied conversational AI and multimodal learning
- Full-duplex multimodal conversational models
- Social intelligence and communication for robots and avatars
- Egocentric vision and first-person perception
- Real-time audio-visual processing for interactive systems
- Safe and cooperative human-robot interaction
- Personalization and lifelong learning for physical AI
- Privacy-aware learning in interactive settings
- Physically authentic perception and generation for avatars and agents
We will host invited speakers and accept submissions of full, unpublished papers. Submissions will be peer-reviewed through a double-blind process; accepted papers will be published in the official CVPR 2026 workshop proceedings and presented at the workshop.
Advances in multimodal learning, embodied intelligence, and conversational AI are transforming how humans interact with intelligent systems situated alongside us in the physical world. We define such systems as Interactive Physical AI (IPA). IPA systems simultaneously:
- Perceive humans and scenes using audio-visual signals
- Generate communication signals via verbal and nonverbal behaviors (speech, prosody, backchannels, visual cues such as gaze and gestures)
- Act safely and effectively under physical-world constraints in shared spaces
Embodiments of IPA include:
- Robots (both humanoids and non-humanoids)
- Physically-grounded and environment-aware avatars (e.g., AR telepresence)
- On-device audio-visual agents
Call for Contributions
Submission: We invite authors to submit unpublished papers (8-page CVPR format) to our workshop; accepted papers will be presented at a poster session. All submissions will go through a double-blind review process. All contributions, along with any supplementary materials, must be submitted on OpenReview (the link will be provided soon).
Accepted papers will be published in the official CVPR Workshops proceedings and the Computer Vision Foundation (CVF) Open Access archive.
Note: Authors of previously rejected main conference submissions are also welcome to submit their work to our workshop. When doing so, you must include the previous reviewers' comments (named previous_reviews.pdf) and a letter of changes (named letter_of_changes.pdf) in your supplementary materials to clearly demonstrate the changes made to address those comments.
NVIDIA Research
NVIDIA Research
NVIDIA Research
Carnegie Mellon University
NVIDIA Research
