



Introduction

The 1st International Workshop on Interactive Physical AI (IPA 2026) at CVPR 2026 will bring together researchers from computer vision, robotics, and multimodal AI. Building on prior workshops that have explored subsets of this space, it provides the first comprehensive forum to address the full scope of interactive physical AI systems. The workshop topics include (but are not limited to):

  • Human-AI interaction in physical environments
  • Embodied conversational AI and multimodal learning
  • Full-duplex multimodal conversational models
  • Social intelligence and communication for robots and avatars
  • Egocentric vision and first-person perception
  • Real-time audio-visual processing for interactive systems
  • Safe and cooperative human-robot interaction
  • Personalization and lifelong learning for physical AI
  • Privacy-aware learning in interactive settings
  • Physically authentic perception and generation for avatars and agents

We will host invited speakers and also accept submissions of full, unpublished papers. These papers will be peer-reviewed through a double-blind process, published in the official CVPR 2026 workshop proceedings, and presented at the workshop itself.

What is Interactive Physical AI?

Advances in multimodal learning, embodied intelligence, and conversational AI are transforming how humans interact with intelligent systems situated alongside them in the physical world. We define such systems as Interactive Physical AI (IPA). IPA systems simultaneously:

  1. Perceive humans and scenes using audio-visual signals
  2. Generate communication signals via verbal and nonverbal behaviors (speech, prosody, backchannels, visual cues such as gaze and gestures)
  3. Act safely and effectively under physical-world constraints in shared spaces

Embodiments of IPA that interact with humans in the physical world include:

  • Robots (both humanoid and non-humanoid)
  • Physically grounded, environment-aware avatars (e.g., AR telepresence)
  • On-device audio-visual agents


Call for Papers

Submission: We invite authors to submit unpublished papers (8-page CVPR format) to our workshop; accepted papers will be presented in a poster session. All submissions will go through a double-blind review process. All contributions must be submitted (along with supplementary materials, if any) on OpenReview (the link will be provided soon).

Accepted papers will be published in the official CVPR Workshops proceedings and the Computer Vision Foundation (CVF) Open Access archive.

Note: Authors of previously rejected main conference submissions are also welcome to submit their work to our workshop. When doing so, you must include the previous reviewers' comments (named previous_reviews.pdf) and a letter of changes (named letter_of_changes.pdf) in your supplementary materials to clearly demonstrate how those comments have been addressed.



Important Dates


Paper Submission Deadline: February 28, 2026 (23:59 PST)
Notification to Authors: March 20, 2026
Camera-Ready Deadline: April 10, 2026


Schedule

Schedule to be announced.


Keynote Speakers


Yaser Sheikh
Professor at Carnegie Mellon University
ex-VP at Meta
Founder of AI Venture in Stealth

Yaser Sheikh builds frontier systems that enable machines to perceive and predict. He is currently a Consulting Professor at Carnegie Mellon University and the founder of a new AI venture focused on foundational advances in long-horizon foresight.

Previously, he served as a Vice President at Meta (2015–2025), where he founded the Meta Reality Lab in Pittsburgh and led the invention and productization of Codec Avatars, a breakthrough in real-time, photorealistic telepresence that will usher in the next generation of global communication.

Before Meta, he spent over a decade as faculty at CMU's Robotics Institute (2006–2019), where his group developed fundamental advances in machine perception, including OpenPose and the Panoptic Studio, systems that reshaped how AI understands human motion and behavior.

More speakers to be announced.


Organizers

Leena Mathur
Carnegie Mellon University
Koki Nagano
NVIDIA



Workshop sponsored by: