Publications | NVIDIA Learning and Perception Research

Stathi Fotiadis, Noah Brenowitz, Tomas Geffner, Yair Cohen, Mike Pritchard, Arash Vahdat, Morteza Mardani

July 2025 International Conference on Machine Learning (ICML)

Adaptive Flow Matching for Resolving Small-Scale Physics

arXiv

Yiqing Liang, Abhishek Badki, Hang Su, James Tompkin, Orazio Gallo

June 2025 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Best paper award candidate, Oral

Zero-Shot Monocular Scene Flow Estimation in the Wild

arXiv

Xueting Li, Ye Yuan, Shalini De Mello, Gilles Daviet, Jonathan Leaf, Miles Macklin, Jan Kautz, Umar Iqbal

June 2025 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

SimAvatar: Simulation-Ready Avatars with Layered Hair and Clothing

arXiv Website

Abhiram Maddukuri, Zhenyu Jiang, Lawrence Yunliang Chen, Soroush Nasiriany, Yuqi Xie, Yu Fang, Wenqi Huang, Zu Wang, Zhenjia Xu, Nikita Chernyadev, Scott Reed, Ken Goldberg, Ajay Mandlekar, Linxi Fan, Yuke Zhu

June 2025 Robotics: Science and Systems (RSS)

Sim-and-Real Co-Training: A Simple Recipe for Vision-Based Robotic Manipulation

arXiv Website

Baifeng Shi, Boyi Li, Han Cai, Yao Lu, Sifei Liu, Marco Pavone, Jan Kautz, Song Han, Trevor Darrell, Pavlo Molchanov, Hongxu (Danny) Yin

June 2025 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Highlight

Scaling Vision Pre-Training to 4K Resolution

arXiv Website

Chan Hee Song, Valts Blukis, Jonathan Tremblay, Stephen Tyree, Yu Su, Stan Birchfield

June 2025 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

RoboSpatial: Teaching Spatial Understanding to 2D and 3D Vision-Language Models for Robotics

arXiv

Jiarui Xu, Shihao Han, Karan Dalal, Daniel Koceja, Xinhao Li, Yue Zhao, Ka Chun Cheung, Yejin Choi, Jan Kautz, Sifei Liu, Yu Sun, Xiaolong Wang

June 2025 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Recreating 1940s Tom and Jerry with Test-Time Training

Greg Heinrich, Mike Ranzinger, Hongxu (Danny) Yin, Yao Lu, Jan Kautz, Bryan Catanzaro, Andrew Tao, Pavlo Molchanov

June 2025 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

RADIO Amplified: Improved Baselines for Agglomerative Vision Foundation Models

arXiv

Hongjun Wang, Wonmin Byeon, Jiarui Xu, Jinwei Gu, Ka Chun Cheung, Xiaolong Wang, Kai Han, Jan Kautz, Sifei Liu

June 2025 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Parallel Sequence Modeling via Generalized Spatial Propagation Network (GSPN)

arXiv Website

Shihao Wang, Zhiding Yu, Xiaohui Jiang, Shiyi Lan, Min Shi, Nadine Chang, Jan Kautz, Ying Li, Jose M. Alvarez

June 2025 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

OmniDrive: A Holistic Vision-Language Dataset for Autonomous Driving with Counterfactual Reasoning

arXiv

Miran Heo, Min-Hung Chen, De-An Huang, Sifei Liu, Subhashree Radhakrishnan, Seon Joo Kim, Yu-Chiang Frank Wang, Ryo Hachiuma

June 2025 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks

arXiv Website

Zhijian Liu, Ligeng Zhu, Baifeng Shi, Zhuoyang Zhang, Yuming Lou, Shang Yang, Haocheng Xi, Shiyi Cao, Yuxian Gu, Dacheng Li, Xiuyu Li, Yunhao Fang, Yukang Chen, Cheng-Yu Hsieh, De-An Huang, An-Chieh Cheng, Vishwesh Nath, Andriy Myronenko, Jinyi Hu, Sifei Liu, Ranjay Krishna, Daguang Xu, Xiaolong Wang, Pavlo Molchanov, Jan Kautz, Hongxu (Danny) Yin, Song Han, Yao Lu

June 2025 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

NVILA: Efficient Frontier Visual Language Models

arXiv

An-Chieh Cheng, Yandong Ji, Zhaojing Yang, Zaitan Gongye, Xueyan Zou, Jan Kautz, Erdem Biyik, Hongxu (Danny) Yin, Sifei Liu, Xiaolong Wang

June 2025 Robotics: Science and Systems (RSS)

NaVILA: Legged Robot Vision-Language-Action Model for Navigation

arXiv Website

Junha Lee, Chunghyun Park, Jaesung Choe, Yu-Chiang Frank Wang, Jan Kautz, Minsu Cho, Christopher Choy

June 2025 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Mosaic3D: Foundation Dataset and Model for Open-Vocabulary 3D Segmentation

arXiv Website

Ali Hatamizadeh, Jan Kautz

June 2025 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

MambaVision: A Hybrid Mamba-Transformer Vision Backbone

arXiv

Bowen Wen, Matthew Trepte, Oluwaseun Joseph Aribido, Jan Kautz, Orazio Gallo, Stan Birchfield

June 2025 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Best paper award candidate, Oral

FoundationStereo: Zero-Shot Stereo Matching

arXiv

Weixi Feng, Chao Liu, Sifei Liu, William Yang Wang, Arash Vahdat, Weili Nie

June 2025 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

BlobGEN-Vid: Compositional Text-to-Video Generation with Blob Video Representations

arXiv Website

Tairan He, Jiawei Gao, Wenli Xiao, Yuanhang Zhang, Zi Wang, Jiashun Wang, Zhengyi Luo, Guanqi He, Nikhil Sobanbabu, Chaoyi Pan, Zeji Yi, Guannan Qu, Kris Kitani, Jessica Hodgins, Linxi "Jim" Fan, Yuke Zhu, Changliu Liu, Guanya Shi

June 2025 Robotics: Science and Systems (RSS)

ASAP: Aligning Simulation and Real-World Physics for Learning Agile Humanoid Whole-Body Skills

arXiv Website

Yunze Man, De-An Huang, Guilin Liu, Shiwei Sheng, Shilong Liu, Liangyan Gui, Jan Kautz, Yu-Xiong Wang, Zhiding Yu

June 2025 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Argus: Vision-Centric Reasoning with Grounded Chain-of-Thought

Shengyi Qian, Kaichun Mo, Valts Blukis, David Fouhey, Dieter Fox, Ankit Goyal

June 2025 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

3D-MVP: 3D Multiview Pretraining for Robotic Manipulation

arXiv

Cheng-Chun Hsu, Bowen Wen, Jie Xu, Yashraj Narang, Xiaolong Wang, Yuke Zhu, Joydeep Biswas, Stan Birchfield

May 2025 International Conference on Robotics and Automation (ICRA)

SPOT: SE(3) Pose Trajectory Diffusion for Object-Centric Manipulation

arXiv

Yanwei Wang, Lirui Wang, Yilun Du, Balakumar Sundaralingam, Xuning Yang, Yu-Wei Chao, Claudia Pérez-D'Arpino, Dieter Fox, Julie A. Shah

May 2025 International Conference on Robotics and Automation (ICRA)

Inference-Time Policy Steering with Human Interactions

arXiv

Neel Anand Jawale, Byron Boots, Balakumar Sundaralingam, Mohak Bhardwaj

May 2025 International Conference on Robotics and Automation (ICRA)

Dynamic Non-Prehensile Object Transport via Model-Predictive Reinforcement Learning

arXiv

Yecheng Wu, Zhuoyang Zhang, Junyu Chen, Haotian Tang, Dacheng Li, Yunhao Fang, Ligeng Zhu, Enze Xie, Hongxu (Danny) Yin, Li Yi, Song Han, Yao Lu

April 2025 International Conference on Learning Representations (ICLR)

VILA-U: Efficient and Unified Visual Language Understanding and Generation

arXiv

Sangyun Lee, Yilun Xu, Tomas Geffner, Giulia Fanti, Karsten Kreis, Arash Vahdat, Weili Nie

April 2025 International Conference on Learning Representations (ICLR)

Truncated Consistency Models

arXiv

Sulin Liu, Juno Nam, Andrew Campbell, Hannes Stark, Yilun Xu, Tommi Jaakkola, Rafael Gomez-Bombarelli

April 2025 International Conference on Learning Representations (ICLR)

Think while You Generate: Discrete Diffusion with Planned Denoising

arXiv

Zizheng Pan, Bohan Zhuang, De-An Huang, Weili Nie, Zhiding Yu, Chaowei Xiao, Jianfei Cai, Anima Anandkumar

April 2025 International Conference on Learning Representations (ICLR)

T-Stitch: Accelerating Sampling in Pre-Trained Diffusion Models with Trajectory Stitching

arXiv

Nicolas Zilberstein, Morteza Mardani, Santiago Segarra

April 2025 International Conference on Learning Representations (ICLR)

Repulsive Latent Score Distillation Sampling for Diverse Sampling of Diffusion Models

arXiv pdf

Tomas Geffner, Kieran Didi, Zuobai Zhang, Danny Reidenbach, Zhonglin Cao, Jason Yim, Mario Geiger, Christian Dallago, Emine Kucukbenli, Arash Vahdat, Karsten Kreis

April 2025 International Conference on Learning Representations (ICLR)
Oral

Proteina: Scaling Flow-based Protein Structure Generative Models

arXiv Website

Hannes Stark, Bowen Jing, Tomas Geffner, Jason Yim, Tommi Jaakkola, Arash Vahdat, Karsten Kreis

April 2025 International Conference on Learning Representations (ICLR)
Oral

ProtComposer: Compositional Protein Structure Generation with 3D Ellipsoids

arXiv Website

Ka-Hei Hui, Chao Liu, Xiaohui Zeng, Chi-Wing Fu, Arash Vahdat

April 2025 International Conference on Learning Representations (ICLR)

Not-So-Optimal Transport Flows for 3D Point Cloud Generation

arXiv

Kyle Vedder, Neehar Peri, Ishan Khatri, Siyi Li, Eric Eaton, Mehmet Kemal Kocamaz, Yue Wang, Zhiding Yu, Deva Ramanan, Joachim Pehserl

April 2025 International Conference on Learning Representations (ICLR)

Neural Eulerian Scene Flow Fields

arXiv Website

Yukang Chen, Fuzhao Xue, Dacheng Li, Qinghao Hu, Ligeng Zhu, Xiuyu Li, Yunhao Fang, Haotian Tang, Shang Yang, Zhijian Liu, Yihui He, Hongxu (Danny) Yin, Pavlo Molchanov, Jan Kautz, Linxi Fan, Yuke Zhu, Yao Lu, Song Han

April 2025 International Conference on Learning Representations (ICLR)

LongVILA: Scaling Long-Context Visual Language Models for Long Videos

arXiv

Zhifan Ye, Kejing Xia, Yonggan Fu, Xin Dong, Jihoon Hong, Xiangchi Yuan, Shizhe Diao, Jan Kautz, Pavlo Molchanov, Yingyan Celine Lin

April 2025 International Conference on Learning Representations (ICLR)

LongMamba: Enhancing Mamba's Long-Context Capabilities via Training-Free Receptive Field Enlargement

pdf

Ruisi Cai, Saurav Muralidharan, Hongxu (Danny) Yin, Zhangyang Wang, Jan Kautz, Pavlo Molchanov

April 2025 International Conference on Learning Representations (ICLR)

LlamaFlex: Many-in-One LLMs via Generalized Pruning and Weight Sharing

pdf

Hongkai Zheng, Wenda Chu, Bingliang Zhang, Zihui Wu, Austin Wang, Berthy Feng, Caifeng Zou, Yu Sun, Nikola B. Kovachki, Zachary E Ross, Katherine Bouman, Yisong Yue

April 2025 International Conference on Learning Representations (ICLR)

InverseBench: Benchmarking Plug-and-Play Diffusion Models for Scientific Inverse Problems

arXiv

Xin Dong, Yonggan Fu, Shizhe Diao, Wonmin Byeon, Zijia Chen, Ameya Sunil Mahabaleshwarkar, Shih-Yang Liu, Matthijs Van Keirsbilck, Min-Hung Chen, Yoshi Suhara, Yingyan Celine Lin, Jan Kautz, Pavlo Molchanov

April 2025 International Conference on Learning Representations (ICLR)

Hymba: A Hybrid-head Architecture for Small Language Models

arXiv

Kushagra Pandey, Jaideep Pathak, Yilun Xu, Stephan Mandt, Mike Pritchard, Arash Vahdat, Morteza Mardani

April 2025 International Conference on Learning Representations (ICLR)

Heavy-Tailed Diffusion Models

arXiv

Songlin Yang, Jan Kautz, Ali Hatamizadeh

April 2025 International Conference on Learning Representations (ICLR)

Gated Delta Networks: Improving Mamba2 with Delta Rule

arXiv

Minkai Xu, Tomas Geffner, Karsten Kreis, Weili Nie, Yilun Xu, Jure Leskovec, Stefano Ermon, Arash Vahdat

April 2025 International Conference on Learning Representations (ICLR)

Energy-based Diffusion Language Models for Text Generation

arXiv

Min Shi, Fuxiao Liu, Shihao Wang, Shijia Liao, Subhashree Radhakrishnan, De-An Huang, Hongxu (Danny) Yin, Karan Sapra, Yaser Yacoob, Humphrey Shi, Bryan Catanzaro, Andrew Tao, Jan Kautz, Guilin Liu, Zhiding Yu

April 2025 International Conference on Learning Representations (ICLR)

Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders

arXiv

Mauro Comi, Alessio Tonioni, Max Yang, Jonathan Tremblay, Valts Blukis, Yijiong Lin, Nathan F. Lepora, Laurence Aitchison

March 2025 International Conference on 3D Vision (3DV)

Snap-it, Tap-it, Splat-it: Tactile-Informed 3D Gaussian Splatting for Reconstructing Challenging Surfaces

arXiv

NVIDIA, Nikita Cherniadev, Johan Bjorck, Fernando Castañeda, Xingye Da, Runyu Ding, Linxi "Jim" Fan, Yu Fang, Dieter Fox, Fengyuan Hu, Spencer Huang, Joel Jang, Zhenyu Jiang, Jan Kautz, Kaushil Kundalia, Lawrence Lao, Zhiqi Li, Zongyu Lin, Kevin Lin, Guilin Liu, Edith Llontop, Loic Magne, Ajay Mandlekar, Avnish Narayan, Soroush Nasiriany, Scott Reed, You Liang Tan, Guanzhi Wang, Zu Wang, Jing Wang, Qi Wang, Jiannan Xiang, Yuqi Xie, Yinzhen Xu, Zhenjia Xu, Seonghyeon Ye, Zhiding Yu, Ao Zhang, Hao Zhang, Yizhou Zhao, Ruijie Zheng, Yuke Zhu

March 2025 ArXiv Preprint

GR00T N1: An Open Foundation Model for Generalist Humanoid Robots

arXiv

Morteza Mardani, Noah Brenowitz, Yair Cohen, Jaideep Pathak, Chieh-Yu Chen, Cheng-Chin Liu, Arash Vahdat, Mohammad Amin Nabian, Tao Ge, Akshay Subramaniam, Karthik Kashinath, Jan Kautz, Mike Pritchard

February 2025 Nature Communications Earth & Environment

Residual corrective diffusion modeling for km-scale atmospheric downscaling

arXiv Website

Ashkan Ganj, Hang Su, Tian Guo

February 2025 IEEE Winter Conference on Applications of Computer Vision (WACV)

HybridDepth: Robust Metric Depth Fusion by Leveraging Depth from Focus and Single-Image Priors

arXiv

Marc T. Law, Karsten Kreis, Haggai Maron

January 2025 Transactions on Machine Learning Research

Directed Graph Generation with Heat Kernels

pdf

Giannis Daras, Weili Nie, Karsten Kreis, Alex Dimakis, Morteza Mardani, Nikola B. Kovachki, Arash Vahdat

December 2024 Advances in Neural Information Processing Systems (NeurIPS)

Warped Diffusion: Solving Video Inverse Problems with Image Diffusion Models

arXiv

An-Chieh Cheng, Hongxu (Danny) Yin, Yang Fu, Qiushan Guo, Ruihan Yang, Jan Kautz, Xiaolong Wang, Sifei Liu

December 2024 Advances in Neural Information Processing Systems (NeurIPS)

SpatialRGPT: Grounded Spatial Reasoning in Vision-Language Models

arXiv

Md Ashiqur Rahman, Robert Joseph George, Mogab Elleithy, Daniel Leibovici, Zongyi Li, Boris Bonev, Colin White, Julius Berner, Raymond A Yeh, Jean Kossaifi, et al.

December 2024 Advances in Neural Information Processing Systems (NeurIPS)

Pretraining Codomain Attention Neural Operators for Solving Multiphysics PDEs

arXiv

Seul Lee, Karsten Kreis, Srimukh Prasad Veccham, Meng Liu, Danny Reidenbach, Saee Paliwal, Arash Vahdat, Weili Nie

December 2024 Advances in Neural Information Processing Systems (NeurIPS)

Molecule Generation with Fragment Retrieval Augmentation

arXiv

Yiming Li, Zehong Wang, Yue Wang, Zhiding Yu, Zan Gojcic, Marco Pavone, Chen Feng, Jose M. Alvarez

December 2024 Advances in Neural Information Processing Systems (NeurIPS)
Spotlight

Memorize What Matters: Emergent Scene Decomposition from Multitraverse

arXiv

Gongfan Fang, Hongxu (Danny) Yin, Saurav Muralidharan, Greg Heinrich, Jeff Pool, Jan Kautz, Pavlo Molchanov, Xinchao Wang

December 2024 Advances in Neural Information Processing Systems (NeurIPS)

MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models

arXiv

Rui Pan, Xiang Liu, Shizhe Diao, Renjie Pi, Jipeng Zhang, Chi Han, Tong Zhang

December 2024 Advances in Neural Information Processing Systems (NeurIPS)

LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning

arXiv

Jiawei Ren, Kevin Xie, Ashkan Mirzaei, Hanxue Liang, Xiaohui Zeng, Karsten Kreis, Ziwei Liu, Antonio Torralba, Sanja Fidler, Seung Wook Kim, Huan Ling

December 2024 Advances in Neural Information Processing Systems (NeurIPS)

L4GM: Large 4D Gaussian Reconstruction Model

arXiv

Fan-Yun Sun, Harini S I, Angela Yi, Yihan Zhou, Alex Zook, Jonathan Tremblay, Logan Cross, Jiajun Wu, Nick Haber

December 2024 Advances in Neural Information Processing Systems (NeurIPS)

FACTORSIM: Generative Simulation via Factorized Representation

arXiv

Omri Avrahami, Rinon Gal, Gal Chechik, Ohad Fried, Dani Lischinski, Arash Vahdat, Weili Nie

December 2024 ACM Transactions on Graphics (SIGGRAPH ASIA)

DiffUHaul: A Training-Free Method for Object Dragging in Images

arXiv Website

Sifei Liu, Shalini De Mello, Jan Kautz

December 2024 Advances in Neural Information Processing Systems (NeurIPS)

Cosine Autoencoder with Extremely Narrow Bottleneck for Image Restoration

Website pdf

Sifei Liu, Shalini De Mello, Jan Kautz

December 2024 Advances in Neural Information Processing Systems (NeurIPS)

CosAE: Learnable Fourier Series for Image Restoration

pdf

Saurav Muralidharan, Sharath Turuvekere Sreenivas, Raviraj Joshi, Marcin Chochowski, Mostofa Patwary, Mohammad Shoeybi, Bryan Catanzaro, Jan Kautz, Pavlo Molchanov

December 2024 Advances in Neural Information Processing Systems (NeurIPS)

Compact Language Models via Pruning and Knowledge Distillation

arXiv

Chao Liu, Weili Nie, Sifei Liu, Abhishek Badki, Hang Su, Morteza Mardani, Benjamin Eckart, Arash Vahdat

December 2024 ACM Transactions on Graphics (SIGGRAPH ASIA)

BlobGEN-3D: Compositional 3D-Consistent Freeview Image Generation with 3D Blobs

pdf

Siyi Gu, Minkai Xu, Alexander S Powers, Weili Nie, Tomas Geffner, Karsten Kreis, Jure Leskovec, Arash Vahdat, Stefano Ermon

December 2024 Advances in Neural Information Processing Systems (NeurIPS)

Aligning Target-Aware Molecule Diffusion Models with Exact Energy Optimization

arXiv

Caelan Reed Garrett, Ajay Mandlekar, Bowen Wen, Dieter Fox

November 2024 Conference on Robot Learning (CoRL)

SkillMimicGen: Automated Demonstration Generation for Efficient Skill Learning and Deployment

arXiv Website

Wentao Yuan, Jiafei Duan, Valts Blukis, Wilbert Pumacay, Ranjay Krishna, Adithyavairavan Murali, Arsalan Mousavian, Dieter Fox

November 2024 Conference on Robot Learning (CoRL)

RoboPoint: A Vision-Language Model for Spatial Affordance Prediction for Robotics

arXiv

Zhenyu Jiang, Yuqi Xie, Jinhan Li, Ye Yuan, Yifeng Zhu, Yuke Zhu

November 2024 Conference on Robot Learning (CoRL)

Harmon: Whole-Body Motion Generation of Humanoid Robots from Language Descriptions

arXiv Website

Huang Huang, Balakumar Sundaralingam, Arsalan Mousavian, Adithyavairavan Murali, Ken Goldberg, Dieter Fox

November 2024 Conference on Robot Learning (CoRL)

DiffusionSeeder: Seeding Motion Optimization with Diffusion for Rapid Motion Planning

arXiv

Adam Fishman, Aaron Walsman, Mohak Bhardwaj, Wentao Yuan, Balakumar Sundaralingam, Byron Boots, Dieter Fox

November 2024 Conference on Robot Learning (CoRL)

Avoid Everything: Model-Free Collision Avoidance with Expert-Guided Fine-Tuning

pdf

Yaozhong Shi, Angela F Gao, Zachary E Ross, Kamyar Azizzadenesheli

October 2024 Transactions on Machine Learning Research

Universal Functional Regression with Neural Operator Flows

arXiv code

Yiming Li, Sihang Li, Xinhao Liu, Moonjun Gong, Kenan Li, Nuo Chen, Zijun Wang, Zhiheng Li, Tao Jiang, Fisher Yu, Yue Wang, Hang Zhao, Zhiding Yu, Chen Feng

October 2024 International Conference on Intelligent Robots and Systems (IROS)

SSCBench: Monocular 3D Semantic Scene Completion Benchmark in Street Views

arXiv Website

Bardienus P. Duisterhof, Mandi Zhao, Yunchao Yao, Jia-Wei Liu, Jenny Seidenschwarz, Mike Zheng Shou, Deva Ramanan, Shuran Song, Stan Birchfield, Bowen Wen, Jeffrey Ichnowski

October 2024 Workshop on the Algorithmic Foundations of Robotics (WAFR)

DeformGS: Scene Flow in Highly Deformable Scenes for Deformable Object Manipulation

arXiv Website

Dejia Xu, Ye Yuan, Morteza Mardani, Sifei Liu, Jiaming Song, Zhangyang Wang, Arash Vahdat

October 2024 Transactions on Machine Learning Research

AGG: Amortized Generative 3D Gaussians for Single Image to 3D

arXiv Website

Gyeongrok Oh, Jaehwan Jeong, Sieun Kim, Wonmin Byeon, Jinkyu Kim, Sungwoong Kim, Sangpil Kim

September 2024 European Conference on Computer Vision (ECCV)

MEVG: Multi-event Video Generation with Text-to-Video Models

arXiv

De-An Huang, Shijia Liao, Subhashree Radhakrishnan, Hongxu (Danny) Yin, Pavlo Molchanov, Zhiding Yu, Jan Kautz

September 2024 European Conference on Computer Vision (ECCV)

LITA: Language Instructed Temporal-localization Assistant

arXiv

Robert Joseph George, Jiawei Zhao, Jean Kossaifi, Zongyi Li, Anima Anandkumar

September 2024 Transactions on Machine Learning Research

Incremental Spatial and Spectral Learning of Neural Operators for Solving Large-Scale PDEs

arXiv

Sahin Lale, Peter I. Renn, Kamyar Azizzadenesheli, Babak Hassibi, Morteza Gharib, Anima Anandkumar

September 2024 npj Robotics

FALCON: Fourier Adaptive Learning and Control for Disturbance Rejection Under Extreme Turbulence

pdf

Ali Hatamizadeh, Jiaming Song, Guilin Liu, Jan Kautz, Arash Vahdat

September 2024 European Conference on Computer Vision (ECCV)

DiffiT: Diffusion Vision Transformers for Image Generation

arXiv Website

Caifeng Zou, Kamyar Azizzadenesheli, Zachary E. Ross, Robert W. Clayton

September 2024 Geophysical Journal International

Deep Neural Helmholtz Operators for 3D Elastic Wave Propagation and Inversion

arXiv

Jiefeng Li, Ye Yuan, Davis Rempe, Haotian Zhang, Pavlo Molchanov, Cewu Lu, Jan Kautz, Umar Iqbal

September 2024 European Conference on Computer Vision (ECCV)

COIN: Control-Inpainting Diffusion Prior for Human and Camera Motion Estimation

arXiv Website

Ekta Prashnani, Koki Nagano, Shalini De Mello, David Luebke, Orazio Gallo

September 2024 European Conference on Computer Vision (ECCV)

Avatar Fingerprinting for Authorized Use of Synthetic Talking-Head Videos

arXiv pdf

Junfei Xiao, Ziqi Zhou, Wenxuan Li, Shiyi Lan, Jieru Mei, Zhiding Yu, Alan Yuille, Yuyin Zhou, Cihang Xie

September 2024 European Conference on Computer Vision (ECCV)

A Semantic Space is Worth 256 Language Descriptions: Make Stronger Segmentation Models with Descriptive Properties

arXiv

Jean Kossaifi, Nikola B. Kovachki, Kamyar Azizzadenesheli, Anima Anandkumar

August 2024 Transactions on Machine Learning Research

Multi-Grid Tensorized Fourier Neural Operator for High-Resolution PDEs

pdf

Seung Hyun Lee, Chanyoung Kim, Wonmin Byeon, Sang Ho Yoon, Jinkyu Kim, Sangpil Kim

August 2024 Computational Visual Media

LISA: Localized Image Stylization with Audio via Implicit Neural Representation

arXiv

David Durst, Vishnu Sarukkai, Brennan Shacklett, Iuri Frosio, Chen Tessler, Joohwan Kim, Carly Wolfbrandt, Gilbert Bernstein, Sanjiban Choudhury, Pat Hanrahan, Kayvon Fatahalian

August 2024 ACM SIGGRAPH / Eurographics Symposium on Computer Animation

Learning to Move Like Professional Counter-Strike Players

arXiv Website

Ziqi Ma, David Pitt, Kamyar Azizzadenesheli, Anima Anandkumar

August 2024 Transactions on Machine Learning Research

Calibrated Uncertainty Quantification for Operator Learning via Conformal Prediction

pdf

Kamyar Azizzadenesheli, William Lu, Anuran Makur, Qian Zhang

July 2024 Transactions on Machine Learning Research

Sparse Contextual CDF Regression

pdf

Ankit Goyal, Valts Blukis, Jie Xu, Yijie Guo, Yu-Wei Chao, Dieter Fox

July 2024 Robotics: Science and Systems (RSS)

RVT-2: Learning Precise Manipulation from Few Demonstrations

arXiv

Seung Hyun Lee, Gyeongrok Oh, Wonmin Byeon, Sang Ho Yoon, Jinkyu Kim, Sangpil Kim

July 2024 Neural Networks

Robust Sound-Guided Image Manipulation

arXiv

Miguel Liu-Schiaffini, Julius Berner, Boris Bonev, Thorsten Kurth, Kamyar Azizzadenesheli, Anima Anandkumar

July 2024 International Conference on Machine Learning (ICML)

Neural Operators with Localized Integral and Differential Kernels

arXiv

Ruisi Cai, Saurav Muralidharan, Greg Heinrich, Hongxu (Danny) Yin, Zhangyang Wang, Jan Kautz, Pavlo Molchanov

July 2024 International Conference on Machine Learning (ICML)
Oral

Flextron: Many-in-One Flexible Large Language Model

arXiv

Jingwei Sun, Ziyue Xu, Hongxu (Danny) Yin, Dong Yang, Daguang Xu, Yudong Liu, Zhixu Du, Yiran Chen, Holger Roth

July 2024 International Conference on Machine Learning (ICML)

FedBPT: Efficient Federated Black-box Prompt Tuning for Large Language Models

arXiv

Minkai Xu, Jiaqi Han, Aaron Lou, Jean Kossaifi, Arvind Ramanathan, Kamyar Azizzadenesheli, Jure Leskovec, Stefano Ermon, Anima Anandkumar

July 2024 International Conference on Machine Learning (ICML)

Equivariant Graph Neural Operator for Modeling 3D Dynamics

arXiv

Shih-Yang Liu, Chien-Yi Wang, Hongxu (Danny) Yin, Yu-Chiang Frank Wang, Kwang-Ting Cheng, Min-Hung Chen

July 2024 International Conference on Machine Learning (ICML)

DoRA: Weight-decomposed Low-rank Adaptation

arXiv

Yilun Xu, Gabriele Corso, Tommi Jaakkola, Arash Vahdat, Karsten Kreis

July 2024 International Conference on Machine Learning (ICML)

DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents

Website pdf code

Weili Nie, Sifei Liu, Morteza Mardani, Chao Liu, Benjamin Eckart, Arash Vahdat

July 2024 International Conference on Machine Learning (ICML)

Compositional Text-to-Image Generation with Dense Blob Representations

arXiv Website

Bingjie Tang, Iretiayo Akinola, Jie Xu, Bowen Wen, Ankur Handa, Karl Van Wyk, Dieter Fox, Gaurav S. Sukhatme, Fabio Ramos, Yashraj Narang

July 2024 Robotics: Science and Systems (RSS)

AutoMate: Specialist and Generalist Assembly Policies over Diverse Geometries

Website pdf

Amirmojtaba Sabour, Sanja Fidler, Karsten Kreis

July 2024 International Conference on Machine Learning (ICML)

Align Your Steps: Optimizing Sampling Schedules in Diffusion Models

arXiv Website

Ji Lin, Hongxu (Danny) Yin, Wei Ping, Yao Lu, Pavlo Molchanov, Andrew Tao, Huizi Mao, Jan Kautz, Mohammad Shoeybi, Song Han

June 2024 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

VILA: On pretraining for vision language models

arXiv

Hongchi Xia, Yang Fu, Sifei Liu, Xiaolong Wang

June 2024 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

RGBD Objects in the Wild: Scaling Real-World 3D Object Learning from RGB-D Videos

arXiv Website

Alex Trevithick, Matthew Chan, Towaki Takikawa, Umar Iqbal, Shalini De Mello, Manmohan Chandraker, Ravi Ramamoorthi, Koki Nagano

June 2024 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Rendering Every Pixel for High-Fidelity Geometry in 3D GANs

arXiv

Qiushan Guo, Shalini De Mello, Hongxu (Danny) Yin, Wonmin Byeon, Ka Chun Cheung, Yizhou Yu, Ping Luo, Sifei Liu

June 2024 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

RegionGPT: Towards Region Understanding Vision Language Model

arXiv

Chulin Xie, De-An Huang, Wenda Chu, Daguang Xu, Chaowei Xiao, Bo Li, Anima Anandkumar

June 2024 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

PerAda: Parameter-Efficient Federated Learning Personalization with Generalization Guarantees

arXiv

Jingbo Wang, Ye Yuan, Zhengyi Luo, Yixuan Li, Bo Dai

June 2024 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

PACER+: On-Demand Pedestrian Animation Controller in Driving Scenarios

arXiv Website

Dongsu Zhang, Francis Williams, Zan Gojcic, Karsten Kreis, Sanja Fidler, Young Min Kim, Amlan Kar

June 2024 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Outdoor Scene Extrapolation with Hierarchical Generative Cellular Automata

arXiv

Yijia Weng, Bowen Wen, Jonathan Tremblay, Valts Blukis, Dieter Fox, Leonidas Guibas, Stan Birchfield

June 2024 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Neural Implicit Representation for Building Digital Twins of Unknown Articulated Objects

arXiv Website

Zhenggang Tang, Zhongzheng Ren, Xiaoming Zhao, Bowen Wen, Jonathan Tremblay, Stan Birchfield, Alex Schwing

June 2024 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

NeRFDeformer: NeRF Transformation from a Single View via 3D Scene Flows

pdf

Mathis Petrovich, Or Litany, Umar Iqbal, Michael J. Black, Gül Varol, Xue Bin Peng, Davis Rempe

June 2024 CVPR Workshop on Human Motion Generation (HuMoGen)

Multi-Track Timeline Control for Text-Driven 3D Human Motion Generation

arXiv

Zhiqi Li, Zhiding Yu, Shiyi Lan, Jiahan Li, Jan Kautz, Tong Lu, Jose M. Alvarez

June 2024 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Is Ego Status All You Need for Open-Loop End-to-End Autonomous Driving?

arXiv Website

Zetong Yang, Zhiding Yu, Christopher Choy, Renhao Wang, Anima Anandkumar, Jose M. Alvarez

June 2024 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Improving Distant 3D Object Detection Using 2D Box Supervision

arXiv

Zhenxin Li, Kailin Li, Shihao Wang, Shiyi Lan, Zhiding Yu, Yishen Ji, Zhiqi Li, Ziyue Zhu, Jan Kautz, Zuxuan Wu, Yu-Gang Jiang, Jose M. Alvarez

June 2024 ArXiv Preprint

Hydra-MDP: End-to-end Multimodal Planning with Multi-target Hydra-Distillation

arXiv

Mengqi Zhang, Yang Fu, Zheng Ding, Sifei Liu, Zhuowen Tu, Xiaolong Wang

June 2024 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

HOIDiffusion: Generating Realistic 3D Hand-Object Interaction Data

arXiv Website

Ye Yuan, Xueting Li, Yangyi Huang, Shalini De Mello, Koki Nagano, Jan Kautz, Umar Iqbal

June 2024 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Highlight

GAvatar: Animatable 3D Gaussian Avatars with Implicit Mesh Learning

arXiv Website

Bowen Wen, Wei Yang, Jan Kautz, Stan Birchfield

June 2024 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Highlight

FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects

arXiv Website

Yufeng Zheng, Xueting Li, Koki Nagano, Sifei Liu, Karsten Kreis, Otmar Hilliges, Shalini De Mello

June 2024 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Dream-in-4D: A Unified Approach for Text- and Image-guided 4D Scene Generation

arXiv Website

Yang Fu, Sifei Liu, Amey Kulkarni, Jan Kautz, Alexei A. Efros, Xiaolong Wang

June 2024 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Highlight

COLMAP-Free 3D Gaussian Splatting

arXiv Website

Roger Waleffe, Wonmin Byeon, Duncan Riach, Brandon Norick, Vijay Korthikanti, Tri Dao, Albert Gu, Ali Hatamizadeh, Sudhakar Singh, Deepak Narayanan, Garvit Kulshreshtha, Vartika Singh, Jared Casper, Jan Kautz, Mohammad Shoeybi, Bryan Catanzaro

June 2024 ArXiv Preprint

An Empirical Study of Mamba-based Language Models

arXiv Website

Mike Ranzinger, Greg Heinrich, Pavlo Molchanov, Jan Kautz

June 2024 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

AM-RADIO: Agglomerative Model - Reduce All Domains Into One

arXiv Website pdf

Huan Ling, Seung Wook Kim, Antonio Torralba, Sanja Fidler, Karsten Kreis

June 2024 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models

arXiv Website

Katja Schwarz, Seung Wook Kim, Jun Gao, Sanja Fidler, Andreas Geiger, Karsten Kreis

May 2024 International Conference on Learning Representations (ICLR)

WildFusion: Learning 3D-Aware Latent Diffusion Models in View Space

arXiv

Helen Zhou, Audrey Huang, Kamyar Azizzadenesheli, David Childers, Zachary Lipton

May 2024 International Conference on Artificial Intelligence and Statistics (AISTATS)

Timing as an Action: Learning When to Observe and Act

pdf

Haque Ishfaq, Qingfeng Lan, Pan Xu, A Rupam Mahmood, Doina Precup, Anima Anandkumar, Kamyar Azizzadenesheli

May 2024 International Conference on Learning Representations (ICLR)

Provable and Practical: Efficient Exploration in Reinforcement Learning via Langevin Monte Carlo

pdf

Yichen Li, Yilun Du, Chao Liu, Francis Williams, Michael Foshey, Benjamin Eckart, Jan Kautz, Joshua B. Tenenbaum, Antonio Torralba, Wojciech Matusik

May 2024 International Conference on Learning Representations (ICLR)

Learning to Jointly Understand Visual and Tactile Signals

pdf

Colin White, Renbo Tu, Jean Kossaifi, Gennady Pekhimenko, Kamyar Azizzadenesheli, Anima Anandkumar

May 2024 International Conference on Learning Representations (ICLR)

Guaranteed Approximation Bounds for Mixed-Precision Neural Operators

pdf

Ali Hatamizadeh, Greg Heinrich, Hongxu (Danny) Yin, Andrew Tao, Jose M. Alvarez, Jan Kautz, Pavlo Molchanov

May 2024 International Conference on Learning Representations (ICLR)

FasterViT: Fast Vision Transformers with Hierarchical Attention

arXiv Website

Sihyun Yu, Weili Nie, De-An Huang, Boyi Li, Jinwoo Shin, Anima Anandkumar

May 2024 International Conference on Learning Representations (ICLR)

Efficient Video Diffusion Models via Content-Frame Motion-Latent Decomposition

arXiv Website

Suyash Bire, Jean Kossaifi, Simone Silvestri, Nikola B. Kovachki, Kamyar Azizzadenesheli, Chris N Hill, Anima Anandkumar

May 2024 ICLR Workshop on Climate Change AI

AI-driven emulation of ocean dynamics on sub-seasonal scales

Morteza Mardani, Jiaming Song, Jan Kautz, Arash Vahdat

May 2024 International Conference on Learning Representations (ICLR)

A Variational Perspective on Solving Inverse Problems with Diffusion Models

arXiv

Yang Fu, Shalini De Mello, Xueting Li, Amey Kulkarni, Jan Kautz, Xiaolong Wang, Sifei Liu

May 2024 International Conference on Learning Representations (ICLR)

3D Reconstruction with Generalizable Neural Fields using Scene Priors

arXiv Website

Kamyar Azizzadenesheli, Nikola B. Kovachki, Zongyi Li, Miguel Liu-Schiaffini, Jean Kossaifi, Anima Anandkumar

April 2024 Nature Reviews Physics

Neural operators for accelerating scientific simulations and design

pdf

Fan-Yun Sun, Jonathan Tremblay, Valts Blukis, Kevin Lin, Danfei Xu, Boris Ivanovic, Peter Karkus, Stan Birchfield, Dieter Fox, Ruohan Zhang, Yunzhu Li, Jiajun Wu, Marco Pavone, Nick Haber

March 2024 International Conference on 3D Vision (3DV)
Spotlight

Partial-View Object View Synthesis via Filtered Inversion

arXiv Website

Muhammed Kocabas, Ye Yuan, Pavlo Molchanov, Yunrong Guo, Michael Black, Otmar Hilliges, Jan Kautz, Umar Iqbal

March 2024 International Conference on 3D Vision (3DV)
Spotlight

PACE: Human and Camera Motion Estimation from in-the-wild Videos

arXiv

Qian Zhang, Anuran Makur, Kamyar Azizzadenesheli

March 2024 Transactions on Machine Learning Research

Functional Linear Regression of Cumulative Distribution Functions

pdf

Daniel Lichy, Hang Su, Abhishek Badki, Jan Kautz, Orazio Gallo

March 2024 International Conference on 3D Vision (3DV)
Oral

FoVA-Depth: Field-of-View Agnostic Depth Estimation for Cross-Dataset Generalization

arXiv Website

Yaozhong Shi, Grigorios Lavrentiadis, Domniki Asimaki, Zachary E Ross, Kamyar Azizzadenesheli

March 2024 Bulletin of the Seismological Society of America

Broadband ground motion synthesis via generative adversarial neural operators: Development and validation

pdf

Nikola B Kovachki, Samuel Lanthaler, Andrew M Stuart

February 2024 Handbook of Numerical Analysis

Operator Learning: Algorithms and Analysis

arXiv

Ashkan Ganj, Yiqin Zhao, Hang Su, Tian Guo

February 2024 International Workshop on Mobile Computing Systems and Applications

Mobile AR Depth Estimation: Challenges & Prospects

arXiv

E. Alvarez Fanjul, S. Ciliberti, J. Pearlman, K. Wilmer-Becker, P. Bahurel, F. Ardhuin, A. Arnaud, K. Azizzadenesheli, others

January 2024 Frontiers in Marine Science

Promoting best practices in ocean forecasting through an Operational Readiness Level

pdf

Kaushik Bhattacharya, Nikola B. Kovachki, Aakila Rajan, Andrew M Stuart, Margaret Trautner

January 2024 SIAM Journal on Numerical Analysis

Learning Homogenization For Elliptic Operators

arXiv

Colin White, Julius Berner, Jean Kossaifi, Mogab Elleithy, David Pitt, Daniel Leibovici, Zongyi Li, Kamyar Azizzadenesheli, Anima Anandkumar

December 2023 Symbiosis of Deep Learning and Differential Equations III, Neural Information Processing Systems (NeurIPS)

Physics-informed neural operators with exact differentiation on arbitrary geometries

pdf

Hongyu Sun, Zachary E Ross, Weiqiang Zhu, Kamyar Azizzadenesheli

December 2023 Geophysical Research Letters

Phase Neural Operator for Multi-Station Picking of Seismic Arrivals

pdf code project

Zongyi Li, Nikola B. Kovachki, Christopher Choy, Boyi Li, Jean Kossaifi, Shourya Otta, Mohammad Amin Nabian, Maximilian Stadler, Christian Hundt, Kamyar Azizzadenesheli, Anima Anandkumar

December 2023 Advances in Neural Information Processing Systems (NeurIPS)

Geometry-informed neural operator for large-scale 3D PDEs

pdf

Xueting Li, Shalini De Mello, Sifei Liu, Koki Nagano, Umar Iqbal, Jan Kautz

December 2023 Advances in Neural Information Processing Systems (NeurIPS)

Generalizable One-shot Neural Head Avatar

arXiv

Jimmy T.H. Smith, Shalini De Mello, Jan Kautz, Scott Linderman, Wonmin Byeon

December 2023 Advances in Neural Information Processing Systems (NeurIPS)

Convolutional State Space Models for Long-Range Spatiotemporal Modeling

arXiv pdf

Ajay Mandlekar, Soroush Nasiriany, Bowen Wen, Iretiayo Akinola, Yashraj Narang, Linxi Fan, Yuke Zhu, Dieter Fox

November 2023 Conference on Robot Learning (CoRL)

MimicGen: A Data Generation System for Scalable Robot Learning using Human Demonstrations

arXiv Website

Weixuan Sun, Zhen Qin, Hui Deng, Jianyuan Wang, Yi Zhang, Kaihao Zhang, Nick Barnes, Stan Birchfield, Lingpeng Kong, Yiran Zhong

October 2023 IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

Vicinity Vision Transformer

pdf

Yujin Jeong, Won Jeong Ryoo, Seung Hyun Lee, Da Bin Seo, Wonmin Byeon, Sangpil Kim, Jinkyu Kim

October 2023 IEEE International Conference on Computer Vision (ICCV)

The Power of Sound (TPoS): Audio Reactive Video Generation with Stable Diffusion

arXiv

Tianshi Cao, Karsten Kreis, Sanja Fidler, Nicholas Sharp, Kangxue Yin

October 2023 IEEE International Conference on Computer Vision (ICCV)

TexFusion: Synthesizing 3D Textures with Text-Guided Image Diffusion Models

arXiv Website

Batu Ozturkler, Chao Liu, Benjamin Eckart, Morteza Mardani, Jiaming Song, Jan Kautz

October 2023 International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI)

SMRD: SURE-based Robust MRI Reconstruction with Diffusion Models

arXiv

Umar Iqbal, Akin Caliskan, Koki Nagano, Sameh Khamis, Pavlo Molchanov, Jan Kautz

October 2023 IEEE International Conference on Computer Vision (ICCV)

RANA: Relightable and Articulated Neural Avatars

arXiv Website

Ye Yuan, Jiaming Song, Umar Iqbal, Arash Vahdat, Jan Kautz

October 2023 IEEE International Conference on Computer Vision (ICCV)
Oral

PhysDiff: Physics-Guided Human Motion Diffusion Model

arXiv Website

Jingbo Wang, Ye Yuan, Zhengyi Luo, Kevin Xie, Dahua Lin, Umar Iqbal, Sanja Fidler, Sameh Khamis

October 2023 IEEE International Conference on Computer Vision (ICCV)

Learning Human Dynamics in Autonomous Driving Scenarios

pdf

Andrew Guo, Bowen Wen, Jianhe Yuan, Jonathan Tremblay, Stephen Tyree, Jeff Smith, Stan Birchfield

October 2023 International Conference on Intelligent Robots and Systems (IROS)

HANDAL: A Dataset of Real-World Manipulable Object Categories with Pose Annotations, Affordances, and Reconstructions

arXiv Website

Eric Ryan Chan, Koki Nagano, Jeong Joon Park, Matthew Chan, Alexander William Bergman, Axel Levy, Miika Aittala, Shalini De Mello, Tero Karras, Gordon Wetzstein

October 2023 IEEE International Conference on Computer Vision (ICCV)
Oral

Generative Novel View Synthesis with 3D-Aware Diffusion Models

arXiv Website

Bingyin Zhao, Zhiding Yu, Shiyi Lan, Yutao Cheng, Anima Anandkumar, Yingjie Lao, Jose M. Alvarez

October 2023 IEEE International Conference on Computer Vision (ICCV)

Fully Attentional Networks with Self-emerging Token Labeling

arXiv pdf

Yilun Chen, Zhiding Yu, Yukang Chen, Shiyi Lan, Anima Anandkumar, Jiaya Jia, Jose M. Alvarez

October 2023 IEEE International Conference on Computer Vision (ICCV)

FocalFormer3D: Focusing on Hard Instance for 3D Object Detection

arXiv Website

Zhiqi Li, Zhiding Yu, Wenhai Wang, Anima Anandkumar, Tong Lu, Jose M. Alvarez

October 2023 IEEE International Conference on Computer Vision (ICCV)

FB-BEV: BEV Representation from Forward-Backward View Transformations

arXiv Website

Yanwei Li, Zhiding Yu, Jonah Philion, Anima Anandkumar, Sanja Fidler, Jiaya Jia, Jose M. Alvarez

October 2023 IEEE International Conference on Computer Vision (ICCV)

End-to-end 3D Tracking with Decoupled Queries

arXiv Website pdf

Daiqing Li, Huan Ling, Amlan Kar, David Acuna, Seung Wook Kim, Karsten Kreis, Antonio Torralba, Sanja Fidler

October 2023 IEEE International Conference on Computer Vision (ICCV)

DreamTeacher: Pretraining Image Backbones with Deep Generative Models

arXiv Website

Tim Dockhorn, Tianshi Cao, Arash Vahdat, Karsten Kreis

October 2023 Transactions on Machine Learning Research

Differentially Private Diffusion Models

arXiv pdf

Connor Lin, Koki Nagano, Jan Kautz, Eric Chan, Umar Iqbal, Leonidas Guibas, Gordon Wetzstein, Sameh Khamis

August 2023 ACM Transactions on Graphics (SIGGRAPH)

Single-Shot Implicit Morphable Faces with Consistent Texture Parameterization

arXiv Website

Alexander Trevithick, Matthew Chan, Michael Stengel, Eric Ryan Chan, Chao Liu, Zhiding Yu, Sameh Khamis, Manmohan Chandraker, Ravi Ramamoorthi, Koki Nagano

August 2023 ACM Transactions on Graphics (SIGGRAPH)

Real-Time Radiance Fields for Single-Image Portrait View Synthesis

arXiv Website

Haotian Zhang, Ye Yuan, Viktor Makoviychuk, Yunrong Guo, Sanja Fidler, Xue Bin Peng, Kayvon Fatahalian

August 2023 ACM Transactions on Graphics (SIGGRAPH)
Best paper honorable mention

Learning Physically Simulated Tennis Players from Broadcast Videos

arXiv

Jiaming Song, Qinsheng Zhang, Hongxu (Danny) Yin, Morteza Mardani, Ming-Yu Liu, Jan Kautz, Yongxin Chen, Arash Vahdat

July 2023 International Conference on Machine Learning (ICML)

Loss-Guided Diffusion Models for Plug-and-Play Controllable Generation

pdf

Guan-Horng Liu, Arash Vahdat, De-An Huang, Evangelos Theodorou, Weili Nie, Anima Anandkumar

July 2023 International Conference on Machine Learning (ICML)

I²SB: Image-to-Image Schrödinger Bridge

arXiv Website

Ali Hatamizadeh, Hongxu (Danny) Yin, Greg Heinrich, Jan Kautz, Pavlo Molchanov

July 2023 International Conference on Machine Learning (ICML)

Global Context Vision Transformers

arXiv Website

Hongkai Zheng, Weili Nie, Arash Vahdat, Kamyar Azizzadenesheli, Anima Anandkumar

July 2023 International Conference on Machine Learning (ICML)

Fast Sampling of Diffusion Models via Operator Learning

arXiv

Jiashun Wang, Xueting Li, Sifei Liu, Shalini De Mello, Orazio Gallo, Xiaolong Wang, Jan Kautz

June 2023 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Zero-shot Pose Transfer for Unrigged Stylized 3D Characters

pdf

Yiming Li, Zhiding Yu, Christopher Choy, Chaowei Xiao, Jose M Alvarez, Sanja Fidler, Chen Feng, Anima Anandkumar

June 2023 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Highlight

VoxFormer: Sparse voxel transformer for camera-based 3D semantic scene completion

arXiv code

Shiyi Lan, Xitong Yang, Zhiding Yu, Zuxuan Wu, Jose M Alvarez, Anima Anandkumar

June 2023 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Vision Transformers Are Good Mask Auto-Labelers

arXiv code

Taeyeop Lee, Jonathan Tremblay, Valts Blukis, Bowen Wen, Byeong-Uk Lee, Inkyu Shin, Stan Birchfield, In So Kweon, Kuk-Jin Yoon

June 2023 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

TTA-COPE: Test-Time Adaptation for Category-Level Object Pose Estimation

arXiv

Davis Rempe, Zhengyi Luo, Xue Bin Peng, Ye Yuan, Kris Kitani, Karsten Kreis, Sanja Fidler, Or Litany

June 2023 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Trace and Pace: Controllable Pedestrian Animation via Guided Trajectory Diffusion

arXiv Website

Iuri Frosio, Jan Kautz

June 2023 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

The Best Defense is a Good Offense: Adversarial Augmentation Against Adversarial Attacks

arXiv Website

Paul Micaelli, Pavlo Molchanov, Arash Vahdat, Hongxu (Danny) Yin, Jan Kautz

June 2023 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Recurrence without Recurrence: Stable Video Landmark Detection with Deep Equilibrium Models

arXiv

Jiarui Xu, Sifei Liu, Arash Vahdat, Wonmin Byeon, Xiaolong Wang, Shalini De Mello

June 2023 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Highlight

Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models

arXiv

Valts Blukis, Taeyeop Lee, Jonathan Tremblay, Bowen Wen, In So Kweon, Kuk-Jin Yoon, Dieter Fox, Stan Birchfield

June 2023 CVPR Workshop on Advances in NeRF for the Metaverse (XRNeRF)

One-Shot Neural Fields for 3D Object Understanding

arXiv

Seung Wook Kim, Bradley Brown, Kangxue Yin, Karsten Kreis, Katja Schwarz, Daiqing Li, Robin Rombach, Antonio Torralba, Sanja Fidler

June 2023 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

NeuralField-LDM: Scene Generation with Hierarchical Latent Diffusion Models

arXiv Website

Chen-Hsuan Lin, Jun Gao, Luming Tang, Towaki Takikawa, Xiaohui Zeng, Xun Huang, Karsten Kreis, Sanja Fidler, Ming-Yu Liu, Tsung-Yi Lin

June 2023 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Magic3D: High-Resolution Text-to-3D Content Creation

arXiv Website

Divyam Madaan, Hongxu (Danny) Yin, Wonmin Byeon, Jan Kautz, Pavlo Molchanov

June 2023 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Highlight

Heterogeneous Continual Learning

pdf

Huanrui Yang, Hongxu (Danny) Yin, Maying Shen, Pavlo Molchanov, Hai Li, Jan Kautz

June 2023 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Global Vision Transformer Pruning with Hessian-Aware Saliency

arXiv

Alessandro Ruzzi, Xiangwei Shi, Xi Wang, Gengyan Li, Shalini De Mello, Hyung Jin Chang, Xucong Zhang, Otmar Hilliges

June 2023 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

GazeNeRF: 3D-Aware Gaze Redirection with Neural Radiance Fields

arXiv

Zhiqi Li, Zhiding Yu, David Austin, Mingsheng Fang, Shiyi Lan, Jan Kautz, Jose M. Alvarez

June 2023 CVPR Workshop on End-to-end Autonomous Driving

FB-OCC: 3D Occupancy Prediction based on Forward-Backward View Transformation

arXiv Website

Wei Dong, Christopher Choy, Charles Loop, Or Litany, Yuke Zhu

June 2023 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Fast Monocular Scene Reconstruction with Global-Sparse Local-Dense Grids

arXiv

Bowen Wen, Jonathan Tremblay, Valts Blukis, Stephen Tyree, Thomas Müller, Alex Evans, Dieter Fox, Jan Kautz, Stan Birchfield

June 2023 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

BundleSDF: Neural 6-DoF Tracking and 3D Reconstruction of Unknown Objects

arXiv Website video

Jason Clemons, Iuri Frosio, Maying Shen, Jose M. Alvarez, Stephen W. Keckler

June 2023 IEEE Intelligent Vehicles Symposium

Augmenting Legacy Networks for Flexible Inference

pdf

Andreas Blattmann, Robin Rombach, Huan Ling, Tim Dockhorn, Seung Wook Kim, Sanja Fidler, Karsten Kreis

June 2023 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models

arXiv Website

Yufei Ye, Xueting Li, Abhinav Gupta, Shalini De Mello, Stan Birchfield, Jiaming Song, Shubham Tulsiani, Sifei Liu

June 2023 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Affordance Diffusion: Synthesizing Hand-Object Interactions

arXiv Website video

Zhenggang Tang, Balakumar Sundaralingam, Jonathan Tremblay, Bowen Wen, Ye Yuan, Stephen Tyree, Charles Loop, Alexander Schwing, Stan Birchfield

May 2023 International Conference on Robotics and Automation (ICRA)

RGB-Only Reconstruction of Tabletop Scenes for Collision-Free Manipulator Control

arXiv Website video

Jiaming Song, Arash Vahdat, Morteza Mardani, Jan Kautz

May 2023 International Conference on Learning Representations (ICLR)

Pseudoinverse-Guided Diffusion Models for Inverse Problems

Ishika Singh, Valts Blukis, Arsalan Mousavian, Ankit Goyal, Danfei Xu, Jonathan Tremblay, Dieter Fox, Jesse Thomason, Animesh Garg

May 2023 International Conference on Robotics and Automation (ICRA)

PROGPROMPT: Generating Situated Robot Task Plans using Large Language Models

arXiv Website

Yunzhi Lin, Thomas Müller, Jonathan Tremblay, Bowen Wen, Stephen Tyree, Alex Evans, Patricio A. Vela, Stan Birchfield

May 2023 International Conference on Robotics and Automation (ICRA)

Parallel Inversion of Neural Radiance Fields for Robust Pose Estimation

arXiv Website video

Chao Liu, Benjamin Eckart, Jan Kautz

May 2023 International Conference on Robotics and Automation (ICRA)

Online Consistent Video Depth using Continuous Geometric Representations

pdf

Chenhongyi Yang, Jiarui Xu, Shalini De Mello, Elliot J. Crowley, Xiaolong Wang

May 2023 International Conference on Learning Representations (ICLR)

GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group Propagation

arXiv

Xuan Su, Jiaming Song, Chenlin Meng, Stefano Ermon

May 2023 International Conference on Learning Representations (ICLR)

Dual Diffusion Implicit Bridges for Image-to-Image Translation

arXiv

Shikun Liu, Linxi Fan, Edward Johns, Zhiding Yu, Chaowei Xiao, Anima Anandkumar

March 2023 ArXiv Preprint

Prismer: A Vision-Language Model with An Ensemble of Experts

arXiv Website code demo

Zhuolin Yang, Wei Ping, Zihan Liu, Vijay Korthikanti, Weili Nie, De-An Huang, Linxi Fan, Zhiding Yu, Shiyi Lan, Bo Li, others

February 2023 ArXiv Preprint

Re-ViLM: Retrieval-Augmented Visual Language Model for Zero and Few-Shot Image Captioning

arXiv

Manli Shu, Weili Nie, De-An Huang, Zhiding Yu, Tom Goldstein, Anima Anandkumar, Chaowei Xiao

December 2022 Advances in Neural Information Processing Systems (NeurIPS)

Test-time prompt tuning for zero-shot generalization in vision-language models

arXiv website code

Maying Shen, Hongxu (Danny) Yin, Pavlo Molchanov, Lei Mao, Jianna Liu, Jose M Alvarez

December 2022 Advances in Neural Information Processing Systems (NeurIPS)

Structural Pruning via Latency-Saliency Knapsack

arXiv Website

De-An Huang, Zhiding Yu, Anima Anandkumar

December 2022 Advances in Neural Information Processing Systems (NeurIPS)

MinVIS: A minimal video instance segmentation framework without video-based training

arXiv code

Yann Labbé, Lucas Manuelli, Arsalan Mousavian, Stephen Tyree, Stan Birchfield, Jonathan Tremblay, Justin Carpentier, Mathieu Aubry, Dieter Fox, Josef Sivic

December 2022 Conference on Robot Learning (CoRL)

MegaPose: 6D Pose Estimation of Novel Objects via Render and Compare

arXiv Website video openreview

Divyansh Garg, Skanda Vaidyanath, Kuno Kim, Jiaming Song, Stefano Ermon

December 2022 Advances in Neural Information Processing Systems (NeurIPS)

LISA: Learning Interpretable Skill Abstractions from Language

arXiv

Xiaohui Zeng, Arash Vahdat, Francis Williams, Zan Gojcic, Or Litany, Sanja Fidler, Karsten Kreis

December 2022 Advances in Neural Information Processing Systems (NeurIPS)

LION: Latent Point Diffusion Models for 3D Shape Generation

arXiv Website

Tim Dockhorn, Arash Vahdat, Karsten Kreis

December 2022 Advances in Neural Information Processing Systems (NeurIPS)

GENIE: Higher-Order Denoising Diffusion Solvers

arXiv Website

Zhengyi Luo, Shun Iwase, Ye Yuan, Kris Kitani

December 2022 Advances in Neural Information Processing Systems (NeurIPS)

Embodied Scene-aware Human Pose Estimation

arXiv Website video

Bahjat Kawar, Michael Elad, Stefano Ermon, Jiaming Song

December 2022 Advances in Neural Information Processing Systems (NeurIPS)

Denoising Diffusion Restoration Models

arXiv Website code

Chenlin Meng, Kristy Choi, Jiaming Song, Stefano Ermon

December 2022 Advances in Neural Information Processing Systems (NeurIPS)

Concrete Score Matching: Generalized Score Matching for Discrete Data

arXiv

Eugene Vorontsov, Pavlo Molchanov, Matej Gazda, Christopher Beckham, Jan Kautz, Samuel Kadoury

November 2022 Medical Image Analysis

Towards Annotation-efficient Segmentation via Image-to-image Translation

arXiv

Seung Hyun Lee, Gyeongrok Oh, Wonmin Byeon, Jihyun Bae, Chanyoung Kim, Won Jeong Ryoo, Sang Ho Yoon, Jinkyu Kim, Sangpil Kim

October 2022 European Conference on Computer Vision (ECCV)

Sound-Guided Semantic Video Generation

arXiv

Xueting Li, Sifei Liu, Xiaolong Wang, Ming-Hsuan Yang, Alyosha Efros

October 2022 European Conference on Computer Vision (ECCV)

Scraping Textures from Natural Images for Synthesis and Editing

Website pdf video

Jonathan Tremblay, Moustafa Meshry, Alex Evans, Jan Kautz, Alexander Keller, Sameh Khamis, Thomas Müller, Charles Loop, Nathan Morrical, Koki Nagano, Towaki Takikawa, Stan Birchfield

October 2022 ECCV Workshop on Learning to Generate 3D Shapes and Scenes

RTMV: A Ray-Traced Multi-View Synthetic Dataset for Novel View Synthesis

arXiv Website video

Xin Dong, Hongxu (Danny) Yin, Jose M Alvarez, Jan Kautz, Pavlo Molchanov

October 2022 British Machine Vision Conference (BMVC)

Privacy Vulnerability of Split Computing to Data-Free Model Inversion Attacks

arXiv

Zian Wang, Wenzheng Chen, David Acuna, Jan Kautz, Sanja Fidler

October 2022 European Conference on Computer Vision (ECCV)

Neural Light Field Estimation for Outdoor Scenes with Differentiable Virtual Object Insertion

arXiv

Xun Huang, Arun Mallya, Ting-Chun Wang, Ming-Yu Liu

October 2022 European Conference on Computer Vision (ECCV)

Multimodal Conditional Image Synthesis with Product-of-Experts GANs

arXiv Website

Pavlo Molchanov, Jimmy Hall, Hongxu (Danny) Yin, Jan Kautz, Nicolo Fusi, Arash Vahdat

October 2022 European Conference on Computer Vision (ECCV)

LANA: Latency Aware Network Acceleration

arXiv

An-Chieh Cheng, Xueting Li, Sifei Liu, Min Sun, Ming-Hsuan Yang

October 2022 European Conference on Computer Vision (ECCV)

Autoregressive 3D shape generation via canonical mapping

arXiv

Jinxing Zhou, Jianyuan Wang, Jiayi Zhang, Weixuan Sun, Jing Zhang, Stan Birchfield, Dan Guo, Lingpeng Kong, Meng Wang, Yiran Zhong

October 2022 European Conference on Computer Vision (ECCV)

Audio-Visual Segmentation

arXiv code website

Stephen Tyree, Jonathan Tremblay, Thang To, Jia Cheng, Terry Mosier, Jeff Smith, Stan Birchfield

October 2022 International Conference on Intelligent Robots and Systems (IROS)

6-DoF Pose Estimation of Household Objects for Robotic Manipulation: An Accessible Dataset and Benchmark

arXiv website

Guilin Liu, Aysegul Dundar, Kevin J Shih, Ting-Chun Wang, Fitsum A Reda, Karan Sapra, Zhiding Yu, Xiaodong Yang, Andrew Tao, Bryan Catanzaro

September 2022 IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

Partial Convolution for Padding, Inpainting, and Image Synthesis

pdf

Daquan Zhou, Zhiding Yu, Enze Xie, Chaowei Xiao, Animashree Anandkumar, Jiashi Feng, Jose M Alvarez

July 2022 International Conference on Machine Learning (ICML)

Understanding The Robustness in Vision Transformers

arXiv code

Jiahao Su, Wonmin Byeon, Furong Huang

July 2022 International Conference on Machine Learning (ICML)

Scaling-up Diverse Orthogonal Convolutional Networks with a Paraunitary Framework

arXiv

Weili Nie, Brandon Guo, Yujia Huang, Chaowei Xiao, Arash Vahdat, Anima Anandkumar

July 2022 International Conference on Machine Learning (ICML)

Diffusion Models for Adversarial Purification

arXiv Website

Maying Shen, Pavlo Molchanov, Hongxu (Danny) Yin, Jose M Alvarez

June 2022 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

When to Prune? A Policy towards Early Structural Pruning

arXiv

Atsuhiro Noguchi, Umar Iqbal, Jonathan Tremblay, Tatsuya Harada, Orazio Gallo

June 2022 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Watch It Move: Unsupervised Discovery of 3D Joints for Re-Posing of Articulated Objects

arXiv Website

Seung Hyun Lee, Wonseok Roh, Wonmin Byeon, Sang Ho Yoon, Chan Young Kim, Jinkyu Kim, Sangpil Kim

June 2022 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Sound-Guided Semantic Image Manipulation

arXiv Website

Benjamin Wu, Oliver Hennigh, Jan Kautz, Sanjay Choudhry, Wonmin Byeon

June 2022 International Conference on Computational Science (ICCS)

Physics Informed RNN-DCT Networks for Time-Dependent Partial Differential Equations

arXiv

Zhiqi Li, Wenhai Wang, Enze Xie, Zhiding Yu, Anima Anandkumar, Jose M Alvarez, Ping Luo, Tong Lu

June 2022 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Panoptic SegFormer: Delving deeper into panoptic segmentation with transformers

arXiv code

Jiarui Xu, Shalini De Mello, Sifei Liu, Wonmin Byeon, Thomas Breuel, Jan Kautz, Xiaolong Wang

June 2022 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

GroupViT: Semantic Segmentation Emerges From Text Supervision

video

Ali Hatamizadeh, Hongxu (Danny) Yin, Holger R Roth, Wenqi Li, Jan Kautz, Daguang Xu, Pavlo Molchanov

June 2022 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

GradViT: Gradient Inversion of Vision Transformers

arXiv Website

Ye Yuan, Umar Iqbal, Pavlo Molchanov, Kris Kitani, Jan Kautz

June 2022 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Oral

GLAMR: Global Occlusion-Aware Human Mesh Recovery with Dynamic Cameras

arXiv Website

Xinlong Wang, Zhiding Yu, Shalini De Mello, Jan Kautz, Animashree Anandkumar, Chunhua Shen, Jose M. Alvarez

June 2022 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

FreeSOLO: Learning to Segment Objects without Annotations

arXiv code

Eric Chan, Connor Z. Lin, Matthew A. Chan, Koki Nagano, Boxiao Pan, Shalini De Mello, Orazio Gallo, Leonidas J. Guibas, Jonathan Tremblay, Sameh Khamis, Tero Karras, Gordon Wetzstein

June 2022 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Efficient Geometry-aware 3D Generative Adversarial Networks

arXiv Website

Jiteng Mu, Shalini De Mello, Zhiding Yu, Nuno Vasconcelos, Xiaolong Wang, Jan Kautz, Sifei Liu

June 2022 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

CoordGAN: Self-Supervised Dense Correspondences Emerge from GANs

video

Hongxu (Danny) Yin, Arash Vahdat, Jose M. Alvarez, Arun Mallya, Jan Kautz, Pavlo Molchanov

June 2022 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

A-ViT: Adaptive Tokens for Efficient Vision Transformer

arXiv Website

Yunzhi Lin, Jonathan Tremblay, Stephen Tyree, Patricio A. Vela, Stan Birchfield

May 2022 International Conference on Robotics and Automation (ICRA)

Single-Stage Keypoint-Based Category-Level Object Pose Estimation from an RGB Image

arXiv website

Alexey Kamenev, Lirui Wang, Ollin Boer Bohan, Ishwar Kulkarni, Bilal Kartal, Artem Molchanov, Stan Birchfield, David Nistér, Nikolai Smolyanskiy

May 2022 International Conference on Robotics and Automation (ICRA)

PredictionNet: Real-Time Joint Probabilistic Traffic Prediction for Planning, Control, and Simulation

arXiv video

Yunzhi Lin, Jonathan Tremblay, Stephen Tyree, Patricio A. Vela, Stan Birchfield

May 2022 International Conference on Robotics and Automation (ICRA)

Keypoint-Based Category-Level Object Pose Tracking from an RGB Sequence with Uncertainty Estimation

arXiv website

Yiran Zhong, Charles Loop, Wonmin Byeon, Stan Birchfield, Yuchao Dai, Kaihao Zhang, Alexey Kamenev, Thomas Breuel, Hongdong Li, Jan Kautz

May 2022 International Journal of Computer Vision (IJCV)

Displacement-Invariant Cost Computation for Efficient Stereo Matching

arXiv pdf

Zhisheng Xiao, Karsten Kreis, Arash Vahdat

April 2022 International Conference on Learning Representations (ICLR)

Tackling the Generative Learning Trilemma with Denoising Diffusion GANs

arXiv Website

Tim Dockhorn, Arash Vahdat, Karsten Kreis

April 2022 International Conference on Learning Representations (ICLR)

Score-Based Generative Modeling with Critically-Damped Langevin Diffusion

arXiv Website

Enze Xie, Zhiding Yu, Daquan Zhou, Jonah Philion, Anima Anandkumar, Sanja Fidler, Ping Luo, Jose M Alvarez

April 2022 ArXiv Preprint

M²BEV: Multi-camera joint 3D detection and segmentation with unified birds-eye view representation

arXiv website

Taihong Xiao, Sifei Liu, Shalini De Mello, Zhiding Yu, Jan Kautz, Ming-Hsuan Yang

March 2022 International Journal of Computer Vision (IJCV)

Learning contrastive representation for semantic correspondence

arXiv

Amit Raj, Umar Iqbal, Koki Nagano, Sameh Khamis, Pavlo Molchanov, James Hays, Jan Kautz

March 2022 ArXiv Preprint

DRaCoN – Differentiable Rasterization Conditioned Neural Radiance Fields for Articulated Avatars

arXiv Website

Ben Wu, Chao Liu, Benjamin Eckart, Jan Kautz

February 2022 AAAI Conference on Artificial Intelligence (AAAI)

Neural Interferometry: Image Reconstruction from Astronomical Interferometers using Transformer-Conditioned Neural Fields

pdf

Ali Hatamizadeh, Hongxu (Danny) Yin, Pavlo Molchanov, Andriy Myronenko, Wenqi Li, Prerna Dogra, Andrew Feng, Mona G Flores, Jan Kautz, Daguang Xu, others

February 2022 ArXiv Preprint

Do Gradient Inversion Attacks Make Federated Learning Unsafe?

arXiv

Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M Alvarez, Ping Luo

December 2021 Advances in Neural Information Processing Systems (NeurIPS)

SegFormer: Simple and efficient design for semantic segmentation with transformers

arXiv code video demo tutorial zhihu

Arash Vahdat, Karsten Kreis, Jan Kautz

December 2021 Advances in Neural Information Processing Systems (NeurIPS)

Score-based Generative Modeling in Latent Space

arXiv Website

Umar Iqbal, Kevin Xie, Yunrong Guo, Jan Kautz, Pavlo Molchanov

December 2021 International Conference on 3D Vision (3DV)

KAMA: 3D Keypoint Aware Body Mesh Articulation

arXiv video

Gal Dalal, Assaf Hallak, Steven Dalton, Iuri Frosio, Shie Mannor, Gal Chechik

December 2021 Advances in Neural Information Processing Systems (NeurIPS)

Improve Agents without Retraining: Parallel Tree Search with Off-Policy Correction

arXiv

Tianshi Cao, Alex Bie, Arash Vahdat, Sanja Fidler, Karsten Kreis

December 2021 Advances in Neural Information Processing Systems (NeurIPS)

Don't Generate Me: Training Differentially Private Generative Models with Sinkhorn Divergence

arXiv Website

Zhiding Yu, Rui Huang, Wonmin Byeon, Sifei Liu, Guilin Liu, Thomas Breuel, Anima Anandkumar, Jan Kautz

December 2021 Advances in Neural Information Processing Systems (NeurIPS)

Coupled Segmentation and Edge Learning via Dynamic Graph Propagation

pdf

Weili Nie, Arash Vahdat, Anima Anandkumar

December 2021 Advances in Neural Information Processing Systems (NeurIPS)

Controllable and Compositional Generation with Latent-Space Energy-Based Models

arXiv Website

Jyoti Aneja, Alexander Schwing, Jan Kautz, Arash Vahdat

December 2021 Advances in Neural Information Processing Systems (NeurIPS)

A Contrastive Learning Approach for Training Variational Autoencoder Priors

arXiv

Ekta Prashnani, Orazio Gallo, Joohwan Kim, Josef B. Spjut, Pradeep Sen, Iuri Frosio

November 2021 British Machine Vision Conference (BMVC)

Noise-Aware Video Saliency Prediction

arXiv video

Xitong Yang, Xiaodong Yang, Sifei Liu, Deqing Sun, Larry Davis, Jan Kautz

November 2021 British Machine Vision Conference (BMVC)

Hierarchical Contrastive Motion Learning for Video Action Recognition

arXiv

Aayush Prakash, Shoubhik Debnath, Jean-Francois Lafleche, Eric Cameracci, Gavriel State, Stan Birchfield, Marc T. Law

October 2021 IEEE International Conference on Computer Vision (ICCV)

Self-Supervised Real-to-Sim Scene Generation

arXiv project

Siva Karthik Mustikovela, Shalini De Mello, Aayush Prakash, Umar Iqbal, Sifei Liu, Thu Nguyen-Phuoc, Carsten Rother, Jan Kautz

October 2021 IEEE International Conference on Computer Vision (ICCV)

Self-Supervised Object Detection via Generative Image Synthesis

arXiv

Abdulrahman Mahmoud, Siva Kumar Sastry Hari, Christopher W Fletcher, Sarita V Adve, Charbel Sakr, Naresh Shanbhag, Pavlo Molchanov, Michael B Sullivan, Timothy Tsai, Stephen W Keckler

October 2021 International Symposium on Software Reliability Engineering (ISSRE)

Optimizing Selective Protection for CNN Resilience

pdf

Huanrui Yang, Hongxu (Danny) Yin, Pavlo Molchanov, Hai Li, Jan Kautz

October 2021 ArXiv Preprint

NViT: Vision Transformer Compression and Parameter Redistribution

arXiv

Zian Wang, Jonah Philion, Sanja Fidler, Jan Kautz

October 2021 IEEE International Conference on Computer Vision (ICCV)
Oral

Learning Indoor Inverse Rendering with 3D Spatially-Varying Lighting

arXiv Website

Zekun Hao, Arun Mallya, Serge Belongie, Ming-Yu Liu

October 2021 IEEE International Conference on Computer Vision (ICCV)
Oral

GANcraft: Unsupervised 3D Neural Rendering of Minecraft Worlds

arXiv Website

Yunzhi Lin, Jonathan Tremblay, Stephen Tyree, Patricio A. Vela, Stan Birchfield

September 2021 International Conference on Intelligent Robots and Systems (IROS)

Multi-View Fusion for Multi-Level Robotic Scene Understanding

arXiv video

Visak Kumar, David Hoeller, Balakumar Sundaralingam, Jonathan Tremblay, Stan Birchfield

September 2021 International Conference on Intelligent Robots and Systems (IROS)

Joint Space Control via Deep Reinforcement Learning

arXiv video

Ching-An Cheng, Mustafa Mukadam, Jan Issac, Stan Birchfield, Dieter Fox, Byron Boots, Nathan Ratliff

July 2021 IEEE Transactions on Automation Science and Engineering (TASE)

RMPflow: A Geometric Framework for Generation of Multi-Task Motion Policies

arXiv

Aysegul Dundar, Ming-Yu Liu, Zhiding Yu, Ting-Chun Wang, John Zedlewski, Jan Kautz

July 2021 IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

Domain Stylization: A Fast Covariance Matching Framework towards Domain Adaptation

pdf

Rakshit Kothari, Shalini De Mello, Umar Iqbal, Wonmin Byeon, Seonwook Park, Jan Kautz

June 2021 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Oral

Weakly-Supervised Physically Unconstrained Gaze Estimation

arXiv

Anand Bhattad, Aysegul Dundar, Guilin Liu, Andrew Tao, Bryan Catanzaro

June 2021 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

View Generalization for Single Image Textured 3D Models

Website pdf video

Benjamin Eckart, Wentao Yuan, Chao Liu, Jan Kautz

June 2021 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Self-Supervised Learning on 3D Point Clouds by Learning Discrete Generative Models

pdf

Hongxu (Danny) Yin, Arun Mallya, Arash Vahdat, Jose M. Alvarez, Jan Kautz, Pavlo Molchanov

June 2021 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

See through Gradients: Image Batch Recovery via GradInversion

arXiv

Yerlan Idelbayev, Pavlo Molchanov, Maying Shen, Hongxu (Danny) Yin, Miguel A Carreira-Perpiñán, Jose M Alvarez

June 2021 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Optimal Quantization Using Scaled Codebook

pdf

Ting-Chun Wang, Arun Mallya, Ming-Yu Liu

June 2021 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Oral

One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing

arXiv Website

Oliver Hennigh, Susheela Narasimhan, Mohammad Amin Nabian, Akshay Subramaniam, Kaustubh Tangsali, Max Rietmann, Jose del Aguila Ferrandis, Wonmin Byeon, Zhiwei Fang, Sanjay Choudhry

June 2021 International Conference on Computational Science (ICCS)

NVIDIA SimNet: An AI-accelerated multi-physics simulation framework

arXiv pdf

Yang Fu, Sifei Liu, Umar Iqbal, Shalini De Mello, Humphrey Shi, Jan Kautz

June 2021 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Learning to Track Instances without Video Annotations

Ning Yu, Guilin Liu, Aysegul Dundar, Andrew Tao, Bryan Catanzaro, Larry S Davis, Mario Fritz

June 2021 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Dual Contrastive Loss and Attention for GANs

pdf video

Shiyi Lan, Zhiding Yu, Christopher Choy, Subhashree Radhakrishnan, Guilin Liu, Yuke Zhu, Larry S Davis, Anima Anandkumar

June 2021 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision

code video

Yu-Wei Chao, Wei Yang, Yu Xiang, Pavlo Molchanov, Ankur Handa, Jonathan Tremblay, Yashraj S. Narang, Karl Van Wyk, Umar Iqbal, Stan Birchfield, Jan Kautz, Dieter Fox

June 2021 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

DexYCB: A Benchmark for Capturing Hand Grasping of Objects

arXiv project

Jianyuan Wang, Yiran Zhong, Yuchao Dai, Stan Birchfield, Kaihao Zhang, Nikolai Smolyanskiy, Hongdong Li

June 2021 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Deep Two-View Structure-from-Motion Revisited

arXiv code

Abhishek Badki, Orazio Gallo, Jan Kautz, Pradeep Sen

June 2021 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Oral, Best student paper honorable mention

Binary TTC: A Temporal Geofence for Autonomous Navigation

arXiv Website

Adrian Spurr, Pavlo Molchanov, Umar Iqbal, Jan Kautz, Otmar Hilliges

June 2021 ArXiv Preprint

Adversarial Motion Modelling Helps Semi-Supervised Hand Pose Estimation

arXiv

Zhisheng Xiao, Karsten Kreis, Jan Kautz, Arash Vahdat

May 2021 International Conference on Learning Representations (ICLR)

VAEBM: A Symbiosis between Variational Autoencoders and Energy-based Models

arXiv code

Sangho Lee, Youngjae Yu, Gunhee Kim, Thomas Breuel, Jan Kautz, Yale Song

May 2021 International Conference on Learning Representations (ICLR)

Parameter Efficient Multimodal Transformers for Video Representation Learning

arXiv

Nathan Morrical, Jonathan Tremblay, Yunzhi Lin, Stephen Tyree, Stan Birchfield, Valerio Pascucci, Ingo Wald

May 2021 ICLR Workshop on Synthetic Data Generation

NViSII: A Scriptable Tool for Photorealistic Image Generation

arXiv code

Xueting Li, Shalini De Mello, Xiaolong Wang, Ming-Hsuan Yang, Jan Kautz, Sifei Liu

May 2021 International Conference on Learning Representations (ICLR)

Learning continuous environment fields via implicit functions

arXiv

Yifeng Zhu, Jonathan Tremblay, Stan Birchfield, Yuke Zhu

May 2021 International Conference on Robotics and Automation (ICRA)

Hierarchical Planning for Long-Horizon Manipulation with Geometric and Symbolic Scene Graphs

arXiv project video

Guanya Shi, Yifeng Zhu, Jonathan Tremblay, Stan Birchfield, Fabio Ramos, Animashree Anandkumar, Yuke Zhu

May 2021 International Conference on Robotics and Automation (ICRA)

Fast Uncertainty Quantification for Deep Object Pose Estimation

arXiv project video

Aditya Jonnalagadda, Iuri Frosio, Seth Schneider, Morgan McGuire, Joohwan Kim

March 2021 ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games

Robust Vision-Based Cheat Detection in Competitive Gaming

arXiv

Chaoyang Wang, Ben Eckart, Simon Lucey, Orazio Gallo

March 2021 ArXiv Preprint

Neural Trajectory Fields for Dynamic Novel View Synthesis

arXiv

Zahra Ghodsi, Siva Hari, Iuri Frosio, Timothy Tsai, Alejandro Troccoli, Stephen Keckler, Siddharth Garg, Anima Anandkumar

March 2021 IEEE Intelligent Vehicles Symposium

Generating and Characterizing Scenarios for Safety Testing of Autonomous Vehicles

arXiv

Xueting Li, Sifei Liu, Shalini De Mello, Kihwan Kim, Xiaolong Wang, Ming-Hsuan Yang, Jan Kautz

December 2020 Advances in Neural Information Processing Systems (NeurIPS)

Online adaptation for consistent mesh reconstruction in the wild

video

Jeremy Bernstein, Arash Vahdat, Yisong Yue, Ming-Yu Liu

December 2020 Advances in Neural Information Processing Systems (NeurIPS)

On the Distance between Two Neural Networks and the Stability of Learning

arXiv code

Arash Vahdat, Jan Kautz

December 2020 Advances in Neural Information Processing Systems (NeurIPS)

NVAE: A Deep Hierarchical Variational Autoencoder

arXiv Website

Morteza Mardani, Guilin Liu, Aysegul Dundar, Shiqiu Liu, Andrew Tao, Bryan Catanzaro

December 2020 Advances in Neural Information Processing Systems (NeurIPS)

Neural FFTs for Universal Texture Image Synthesis

pdf

Tewodros Amberbir Habtegebrial, Varun Jampani, Orazio Gallo, Didier Stricker

December 2020 Advances in Neural Information Processing Systems (NeurIPS)

Generative View Synthesis: From Single-view Semantics to Novel-view Images

arXiv Website code

Jiahao Su, Wonmin Byeon, Furong Huang, Jan Kautz, Animashree Anandkumar

December 2020 Advances in Neural Information Processing Systems (NeurIPS)

Convolutional Tensor-Train LSTM for Spatio-temporal Learning

arXiv

Steven Dalton, Iuri Frosio

December 2020 Advances in Neural Information Processing Systems (NeurIPS)

Accelerating reinforcement learning through GPU Atari emulation

arXiv Website

Xueting Li, Sifei Liu, Kihwan Kim, Shalini De Mello, Varun Jampani, Ming-Hsuan Yang, Jan Kautz

October 2020 European Conference on Computer Vision (ECCV)

Self-supervised single-view 3D reconstruction via semantic consistency

video

Ke Chen, Ryan Oldja, Nikolai Smolyanskiy, Stan Birchfield, Alexander Popov, David Wehr, Ibrahim Eden, Joachim Pehserl

October 2020 International Conference on Intelligent Robots and Systems (IROS)

MVLidarNet: Real-Time Multi-Class Scene Understanding for Autonomous Driving Using Multiple Views

arXiv video

Jonathan Tremblay, Stephen Tyree, Terry Mosier, Stan Birchfield

October 2020 International Conference on Intelligent Robots and Systems (IROS)

Indirect Object-to-Robot Pose Estimation from an External Monocular RGB Camera

arXiv video

Arun Mallya, Ting-Chun Wang, Karan Sapra, Ming-Yu Liu

August 2020 European Conference on Computer Vision (ECCV)

World-Consistent Video-to-Video Synthesis

arXiv Website

Zhongzheng Ren, Zhiding Yu, Xiaodong Yang, Ming-Yu Liu, Alexander Schwing, Jan Kautz

August 2020 European Conference on Computer Vision (ECCV)

UFO²: A Unified Framework towards Omni-supervised Object Detection

arXiv

Yang Zou, Xiaodong Yang, Zhiding Yu, B. V. K. Vijaya Kumar, Jan Kautz

August 2020 European Conference on Computer Vision (ECCV)
Oral

Joint Disentangling and Adaptation for Cross-Domain Person Re-Identification

arXiv

Tanmay Gupta, Arash Vahdat, Gal Chechik, Xiaodong Yang, Jan Kautz, Derek Hoiem

August 2020 European Conference on Computer Vision (ECCV)
Spotlight

Contrastive Learning for Weakly Supervised Phrase Grounding

arXiv Website

Arash Vahdat, Evgeny Andriyash, William G Macready

July 2020 International Conference on Machine Learning (ICML)

Undirected Graphical Models as Approximate Posteriors

arXiv

Guilin Liu, Rohan Taori, Ting-Chun Wang, Zhiding Yu, Shiqiu Liu, Fitsum A Reda, Karan Sapra, Andrew Tao, Bryan Catanzaro

July 2020 ArXiv Preprint

Transposer: Universal Texture Synthesis Using Feature Maps as Transposed Convolution Filter

arXiv pdf video

Beidi Chen, Weiyang Liu, Animesh Garg, Zhiding Yu, Anshumali Shrivastava, Jan Kautz, Anima Anandkumar

July 2020 International Conference on Machine Learning (ICML)

Angular Visual Hardness

arXiv

Umar Iqbal, Pavlo Molchanov, Jan Kautz

June 2020 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Weakly-Supervised 3D Human Pose Learning via Multi-view Images in the Wild

arXiv

Arash Vahdat, Arun Mallya, Ming-Yu Liu, Jan Kautz

June 2020 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

UNAS: Differentiable Architecture Search Meets Reinforcement Learning

arXiv code

Mark Boss, Varun Jampani, Kihwan Kim, Hendrik Lensch, Jan Kautz

June 2020 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Two-shot Spatially-varying BRDF and Shape Estimation

arXiv Website

Moustafa S. Ibrahim, Arash Vahdat, Mani Ranjbar, William G. Macready

June 2020 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Semi-Supervised Semantic Image Segmentation with Self-correcting Networks

arXiv

Siva Karthik Mustikovela, Varun Jampani, Shalini De Mello, Umar Iqbal, Sifei Liu, Carsten Rother, Jan Kautz

June 2020 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Self-Supervised Viewpoint Learning from Image Collections

arXiv

Abdulrahman Mahmoud, Neeraj Aggarwal, Alex Nobbe, Jose Rodrigo Sanchez Vicarte, Sarita V. Adve, Christopher W. Fletcher, Iuri Frosio, Siva Kumar Sastry Hari

June 2020 IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W)

PyTorchFI: A Runtime Perturbation Tool for DNNs

Website

Jae Shin Yoon, Kihwan Kim, Orazio Gallo, Hyun Soo Park, Jan Kautz

June 2020 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Novel View Synthesis of Dynamic Scenes With Globally Coherent Depths From a Monocular Camera

arXiv Website

Abhishek Badki, Orazio Gallo, Jan Kautz, Pradeep Sen

June 2020 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Meshlet Priors for 3D Mesh Reconstruction

arXiv video

Zhongzheng Ren, Zhiding Yu, Xiaodong Yang, Ming-Yu Liu, Yong Jae Lee, Alexander Schwing, Jan Kautz

June 2020 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Instance-aware, Context-focused, and Memory-efficient Weakly-Supervised Object Detection

arXiv

Abhishek Badki, Alejandro Troccoli, Kihwan Kim, Jan Kautz, Pradeep Sen, Orazio Gallo

June 2020 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Bi3D: Stereo Depth Estimation via Binary Classifications

arXiv Website

Shariq Iqbal, Jonathan Tremblay, Thang To, Jia Cheng, Erik Leitch, Andy Campbell, Kirby Leung, Duncan McKay, Stan Birchfield

May 2020 International Conference on Robotics and Automation (ICRA)

Toward Sim-to-Real Directional Semantic Grasping

arXiv video

Ankur Handa, Karl Van Wyk, Wei Yang, Jacky Liang, Yu-Wei Chao, Qian Wan, Stan Birchfield, Nathan Ratliff, Dieter Fox

May 2020 International Conference on Robotics and Automation (ICRA)

DexPilot: Vision Based Teleoperation of Dexterous Robotic Hand-Arm System

arXiv Website

Timothy E Lee, Jonathan Tremblay, Thang To, Jia Cheng, Terry Mosier, Oliver Kroemer, Dieter Fox, Stan Birchfield

May 2020 International Conference on Robotics and Automation (ICRA)

Camera-to-robot pose estimation from a single image

arXiv Website video

Matthias Innmann, Kihwan Kim, Jinwei Gu, Matthias Niessner, Charles Loop, Marc Stamminger, Jan Kautz

March 2020 IEEE Winter Conference on Applications of Computer Vision (WACV)

NRMVS: Non-Rigid Multi-View Stereo

pdf