Zhiding Yu

I am a principal research scientist and research lead at the Learning and Perception Research Group, NVIDIA Research. Before joining NVIDIA, I obtained a Ph.D. in ECE from Carnegie Mellon University in 2017 and an M.Phil. in ECE from The Hong Kong University of Science and Technology in 2012. I graduated with a bachelor's degree from the Union Class of Electrical Engineering (FENG Bingquan Pilot Class), South China University of Technology in 2008.

I am interested in building general autonomy and intelligence across both virtual and physical domains. My recent focus lies in Vision Transformers, LLMs, multimodal LLMs, and vision-language-action (VLA) models, with applications spanning open-world understanding, reasoning, AV/robot perception-planning, and agentic systems. I have led or contributed to numerous flagship research efforts and products at NVIDIA, including SegFormer (Most Influential NeurIPS Papers, Demo), VoxFormer, FB-BEV/FB-OCC (CVPR23 3D Occ Pred Challenge winner, video), Hydra-MDP (CVPR24 E2E Driving Challenge winner, video), the Eagle VLM project, Nemotron, Llama-Nemotron-VL, and GR00T N1/GR00T N1.5 (NVIDIA’s foundation models for humanoid robots). I also participated in designing NVIDIA’s next-generation end-to-end autonomous driving system. My work is characterized by state-of-the-art performance, scalable architectures, and data-centric strategies towards real-world generalization.

Honors and Awards
Winner, CVPR 2024 Challenge on End-to-End Driving at Scale
2nd Place, CVPR 2024 Challenge on Driving with Language
Winner, CVPR 2023 Challenge on 3D Occupancy Prediction
Winner, ECCV 2022 Robust Vision Challenge (RVC) on Semantic Segmentation
Winner, CVPR 2018 Autonomous Driving Challenge (WAD) on Domain Adaptation
2nd Place, ICMI 2015 EmotiW Challenge on Static Facial Expression Recognition
Best Paper Award, BMVC 2020
Best Paper Award, WACV 2015
Best Student Paper Award, ISCSLP 2014

For more information, please visit my Homepage.

Research Area(s)

Artificial Intelligence and Machine Learning

Main Field of Interest

Computer Vision

Google Scholar

https://scholar.google.com/citations?user=1VI_oYUAAAAJ&hl=en

Publications

2023

Live 3D Portrait: Real-Time Radiance Fields for Single-Image Portrait View Synthesis

Alexander Trevithick, Matthew Chan, Michael Stengel, Eric R. Chan, Chao Liu, Zhiding Yu, Sameh Khamis, Manmohan Chandraker, Ravi Ramamoorthi, Koki Nagano

ACM Transactions on Graphics (SIGGRAPH 2023)

2022

CoordGAN: Self-Supervised Dense Correspondences Emerge from GANs

Jiteng Mu, Shalini De Mello, Zhiding Yu, Nuno Vasconcelos, Xiaolong Wang, Sifei Liu

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022

FreeSOLO: Learning to Segment Objects without Annotations

Xinlong Wang, Zhiding Yu, Shalini De Mello, Jan Kautz, Anima Anandkumar, Chunhua Shen, Jose M. Alvarez

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022

Learning Contrastive Representation for Semantic Correspondence

Taihong Xiao, Sifei Liu, Shalini De Mello, Zhiding Yu, Jan Kautz, Ming-Hsuan Yang

International Journal of Computer Vision (IJCV) 2022

2021

Contrastive Syn-to-Real Generalization

Wuyang Chen, Zhiding Yu, Shalini De Mello, Sifei Liu, Jose M. Alvarez, Zhangyang Wang, Anima Anandkumar

International Conference on Learning Representations (ICLR) 2021

2020

Bongard-LOGO: A New Benchmark for Human-Level Concept Learning and Reasoning

Weili Nie, Zhiding Yu, Lei Mao, Ankit B. Patel, Yuke Zhu, Anima Anandkumar

Conference on Neural Information Processing Systems (NeurIPS) 2020 (Spotlight)

Neural Networks with Recurrent Generative Feedback

Yujia Huang, James Gornet, Sihui Dai, Zhiding Yu, Tan Nguyen, Doris Y. Tsao, Anima Anandkumar

Conference on Neural Information Processing Systems (NeurIPS) 2020

Joint Disentangling and Adaptation for Cross-Domain Person Re-Identification

Yang Zou, Xiaodong Yang, Zhiding Yu, B. V. K. Vijaya Kumar, Jan Kautz

European Conference on Computer Vision (ECCV) 2020 (Oral)

UFO2: A Unified Framework towards Omni-supervised Object Detection

Zhongzheng Ren, Zhiding Yu, Xiaodong Yang, Ming-Yu Liu, Alexander G. Schwing, Jan Kautz

European Conference on Computer Vision (ECCV) 2020

Angular Visual Hardness

Beidi Chen, Weiyang Liu, Zhiding Yu, Jan Kautz, Anshumali Shrivastava, Animesh Garg, Anima Anandkumar

International Conference on Machine Learning (ICML) 2020

Automated Synthetic-to-Real Generalization

Wuyang Chen, Zhiding Yu, Zhangyang Wang, Anima Anandkumar

International Conference on Machine Learning (ICML) 2020

Instance-aware, Context-focused, and Memory-efficient Weakly Supervised Object Detection

Zhongzheng Ren, Zhiding Yu, Xiaodong Yang, Ming-Yu Liu, Yong Jae Lee, Alexander G. Schwing, Jan Kautz

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020

Regularizing Neural Networks via Minimizing Hyperspherical Energy

Weiyang Liu, Rongmei Lin, Zhen Liu, Chen Feng, Zhiding Yu, James M. Rehg, Li Xiong, Le Song

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020

Domain Stylization: A Fast Covariance Matching Framework towards Domain Adaptation

Aysegul Dundar, Ming-Yu Liu, Zhiding Yu, Ting-Chun Wang, John Zedlewski, Jan Kautz

IEEE Transactions on Pattern Analysis and Machine Intelligence

2019

Confidence Regularized Self-Training

Yang Zou, Zhiding Yu, Xiaofeng Liu, B. V. K. Vijaya Kumar, Jinsong Wang

IEEE/CVF International Conference on Computer Vision (ICCV) 2019 (Oral)

Joint Discriminative and Generative Learning for Person Re-identification

Zhedong Zheng, Xiaodong Yang, Zhiding Yu, Liang Zheng, Yi Yang, Jan Kautz

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019

2018

Learning towards Minimum Hyperspherical Energy

Weiyang Liu, Rongmei Lin, Zhen Liu, Lixin Liu, Zhiding Yu, Bo Dai, Le Song

Conference on Neural Information Processing Systems (NeurIPS) 2018

Simultaneous Edge Alignment and Learning

Zhiding Yu, Weiyang Liu, Yang Zou, Chen Feng, Srikumar Ramalingam, B. V. K. Vijaya Kumar, Jan Kautz

European Conference on Computer Vision (ECCV) 2018

Domain Adaptation for Semantic Segmentation via Class-Balanced Self-Training

Yang Zou, Zhiding Yu, B. V. K. Vijaya Kumar, Jinsong Wang

European Conference on Computer Vision (ECCV) 2018

Decoupled Networks

Weiyang Liu, Zhen Liu, Zhiding Yu, Bo Dai, Rongmei Lin, Yisen Wang, James M. Rehg, Le Song

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2018

Learning Strict Identity Mappings in Deep Residual Networks

Xin Yu, Zhiding Yu, Srikumar Ramalingam