Zhiding Yu  

 
  ![](/sites/default/files/person/Profile.jpg)

  
I am a principal research scientist and research lead at the [Learning and Perception Research Group](https://research.nvidia.com/labs/lpr/), NVIDIA Research. Before joining NVIDIA, I obtained a Ph.D. in ECE from [Carnegie Mellon University](https://www.cmu.edu/) in 2017 and an M.Phil. in ECE from [The Hong Kong University of Science and Technology](http://www.ust.hk/) in 2012. I graduated with a bachelor's degree from the Union Class of Electrical Engineering (FENG Bingquan Pilot Class), [South China University of Technology](http://www.scut.edu.cn/), in 2008.

I am interested in building general autonomy and intelligence across both virtual and physical domains. My recent focus lies in Vision Transformers, LLMs, multimodal LLMs, and vision-language-action (VLA) models, with applications spanning open-world understanding, reasoning, AV/robot perception-planning, and agentic systems. I have led or contributed to numerous flagship research efforts and products at NVIDIA, including [SegFormer](https://github.com/NVlabs/SegFormer) ([Most Influential NeurIPS Papers](https://resources.paperdigest.org/2024/09/most-influential-nips-papers-2024-09/), [Demo](https://www.youtube.com/watch?v=J0MoRQzZe8U)), [VoxFormer](https://github.com/NVlabs/VoxFormer), [FB-BEV/FB-OCC](https://github.com/NVlabs/FB-BEV) ([CVPR23 3D Occ Pred Challenge winner](https://opendrivelab.com/challenge2023/#3d_occupancy_prediction), [video](https://www.youtube.com/watch?v=KEn8oklzyvo)), [Hydra-MDP](https://github.com/NVlabs/Hydra-MDP) ([CVPR24 E2E Driving Challenge winner](https://opendrivelab.com/challenge2024/#end_to_end_driving_at_scale), [video](https://www.youtube.com/watch?v=wfpLLSz5iWY)), the [Eagle VLM](https://github.com/NVlabs/Eagle) project, [Nemotron](https://research.nvidia.com/labs/adlr/nemotronh/), [Llama-Nemotron-VL](https://developer.nvidia.com/blog/new-nvidia-llama-nemotron-nano-vision-language-model-tops-ocr-benchmark-for-accuracy/), and [GR00T N1](https://www.youtube.com/watch?v=m1CH-mgpdYg)/[GR00T N1.5](https://research.nvidia.com/labs/gear/gr00t-n1_5/) ([NVIDIA’s foundation models for humanoid robots](https://chrisding.github.io/github.com/NVIDIA/Isaac-GR00T)). I also participated in designing NVIDIA’s next-generation end-to-end autonomous driving system. My work is characterized by state-of-the-art performance, scalable architectures, and data-centric strategies towards real-world generalization.

**Honors and Awards**
Winner, [CVPR 2024 Challenge on End-to-End Driving at Scale](https://opendrivelab.com/challenge2024/#end_to_end_driving_at_scale)
2nd Place, [CVPR 2024 Challenge on Driving with Language](https://opendrivelab.com/challenge2024/#driving_with_language)
Winner, [CVPR 2023 Challenge on 3D Occupancy Prediction](https://opendrivelab.com/challenge2023/#3d_occupancy_prediction)
Winner, [ECCV 2022 Robust Vision Challenge (RVC) on Semantic Segmentation](http://robustvision.net/leaderboard.php?benchmark=semantic)
Winner, CVPR 2018 Autonomous Driving Challenge (WAD) on Domain Adaptation
2nd Place, [ICMI 2015 EmotiW Challenge on Static Facial Expression Recognition](https://users.cecs.anu.edu.au/~few_group/emotiw2015.html)
Best Paper Award, BMVC 2020
Best Paper Award, WACV 2015
Best Student Paper Award, ISCSLP 2014

For more information, please visit my [Homepage](https://chrisding.github.io/).


   Research Area(s)

[Artificial Intelligence and Machine Learning](/research-area/machine-learning-artificial-intelligence)

[Autonomous Vehicles](/research-area/autonomous-vehicles)

[Computer Vision](/research-area/computer-vision)

[Generative AI](/research-area/generative-ai)

[Robotics](/research-area/robotics)

 
 Main Field of Interest

[Computer Vision](/research-area/computer-vision)

 
 Google Scholar

[https://scholar.google.com/citations?user=1VI_oYUAAAAJ&hl=en](https://scholar.google.com/citations?user=1VI_oYUAAAAJ&hl=en)

 
 ### Publications

 
### 2023 

[Live 3D Portrait: Real-Time Radiance Fields for Single-Image Portrait View Synthesis](/publication/2023-08_live-3d-portrait-real-time-radiance-fields-single-image-portrait-view-synthesis)

Alexander Trevithick, Matthew Chan, [Michael Stengel](/person/michael-stengel), Eric R. Chan, [Chao Liu](/person/chao-liu), [Zhiding Yu](/person/zhiding-yu), Sameh Khamis, Manmohan Chandraker, Ravi Ramamoorthi, [Koki Nagano](/person/koki-nagano)


[ACM Transactions On Graphics (SIGGRAPH 2023)](https://s2023.siggraph.org/)


### 2022 

[CoordGAN: Self-Supervised Dense Correspondences Emerge from GANs](/publication/2022-07_coordgan-self-supervised-dense-correspondences-emerge-gans)

Jiteng Mu, [Shalini De Mello](/person/shalini-de-mello), [Zhiding Yu](/person/zhiding-yu), Nuno Vasconcelos, Xiaolong Wang, [Sifei Liu](/person/sifei-liu)


[IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2022](https://cvpr2022.thecvf.com/)


[FreeSOLO: Learning to Segment Objects without Annotations](/publication/2022-06_freesolo-learning-segment-objects-without-annotations)

Xinlong Wang, [Zhiding Yu](/person/zhiding-yu), [Shalini De Mello](/person/shalini-de-mello), [Jan Kautz](/person/jan-kautz), Anima Anandkumar, Chunhua Shen, Jose M. Alvarez


[IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2022](https://cvpr2022.thecvf.com/)


[Learning Contrastive Representation for Semantic Correspondence](/publication/2022-03_learning-contrastive-representation-semantic-correspondence)

Taihong Xiao, [Sifei Liu](/person/sifei-liu), [Shalini De Mello](/person/shalini-de-mello), [Zhiding Yu](/person/zhiding-yu), [Jan Kautz](/person/jan-kautz), Ming-Hsuan Yang


[International Journal of Computer Vision (IJCV) 2022](https://link.springer.com/article/10.1007/s11263-022-01602-y)


### 2021 

[Contrastive Syn-to-Real Generalization](/publication/2021-05_contrastive-syn-real-generalization)

Wuyang Chen, [Zhiding Yu](/person/zhiding-yu), [Shalini De Mello](/person/shalini-de-mello), [Sifei Liu](/person/sifei-liu), Jose M. Alvarez, Zhangyang Wang, Anima Anandkumar


[International Conference on Learning Representations (ICLR) 2021](https://openreview.net/group?id=ICLR.cc/2021/Conference)


### 2020 

[Bongard-LOGO: A New Benchmark for Human-Level Concept Learning and Reasoning](/publication/2020-12_bongard-logo-new-benchmark-human-level-concept-learning-and-reasoning)

Weili Nie, [Zhiding Yu](/person/zhiding-yu), Lei Mao, Ankit B. Patel, [Yuke Zhu](/person/yuke-zhu), Anima Anandkumar


[Conference on Neural Information Processing Systems (NeurIPS) 2020 (Spotlight)](https://nips.cc/Conferences/2020)


[Neural Networks with Recurrent Generative Feedback](/publication/2020-12_neural-networks-recurrent-generative-feedback)

Yujia Huang, James Gornet, Sihui Dai, [Zhiding Yu](/person/zhiding-yu), Tan Nguyen, Doris Y. Tsao, Anima Anandkumar


[Conference on Neural Information Processing Systems (NeurIPS) 2020](https://nips.cc/Conferences/2020)


[Joint Disentangling and Adaptation for Cross-Domain Person Re-Identification](/publication/2020-08_joint-disentangling-and-adaptation-cross-domain-person-re-identification)

Yang Zou, Xiaodong Yang, [Zhiding Yu](/person/zhiding-yu), B. V. K. Vijaya Kumar, [Jan Kautz](/person/jan-kautz)


[European Conference on Computer Vision (ECCV) 2020 (Oral)](https://eccv2020.eu/)


[UFO2: A Unified Framework towards Omni-supervised Object Detection](/publication/2020-08_ufo2-unified-framework-towards-omni-supervised-object-detection)

Zhongzheng Ren, [Zhiding Yu](/person/zhiding-yu), Xiaodong Yang, [Ming-Yu Liu](/person/ming-yu-liu), Alexander G. Schwing, [Jan Kautz](/person/jan-kautz)


[European Conference on Computer Vision (ECCV) 2020](https://eccv2020.eu/)


[Angular Visual Hardness](/publication/2020-07_angular-visual-hardness)

Beidi Chen, Weiyang Liu, [Zhiding Yu](/person/zhiding-yu), [Jan Kautz](/person/jan-kautz), Anshumali Shrivastava, Animesh Garg, Anima Anandkumar


[International Conference on Machine Learning (ICML) 2020](https://icml.cc/virtual/2020)


[Automated Synthetic-to-Real Generalization](/publication/2020-07_automated-synthetic-real-generalization)

Wuyang Chen, [Zhiding Yu](/person/zhiding-yu), Zhangyang Wang, Anima Anandkumar


[International Conference on Machine Learning (ICML) 2020](https://icml.cc/virtual/2020)


[Instance-aware, Context-focused, and Memory-efficient Weakly Supervised Object Detection](/publication/2020-06_instance-aware-context-focused-and-memory-efficient-weakly-supervised-object)

Zhongzheng Ren, [Zhiding Yu](/person/zhiding-yu), Xiaodong Yang, [Ming-Yu Liu](/person/ming-yu-liu), Yong Jae Lee, Alexander G. Schwing, [Jan Kautz](/person/jan-kautz)


[IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020](http://cvpr2020.thecvf.com/)


[Regularizing Neural Networks via Minimizing Hyperspherical Energy](/publication/2020-06_regularizing-neural-networks-minimizing-hyperspherical-energy)

Weiyang Liu, Rongmei Lin, Zhen Liu, Chen Feng, [Zhiding Yu](/person/zhiding-yu), James M. Rehg, Li Xiong, Le Song


[IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020](http://cvpr2020.thecvf.com/)


[Domain Stylization: A Fast Covariance Matching Framework towards Domain Adaptation](/publication/2020-01_domain-stylization-fast-covariance-matching-framework-towards-domain-adaptation)

Aysegul Dundar, [Ming-Yu Liu](/person/ming-yu-liu), [Zhiding Yu](/person/zhiding-yu), [Ting-Chun Wang](/person/ting-chun-wang), John Zedlewski, [Jan Kautz](/person/jan-kautz)


[IEEE Transactions on Pattern Analysis and Machine Intelligence](https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=34)


### 2019 

[Confidence Regularized Self-Training](/publication/2019-10_confidence-regularized-self-training)

Yang Zou, [Zhiding Yu](/person/zhiding-yu), Xiaofeng Liu, B. V. K. Vijaya Kumar, Jinsong Wang


[IEEE/CVF International Conference on Computer Vision (ICCV) 2019 (Oral)](http://iccv2019.thecvf.com/)


[Joint Discriminative and Generative Learning for Person Re-identification](/publication/2019-06_joint-discriminative-and-generative-learning-person-re-identification)

Zhedong Zheng, [Xiaodong Yang](/person/xiaodong-yang), [Zhiding Yu](/person/zhiding-yu), Liang Zheng, Yi Yang, [Jan Kautz](/person/jan-kautz)


[IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019](http://cvpr2019.thecvf.com/)


### 2018 

[Learning towards Minimum Hyperspherical Energy](/publication/2018-12_learning-towards-minimum-hyperspherical-energy)

Weiyang Liu, Rongmei Lin, Zhen Liu, Lixin Liu, [Zhiding Yu](/person/zhiding-yu), Bo Dai, Le Song


[Conference on Neural Information Processing Systems (NeurIPS) 2018](https://nips.cc/Conferences/2018)


[Simultaneous Edge Alignment and Learning](/publication/2018-09_simultaneous-edge-alignment-and-learning)

[Zhiding Yu](/person/zhiding-yu), Weiyang Liu, Yang Zou, Chen Feng, Srikumar Ramalingam, B. V. K. Vijaya Kumar, [Jan Kautz](/person/jan-kautz)


[European Conference on Computer Vision (ECCV) 2018](https://eccv2018.org/)


[Domain Adaptation for Semantic Segmentation via Class-Balanced Self-Training](/publication/2018-09_domain-adaptation-semantic-segmentation-class-balanced-self-training)

Yang Zou, [Zhiding Yu](/person/zhiding-yu), B. V. K. Vijaya Kumar, Jinsong Wang


[European Conference on Computer Vision (ECCV) 2018](https://eccv2018.org/)


[Decoupled Networks](/publication/2018-06_decoupled-networks)

Weiyang Liu, Zhen Liu, [Zhiding Yu](/person/zhiding-yu), Bo Dai, Rongmei Lin, Yisen Wang, James M. Rehg, Le Song


[IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2018](http://cvpr2018.thecvf.com/)


[Learning Strict Identity Mappings in Deep Residual Networks](/publication/2018-06_learning-strict-identity-mappings-deep-residual-networks)

Xin Yu, [Zhiding Yu](/person/zhiding-yu), Srikumar Ramalingam


[IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2018](http://cvpr2018.thecvf.com/)