| Research

Student Page

https://zhaoyue-zephyrus.github.io/

Yue Zhao

University of Texas at Austin

Research

Yue’s research focuses on building video-centric foundation models. It rests on three pillars: modeling, data, and system design. For modeling, he proposes learning video representations from free-form narratives, inspired by the recent success of large language models (LLMs), which enable us to view all kinds of videos through the lens of narratives. For data, he is distilling image-based vision-language models on videos so that narrating videos becomes as fast as annotating images. This scales the video data size up to match or even exceed its image counterpart. For system design, he examines the training pipeline of a modern video Transformer architecture and mitigates the video loading a

Bio

Yue is a fourth-year PhD student at the University of Texas at Austin, supervised by Prof. Philipp Krähenbühl. He obtained his MPhil degree from the Multimedia Laboratory at the Chinese University of Hong Kong, supervised by Prof. Dahua Lin. Before that, he obtained his Bachelor's degrees from Tsinghua University. His research interests are in computer vision and machine learning. In particular, he has been focusing on developing computer vision models for video understanding and generation. He is also interested in building efficient systems for video compression and analysis.

Hometown

Wuxi, Jiangsu, China