NVILA: Efficient Frontier Visual Language Models
Zhijian Liu, Ligeng Zhu, Baifeng Shi, Zhuoyang Zhang, Yuming Lou, Shang Yang, Haocheng Xi, Shiyi Cao, Yuxian Gu, Dacheng Li, Xiuyu Li, Yunhao Fang, Yukang Chen, Cheng-Yu Hsieh, De-An Huang, An-Chieh Cheng, Vishwesh Nath, Andriy Myronenko, Jinyi Hu, Sifei Liu, Ranjay Krishna, Daguang Xu, Xiaolong Wang, Pavlo Molchanov, Jan Kautz, Hongxu (Danny) Yin, Song Han, and Yao Lu
June 2025
Type: Conference paper
Publication: IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Related
LongVILA: Scaling Long-Context Visual Language Models for Long Videos
Scaling Vision Pre-Training to 4K Resolution
VILA-U: Efficient and Unified Visual Language Understanding and Generation
Do Gradient Inversion Attacks Make Federated Learning Unsafe?
NaVILA: Legged Robot Vision-Language-Action Model for Navigation