NVIDIA Research
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM
Hanrong Ye, Chao-Han Huck Yang, Arushi Goel, Wei Huang, Zhen Wan, Jinchuan Tian, An-Chieh Cheng, Ligeng Zhu, Yuanhang Su, Yuming Lou, Yong-Xiang Lin, Dong Yang, Sreyan Ghosh, Zhijian Liu, Yukang Chen, Ehsan Jahangiri, Ambrish Dantrey, Daguang Xu, Ehsan Hosseini-Asl, Seyed Danial Mohseni Taheri, Vidya Nariyambut Murali, Sifei Liu, Yao Lu, Oluwatobi Olabiyi, Yu-Chiang Frank Wang, Rafael Valle, Bryan Catanzaro, Andrew Tao, Song Han, Jan Kautz, Hongxu (Danny) Yin, Pavlo Molchanov
April 2026
Type: Conference paper
Publication: International Conference on Learning Representations (ICLR)
Sifei Liu
Jan Kautz
Team Leader
Hongxu (Danny) Yin
Pavlo Molchanov
Related
NVILA: Efficient Frontier Visual Language Models
Scaling RL to Long Videos
3D Aware Region Prompted Vision Language Model
LongVILA: Scaling Long-Context Visual Language Models for Long Videos
RADIO Amplified: Improved Baselines for Agglomerative Vision Foundation Models