NVIDIA Research
3D Aware Region Prompted Vision Language Model
An-Chieh Cheng, Yang Fu, Yukang Chen, Zhijian Liu, Xiaolong Li, Subhashree Radhakrishnan, Song Han, Yao Lu, Jan Kautz, Pavlo Molchanov, Hongxu (Danny) Yin, Xiaolong Wang, Sifei Liu
April 2026
Type: Conference paper
Publication: International Conference on Learning Representations (ICLR)
Jan Kautz (Team Leader), Pavlo Molchanov, Hongxu (Danny) Yin, Xiaolong Wang, Sifei Liu
Related
NVILA: Efficient Frontier Visual Language Models
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM
Scaling RL to Long Videos
LongVILA: Scaling Long-Context Visual Language Models for Long Videos
Scaling Vision Pre-Training to 4K Resolution