Attend Before Attention: Efficient and Scalable Video Understanding via Autoregressive Gazing
Baifeng Shi, Stephanie Fu, Long Lian, Hanrong Ye, David Eigen, Aaron Reite, Jan Kautz, Boyi Li, David M. Chan, Trevor Darrell, Pavlo Molchanov, Hongxu (Danny) Yin
June 2026
Type: Conference paper
Publication: IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Highlight
Related
Scaling Vision Pre-Training to 4K Resolution
Scaling RL to Long Videos
GSPN-2: Efficient Parallel Sequence Modeling
NVILA: Efficient Frontier Visual Language Models
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM