VideoITG: Multimodal Video Understanding with Instructed Temporal Grounding

June 2026

Type

Conference paper

Publication

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Highlight