Token-Efficient VLM: High-Resolution Image Understanding via Dynamic Region Proposal

Publication
IEEE International Conference on Computer Vision (ICCV)

Related