SpatialRGPT: Grounded Spatial Reasoning in Vision-Language Models
An-Chieh Cheng, Hongxu (Danny) Yin, Yang Fu, Qiushan Guo, Ruihan Yang, Jan Kautz, Xiaolong Wang, Sifei Liu
December 2024
Type: Conference paper
Publication: Advances in Neural Information Processing Systems (NeurIPS)
Related
3D Aware Region Prompted Vision Language Model
Grounded 3D-Aware Spatial Vision-Language Modeling
NVILA: Efficient Frontier Visual Language Models
NaVILA: Legged Robot Vision-Language-Action Model for Navigation
3D Reconstruction with Generalizable Neural Fields using Scene Priors