NVIDIA Research Taiwan
NVIDIA Research Taiwan
Home
News
Members
Research
Publications
Contact
Light
Dark
Automatic
Vision-Language-Action Models
ThinkAct: Vision-Language-Action Reasoning via Reinforced Visual Latent Planning
Vision-language-action (VLA) reasoning tasks require agents to interpret multimodal instructions, perform long-horizon planning, and act adaptively in dynamic environments. Existing approaches typically train VLA models in an end-to-end fashion, …
Cite
×