Sameer Dharur is a research scientist on the Cosmos team at NVIDIA, helping to build vision-language models (VLMs) that reason better about the world. Prior to that, he spent ~4.5 years as a researcher and engineer at Apple, specializing in computer vision and natural language processing to solve problems in image and video understanding, question answering, and robotics.