
Improving Hyperparameter Optimization with Checkpointed Model Weights

When training deep learning models, performance depends largely on the selected hyperparameters. However, hyperparameter optimization (HPO) is often one of the most expensive parts of model design. Classical HPO methods treat this as a black-box optimization problem. In contrast, gray-box HPO methods, which incorporate more information about the setup, have emerged as a promising direction for more efficient optimization. For example, intermediate loss evaluations can be used to terminate poorly performing hyperparameter selections early.
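The idea of terminating poor selections from intermediate loss evaluations can be illustrated with a small sketch. This is not the method of the work summarized above; it is a generic successive-halving loop, a common gray-box technique, run on a toy loss curve. The function `simulated_loss`, its shape, and the candidate learning rates are all hypothetical stand-ins for real training runs.

```python
import math


def simulated_loss(lr, step):
    """Toy loss curve (hypothetical): lowest near lr = 0.1, improves with steps."""
    floor = abs(math.log10(lr) + 1.0)  # distance from the assumed optimum lr = 0.1
    return floor + 1.0 / (1 + step)


def successive_halving(learning_rates, rounds=3, steps_per_round=10):
    """Keep the better half of the configurations after each training round."""
    survivors = list(learning_rates)
    steps = 0
    for _ in range(rounds):
        if len(survivors) == 1:
            break
        steps += steps_per_round
        # Intermediate loss evaluation for every configuration still running.
        ranked = sorted(survivors, key=lambda lr: simulated_loss(lr, steps))
        # Terminate the worse half instead of training it to completion.
        survivors = ranked[: max(1, len(ranked) // 2)]
    return survivors


candidates = [1e-4, 1e-3, 1e-2, 1e-1, 1.0, 10.0]
best = successive_halving(candidates)
print(best)
```

In a real setting, each call to `simulated_loss` would be replaced by resuming a training run and reading its validation loss, so the savings come from never finishing the runs that are dropped.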


RVT-2: Learning Precise Manipulation from Few Examples

In this work, we study how to build a robotic system that can solve multiple 3D manipulation tasks given language instructions. To be useful in industrial and household domains, such a system should be capable of learning new tasks with few demonstrations and solving them precisely. Prior works, like PerAct and RVT, have studied this problem; however, they often struggle with tasks requiring high precision. We study how to make them more effective, precise, and fast.


Seonwook Park

Seonwook Park is a Senior Research Scientist at NVIDIA, where he is part of the AI-Mediated Reality and Interaction Research Group. Seonwook's research focus is computer vision and machine learning, particularly in computational human perception and gaze estimation. He obtained his Ph.D. in Computer Science from ETH Zurich, where he was advised by Prof. Otmar Hilliges. Previously, he obtained his MSc from ETH Zurich in Computational Science and Engineering and his BSc from Imperial College London in Physics.


Ahmed Nabih

Ahmed Nabih received the B.Sc. and M.Sc. degrees from Cairo University, Cairo, Egypt, in 2014 and 2017, respectively, and the Ph.D. degree from Virginia Tech, Blacksburg, VA, USA, in 2023, all in electrical engineering. He interned at NVIDIA in summer 2022, where he worked on GPU power delivery. He also worked at Texas Instruments as a Systems Engineer for a year after graduation.


Nemotron-4 340B

We release the Nemotron-4 340B model family, including Nemotron-4-340B-Base, Nemotron-4-340B-Instruct, and Nemotron-4-340B-Reward. Our models are open access under the NVIDIA Open Model License Agreement, a permissive model license that allows the distribution, modification, and use of the models and their outputs. These models perform competitively with open access models on a wide range of evaluation benchmarks, and were sized to fit on a single DGX H100 with 8 GPUs when deployed in FP8 precision.
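A rough back-of-the-envelope check shows why FP8 matters for the sizing claim. The sketch below counts weight memory only and assumes 80 GB of memory per H100 GPU; it ignores activations, the KV cache, and runtime overhead, so it is an illustration rather than a deployment guide. The helper names are hypothetical.

```python
PARAMS = 340e9                      # parameter count implied by the "340B" name
BYTES_PER_PARAM = {"fp8": 1, "bf16": 2}
GPU_MEMORY_GB = 80                  # assumption: 80 GB per H100
NUM_GPUS = 8                        # GPUs in a single DGX H100, per the text


def weight_memory_gb(params, precision):
    """Memory needed for the weights alone, in gigabytes (10^9 bytes)."""
    return params * BYTES_PER_PARAM[precision] / 1e9


def weights_fit(params, precision):
    """True if the weights alone fit in the node's combined GPU memory."""
    return weight_memory_gb(params, precision) <= GPU_MEMORY_GB * NUM_GPUS


for precision in ("fp8", "bf16"):
    print(precision, weight_memory_gb(PARAMS, precision), weights_fit(PARAMS, precision))
```

Under these assumptions the FP8 weights take about 340 GB of the node's 640 GB, while 16-bit weights alone would need about 680 GB and would not fit.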


Wenhao Ding

I am a research scientist in the autonomous vehicle research group at NVIDIA. I'm interested in driving scenario generation, reinforcement learning, and causal discovery. I received my Bachelor's degree from Tsinghua University in 2018 and my Ph.D. from Carnegie Mellon University in 2024.

Check my website for more information: https://wenhao.pub


Christos Kozyrakis

Christos' research focuses on computer architecture and systems software. He is currently working on cloud computing technology, systems design for artificial intelligence, and artificial intelligence for systems design. Christos holds a BS degree from the University of Crete (Greece) and a PhD degree from the University of California at Berkeley (USA). He is a fellow of the ACM and the IEEE.


Chris Cummings

Read more about Chris Cummings

Edward Suh

G. Edward Suh is a Senior Director of Research and leads a group in security and privacy research.

He is also an Adjunct Professor in the School of Electrical and Computer Engineering at Cornell University, where he served on the faculty from 2007 to 2023. Before joining NVIDIA, he was a Research Scientist in the Fundamental AI Research (FAIR) team at Meta. He earned a B.S. in Electrical Engineering from Seoul National University and an M.S. and a Ph.D. in Electrical Engineering and Computer Science from the Massachusetts Institute of Technology (MIT).