Lorenzo Maggi

Lorenzo Maggi is a Senior Research Scientist at NVIDIA, specializing in the convergence of wireless communications and machine learning. 

Before joining NVIDIA, Lorenzo developed algorithmic solutions for 5G networks at Nokia Bell Labs France, focusing on energy efficiency, beamforming, scheduling, and radiation mitigation. Prior to this, he worked on network routing algorithms at Huawei France.

Lorenzo holds a master’s degree in telecommunication engineering from the University of Pavia, Italy, and a Ph.D. in applied mathematics from Eurecom, France. 

DiffiT: Diffusion Vision Transformers for Image Generation

Diffusion models, with their powerful expressivity and high sample quality, have achieved state-of-the-art (SOTA) performance in the generative domain. The pioneering Vision Transformer (ViT) has also demonstrated strong modeling capabilities and scalability, especially for recognition tasks. In this paper, we study the effectiveness of ViTs in diffusion-based generative learning and propose a new model, denoted Diffusion Vision Transformers (DiffiT).

An Empirical Study of Mamba-based Language Models

Selective state-space models (SSMs) like Mamba overcome some of the shortcomings of Transformers, such as quadratic computational complexity with sequence length and large inference-time memory requirements from the key-value cache. Moreover, recent studies have shown that SSMs can match or exceed the language modeling capabilities of Transformers, making them an attractive alternative.
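The contrast the abstract draws can be made concrete with a back-of-the-envelope memory comparison: a Transformer's key-value cache grows linearly with generated sequence length (and attention FLOPs grow quadratically), while an SSM carries a fixed-size recurrent state. The sketch below uses illustrative layer and dimension values, not the actual configuration of Mamba or any released model.

```python
def kv_cache_bytes(seq_len, n_layers=32, n_heads=32, head_dim=128, bytes_per=2):
    # Transformer inference: keys and values are cached for every past
    # token in every layer, so memory grows linearly with seq_len.
    # (Attention compute over that cache grows quadratically.)
    return 2 * n_layers * n_heads * head_dim * seq_len * bytes_per

def ssm_state_bytes(n_layers=32, d_model=4096, state_dim=16, bytes_per=2):
    # A selective SSM keeps one fixed-size recurrent state per layer,
    # independent of how many tokens have been processed.
    # All sizes here are illustrative assumptions.
    return n_layers * d_model * state_dim * bytes_per

# At a 4K context the hypothetical KV cache is already ~2 GiB,
# while the SSM state stays a few MiB no matter the sequence length.
kv_4k = kv_cache_bytes(4096)
ssm = ssm_state_bytes()
```

Doubling the sequence length doubles `kv_cache_bytes` but leaves `ssm_state_bytes` unchanged, which is the inference-time memory advantage the abstract refers to.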

Improving Hyperparameter Optimization with Checkpointed Model Weights

When training deep learning models, performance depends heavily on the selected hyperparameters. However, hyperparameter optimization (HPO) is often one of the most expensive parts of model design. Classical HPO methods treat this as a black-box optimization problem. In contrast, gray-box HPO methods, which incorporate more information about the training setup, have emerged as a promising direction for more efficient optimization. For example, intermediate loss evaluations can be used to terminate bad selections early.
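The early-termination idea mentioned above can be sketched as a successive-halving loop: train all candidate configurations for a few steps, rank them by intermediate validation loss, and discard the worse half at each rung. The `train_step` function below is a hypothetical stand-in for a real training/evaluation loop, and the config fields (`base_loss`, `decay`) are illustrative assumptions, not part of the paper's method.

```python
import random

random.seed(0)

def train_step(config, step):
    # Hypothetical stand-in for one training step: returns a validation
    # loss that decays at a config-dependent rate, plus small noise.
    return config["base_loss"] * (config["decay"] ** step) + 0.01 * random.random()

def successive_halving(configs, rungs=3, steps_per_rung=2):
    """Gray-box HPO sketch: at each rung, keep only the better half of
    candidates, so bad hyperparameter choices are terminated after only
    a few intermediate loss evaluations."""
    survivors = [(cfg, 0) for cfg in configs]  # (config, steps trained so far)
    for _ in range(rungs):
        scored = []
        for cfg, steps in survivors:
            loss = None
            for s in range(steps, steps + steps_per_rung):
                loss = train_step(cfg, s)  # resume training from `steps`
            scored.append((loss, cfg, steps + steps_per_rung))
        scored.sort(key=lambda t: t[0])            # rank by intermediate loss
        keep = max(1, len(scored) // 2)            # halve the population
        survivors = [(cfg, steps) for _, cfg, steps in scored[:keep]]
    return survivors[0][0]

configs = [{"base_loss": 1.0 + 0.1 * i, "decay": 0.5 + 0.05 * i} for i in range(8)]
best = successive_halving(configs)
```

Only the surviving configurations receive the full training budget, which is where the cost savings over black-box HPO come from.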

RVT-2: Learning Precise Manipulation from Few Examples

In this work, we study how to build a robotic system that can solve multiple 3D manipulation tasks given language instructions. To be useful in industrial and household domains, such a system should be capable of learning new tasks with few demonstrations and solving them precisely. Prior works like PerAct and RVT have studied this problem; however, they often struggle with tasks requiring high precision. We study how to make them more effective, precise, and fast.

Seonwook Park

Seonwook Park is a Senior Research Scientist at NVIDIA, where he is part of the AI-Mediated Reality and Interaction Research Group. Seonwook's research focus is computer vision and machine learning, particularly in computational human perception and gaze estimation. He obtained his Ph.D. in Computer Science from ETH Zurich, where he was advised by Prof. Otmar Hilliges. Previously, he obtained his MSc from ETH Zurich in Computational Science and Engineering and his BSc from Imperial College London in Physics.

Ahmed Nabih

Ahmed Nabih received the B.Sc. and M.Sc. degrees from Cairo University, Cairo, Egypt, in 2014 and 2017, respectively, and the Ph.D. degree from Virginia Tech, Blacksburg, VA, USA, in 2023, all in electrical engineering. He interned with NVIDIA in summer 2022, where he worked on GPU power delivery, and after graduation he worked at Texas Instruments as a Systems Engineer for a year.

Yi-Chen Lu

Yi-Chen Lu is currently a Senior Research Scientist at NVIDIA. He received his B.S. degree in Electrical Engineering from National Taiwan University (NTU), followed by his M.S. and Ph.D. degrees in Electrical and Computer Engineering from Georgia Institute of Technology. His research focuses on developing machine learning algorithms to improve Electronic Design Automation (EDA) flows for 2D and 3D Integrated Circuits (ICs), with a prime focus on Physical Design (PD).

Nemotron-4 340B

We release the Nemotron-4 340B model family, including Nemotron-4-340B-Base, Nemotron-4-340B-Instruct, and Nemotron-4-340B-Reward. Our models are open access under the NVIDIA Open Model License Agreement, a permissive model license that allows the distribution, modification, and use of the models and their outputs. These models perform competitively with open-access models on a wide range of evaluation benchmarks, and were sized to fit on a single DGX H100 with 8 GPUs when deployed in FP8 precision.

Wenhao Ding

I am a research scientist in the autonomous vehicle research group at NVIDIA. I'm interested in driving scenario generation, reinforcement learning, and causal discovery. I received my Bachelor's degree from Tsinghua University in 2018 and my Ph.D. from Carnegie Mellon University in 2024.

Check my website for more information: https://wenhao.pub