Youssef Elasser

Youssef Elasser received his B.S. degree in Electrical Engineering and Computer Science, with a concentration in electric power, from Rensselaer Polytechnic Institute in 2018, and his M.A. and Ph.D. degrees in Electrical and Computer Engineering from Princeton University in 2024. His research interests include power delivery for data center microprocessors, magnetics design and optimization, and dc-dc power conversion. He interned with the NVIDIA Circuits Research Group during the summer of 2023 and joined NVIDIA Research full time in June 2024.

Tobias Zirr

Tobias Zirr is a research scientist at NVIDIA interested in machine learning, real-time rendering, and Monte Carlo simulation. Previously, he was a research scientist at Intel, where he worked at the interface of classic and neural rendering and served as a research program lead bringing path tracing to a wider range of practical real-time applications. During his Ph.D. in the computer graphics group at Karlsruhe Institute of Technology, he worked on MC and MCMC light transport algorithms, as well as real-time rendering and visualization techniques.

Lorenzo Maggi

Lorenzo Maggi is a Senior Research Scientist at NVIDIA, specializing in the convergence of wireless communications and machine learning. 

Before joining NVIDIA, Lorenzo developed algorithmic solutions for 5G networks at Nokia Bell Labs France, focusing on energy efficiency, beamforming, scheduling, and radiation mitigation. Prior to this, he worked on network routing algorithms at Huawei France.

Lorenzo holds a master’s degree in telecommunication engineering from the University of Pavia, Italy, and a Ph.D. in applied mathematics from Eurecom, France. 

DiffiT: Diffusion Vision Transformers for Image Generation

Diffusion models, with their powerful expressivity and high sample quality, have achieved state-of-the-art (SOTA) performance in the generative domain. The pioneering Vision Transformer (ViT) has likewise demonstrated strong modeling capability and scalability, especially for recognition tasks. In this paper, we study the effectiveness of ViTs in diffusion-based generative learning and propose a new model, denoted Diffusion Vision Transformers (DiffiT).
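
For context, diffusion training reduces to a simple denoising objective. The sketch below shows one such training step with a generic denoiser standing in for the DiffiT architecture; the function name, the epsilon-prediction loss, and the assumption that the denoiser maps (noisy image, timestep) to predicted noise are illustrative choices, not details from the paper.

import torch
import torch.nn.functional as F

# Minimal DDPM-style training step (illustrative sketch; not the DiffiT design).
# `denoiser` is any model mapping (noisy image, timestep) -> predicted noise.
def diffusion_training_step(denoiser, x0, alphas_cumprod):
    b = x0.shape[0]
    t = torch.randint(0, alphas_cumprod.shape[0], (b,), device=x0.device)
    noise = torch.randn_like(x0)
    a_bar = alphas_cumprod[t].view(b, 1, 1, 1)
    # Forward process: corrupt the clean images x0 with noise at timestep t.
    x_t = a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * noise
    # The denoiser (a ViT in DiffiT) is trained to predict the injected noise.
    pred = denoiser(x_t, t)
    return F.mse_loss(pred, noise)

Any network with that interface, ViT-based or not, can be trained this way; the contribution of a model like DiffiT lies in the design of the transformer denoiser itself.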

An Empirical Study of Mamba-based Language Models

Selective state-space models (SSMs) like Mamba overcome some of the shortcomings of Transformers, such as quadratic computational complexity with sequence length and large inference-time memory requirements from the key-value cache. Moreover, recent studies have shown that SSMs can match or exceed the language modeling capabilities of Transformers, making them an attractive alternative.
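
To make that contrast concrete, here is a toy sketch of the recurrence behind a selective SSM, with Mamba's discretization and gating simplified away; the shapes and the `decay` parameterization are illustrative assumptions, not Mamba's exact formulation.

import torch

# Toy selective state-space recurrence (simplified from Mamba).
# x: (batch, length, dim); decay: (dim, state) with entries in (0, 1);
# B, C: (batch, length, state) -- input-dependent, hence "selective".
def selective_scan(x, decay, B, C):
    batch, length, dim = x.shape
    h = x.new_zeros(batch, dim, decay.shape[-1])
    ys = []
    for t in range(length):
        # The state h is a fixed-size summary of the past: constant memory
        # per step, no key-value cache that grows with sequence length.
        h = decay * h + B[:, t, None, :] * x[:, t, :, None]
        ys.append((h * C[:, t, None, :]).sum(-1))
    return torch.stack(ys, dim=1)  # (batch, length, dim)

Because the state is fixed-size, total compute grows linearly with sequence length, in contrast to attention's quadratic cost and ever-growing key-value cache.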

Improving Hyperparameter Optimization with Checkpointed Model Weights

When training deep learning models, performance depends heavily on the selected hyperparameters. However, hyperparameter optimization (HPO) is often one of the most expensive parts of model design. Classical HPO methods treat it as a black-box optimization problem, whereas gray-box HPO methods, which incorporate more information about the setup, have emerged as a promising direction for more efficient optimization. For example, intermediate loss evaluations can be used to terminate bad selections early.
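
As a concrete example of that idea, the sketch below implements a successive-halving-style loop that terminates bad selections based on intermediate losses; `sample_config` and `train_for` are hypothetical stand-ins for the user's search space and resumable training routine, and this is a generic gray-box baseline, not the paper's method.

# Toy successive halving: train all configs briefly, keep the best half
# by intermediate validation loss, and give the survivors more budget.
def successive_halving(sample_config, train_for, n_configs=16, budget=1):
    configs = [sample_config() for _ in range(n_configs)]
    while len(configs) > 1:
        # Intermediate loss evaluation after a small training budget.
        scored = sorted((train_for(cfg, budget), cfg) for cfg in configs)
        # Terminate the worst half early; double the budget for the rest.
        configs = [cfg for _, cfg in scored[: len(scored) // 2]]
        budget *= 2
    return configs[0]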

RVT-2: Learning Precise Manipulation from Few Examples

In this work, we study how to build a robotic system that can solve multiple 3D manipulation tasks given language instructions. To be useful in industrial and household domains, such a system should be capable of learning new tasks with few demonstrations and solving them precisely. Prior works like PerAct and RVT have studied this problem; however, they often struggle with tasks requiring high precision. We study how to make them more effective, precise, and fast.

Seonwook Park

Seonwook Park is a Senior Research Scientist at NVIDIA, where he is part of the AI-Mediated Reality and Interaction Research Group. His research focuses on computer vision and machine learning, particularly computational human perception and gaze estimation. He obtained his Ph.D. in Computer Science from ETH Zurich, where he was advised by Prof. Otmar Hilliges. Previously, he obtained his M.Sc. in Computational Science and Engineering from ETH Zurich and his B.Sc. in Physics from Imperial College London.

Ahmed Nabih

Ahmed Nabih received the B.Sc. and M.Sc. degrees from Cairo University, Cairo, Egypt, in 2014 and 2017, respectively, and the Ph.D. degree from Virginia Tech, Blacksburg, VA, USA, in 2023, all in electrical engineering. He interned with NVIDIA in the summer of 2022, working on GPU power delivery, and also worked at Texas Instruments as a Systems Engineer for a year after graduation.