Generalizable One-shot 3D Neural Head Avatar

We present a method that reconstructs and animates a 3D head avatar from a single-view portrait image. Existing methods either involve time-consuming optimization for a specific person using multiple images, or struggle to synthesize intricate appearance details beyond the facial region. To address these limitations, we propose a framework that not only generalizes to unseen identities from a single-view image without requiring person-specific optimization, but also captures characteristic details within and beyond the face area (e.g., hairstyle, accessories, etc.).

Convolutional State Space Models for Long-Range Spatiotemporal Modeling

Effectively modeling long spatiotemporal sequences is challenging due to the need to model complex spatial correlations and long-range temporal dependencies simultaneously. ConvLSTMs attempt to address this by updating tensor-valued states with recurrent neural networks, but their sequential computation makes them slow to train. In contrast, Transformers can process an entire spatiotemporal sequence, compressed into tokens, in parallel. However, the cost of attention scales quadratically in length, limiting their scalability to longer sequences.
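As a rough illustration of this trade-off (the notation below is generic and not taken from the paper): for a sequence of L tokens with hidden dimension d, a single self-attention layer materializes an L-by-L score matrix, whereas a ConvLSTM-style recurrence updates a tensor-valued state one step at a time:

\[
\mathrm{Attn}(Q,K,V)=\mathrm{softmax}\!\left(\tfrac{QK^\top}{\sqrt{d}}\right)V,\quad Q,K,V\in\mathbb{R}^{L\times d}\ \Rightarrow\ \mathcal{O}(L^{2}d)\ \text{time},\ \mathcal{O}(L^{2})\ \text{memory};
\]
\[
\mathcal{H}_{t}=f\!\left(\mathcal{H}_{t-1},\,\mathcal{X}_{t}\right),\quad t=1,\dots,L\ \Rightarrow\ \text{total work linear in } L,\ \text{but } L \text{ inherently sequential steps}.
\]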

Huck Yang

I am a Senior Research Scientist at NVIDIA Research.

I obtained my Ph.D. and M.Sc. from the Georgia Institute of Technology, USA, with the Wallace H. Coulter Fellowship, and my B.Sc. from National Taiwan University.

My primary research lies in the areas of Multilingual Model Alignment and Speech-Language Modeling.

Dale Durran

Durran has a 25% appointment as a Principal Research Scientist in Climate Modeling at NVIDIA and a 60% appointment as a Professor of Atmospheric Sciences at the University of Washington. At NVIDIA, his research focuses on deep-learning earth-system modeling for sub-seasonal and seasonal forecasting, forecast ensembles, and generative methods for fine-scale modeling of convective precipitation and other mesoscale fields.

David W Romero

I am a Research Scientist in Efficient Generative AI at NVIDIA’s Deep Imagination Research Team, and a finishing PhD candidate at the Vrije Universiteit Amsterdam. My research interests include all aspects of efficiency in Deep Learning, particularly the computational and parameter efficiency of long-context models and their application to generative models.

Constant Field of View Display Size Effects on First-Person Aiming Time

Under a constant display field of view, FPS game aiming performance improves with display size: aiming times were about 3% faster on a 26-inch diagonal display than on a 13-inch one.

Parsimony: Enabling SIMD/Vector Programming in Standard Compiler Flows

Achieving peak throughput on modern CPUs requires maximizing the use of single-instruction, multiple-data (SIMD) or vector compute units. Single-program, multiple-data (SPMD) programming models are an effective way to use high-level programming languages to target these ISAs. Unfortunately, many SPMD frameworks have evolved to have either overly restrictive language specifications or under-specified programming models, and this has slowed the widespread adoption of SPMD-style programming.
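For readers unfamiliar with the SPMD-on-SIMD style the abstract refers to, the sketch below conveys the general idea in plain C using the standard OpenMP simd directive. It is an illustrative stand-in under that assumption, not Parsimony's own programming interface, and the function name is hypothetical.

/* Illustrative SPMD-to-SIMD sketch (not Parsimony's interface): each loop
 * iteration plays the role of one SPMD "program instance", and the standard
 * OpenMP simd directive asks the compiler to pack iterations into vector lanes. */
#include <stddef.h>

void saxpy_spmd(float a, const float *x, const float *y, float *out, size_t n)
{
    #pragma omp simd
    for (size_t i = 0; i < n; ++i) {
        /* Scalar, per-instance code; the compiler maps it onto SIMD registers. */
        out[i] = a * x[i] + y[i];
    }
}

Compiled with a flag such as -fopenmp-simd (GCC/Clang), the directive is honored and the loop is vectorized; without it, the pragma is simply ignored and the loop runs as scalar code.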

Tianye Li

Tianye Li joined NVIDIA Research as a Research Scientist in 2023. His research interests are in computer vision and computer graphics, especially in capturing, modeling, and understanding dynamic humans. He is also interested in 3D/4D reconstruction and photorealistic rendering of generic scenes and objects. He obtained his Ph.D. in Computer Science from the University of Southern California (USC), where he was advised by Prof. Hao Li and Prof. Randall Hill, Jr. He was a research scientist at Epic Games, and interned at MPI for Intelligent Systems, Snap Research, and Facebook/Meta Reality Labs.