Real-Time Global Illumination using Precomputed Light Field Probes

We introduce a new data structure and algorithms that employ it to compute real-time global illumination from static environments. Light field probes encode a scene’s full light field and internal visibility. They extend current radiance and irradiance probe structures with per-texel visibility information similar to a G-buffer and variance shadow map. We apply ideas from screen-space and voxel cone tracing techniques to this data structure to efficiently sample radiance on world space rays, with correct visibility information, directly within pixel and compute shaders.
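
As a rough illustration of the core idea, the following Python sketch stores radiance and distance-to-surface per octahedral texel in a single probe and marches a world-space ray against it, declaring a hit where the ray passes behind the recorded geometry. The resolution, field names, and depth test are illustrative assumptions; the actual technique runs in pixel and compute shaders over many probes.

```python
import numpy as np

RES = 64  # octahedral map resolution (assumption)

def oct_encode(d):
    """Map a direction to octahedral [0,1]^2 coordinates (L1-normalized)."""
    d = d / np.sum(np.abs(d))
    if d[2] < 0.0:  # fold the lower hemisphere onto the upper one
        x, y = d[0], d[1]
        d[0] = (1.0 - abs(y)) * np.sign(x)
        d[1] = (1.0 - abs(x)) * np.sign(y)
    return 0.5 * (d[:2] + 1.0)

class LightFieldProbe:
    def __init__(self, position):
        self.position = np.asarray(position, dtype=float)
        # Per-texel radiance (RGB) and distance along the texel direction.
        self.radiance = np.zeros((RES, RES, 3))
        self.distance = np.full((RES, RES), np.inf)

    def lookup(self, direction):
        u, v = oct_encode(np.array(direction, dtype=float))
        i = min(int(v * RES), RES - 1)
        j = min(int(u * RES), RES - 1)
        return self.radiance[i, j], self.distance[i, j]

def trace_ray(probe, origin, direction, t_max=50.0, steps=128):
    """March a world-space ray; report a hit once a sample point lies at or
    behind the surface recorded in the probe (a simplified visibility test)."""
    for t in np.linspace(0.0, t_max, steps):
        p = origin + t * np.asarray(direction, dtype=float)
        to_p = p - probe.position
        r = np.linalg.norm(to_p)
        if r < 1e-6:
            continue
        radiance, stored = probe.lookup(to_p / r)
        if r >= stored:  # ray point is at/behind the recorded surface
            return radiance
    return None  # ray escaped this probe's field
```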

Multilayer and Multimodal Fusion of Deep Neural Networks for Video Classification

This paper presents a novel framework to combine multiple layers and modalities of deep neural networks for video classification. We first propose a multilayer strategy to simultaneously capture a variety of levels of abstraction and invariance in a network, where the convolutional and fully connected layers are effectively represented by the proposed feature aggregation methods. We further introduce a multimodal scheme that includes four highly complementary modalities to extract diverse static and dynamic cues at multiple temporal scales.
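
A minimal Python sketch of the two fusion steps is below. The feature shapes, the average-pooling aggregation, and the uniform late fusion are illustrative assumptions, not the paper's exact aggregation methods.

```python
import numpy as np

def aggregate_conv(feature_map):
    """Collapse a (C, H, W) convolutional activation into a C-dim vector."""
    return feature_map.mean(axis=(1, 2))

def multilayer_descriptor(conv_feats, fc_feats):
    """Multilayer fusion: concatenate aggregated conv-layer features
    (lower-level cues) with fully connected features (high-level cues)."""
    parts = [aggregate_conv(f) for f in conv_feats] + list(fc_feats)
    return np.concatenate(parts)

def fuse_modalities(per_modality_scores):
    """Multimodal fusion: average class scores over modalities
    (e.g. RGB frames, optical flow, and other temporal streams)."""
    return np.mean(np.stack(per_modality_scores), axis=0)

# Example with random stand-in activations for two modalities.
rng = np.random.default_rng(0)
rgb = multilayer_descriptor([rng.random((256, 7, 7))], [rng.random(4096)])
flow = multilayer_descriptor([rng.random((256, 7, 7))], [rng.random(4096)])
W = rng.random((10, rgb.size))  # stand-in linear classifier
scores = fuse_modalities([W @ rgb, W @ flow])
```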

Deep G-Buffers for Stable Global Illumination Approximation

We introduce a new hardware-accelerated method for constructing Deep G-buffers that is 2x-8x faster than the previous depth peeling method and produces more stable results. We then build several high-performance shading algorithms atop our representation, including dynamic diffuse interreflection, ambient occlusion (AO), and mirror reflection effects.

Our construction method is order-independent, guarantees a minimum separation between layers, operates in a (small) bounded memory footprint, and does not require per-pixel sorting.
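
The minimum-separation guarantee can be restated as a simple per-pixel selection rule, sketched in Python below. This sketch is illustrative only: it makes two passes over the fragments, whereas the actual single-pass construction avoids a second geometry pass; the DELTA value is an assumption.

```python
import numpy as np

DELTA = 0.5  # minimum depth separation between layers (assumption)

def two_layer_depths(fragments, width):
    """fragments: iterable of (pixel_x, depth), in arbitrary order."""
    layer1 = np.full(width, np.inf)
    layer2 = np.full(width, np.inf)
    for x, z in fragments:  # pass 1: nearest depth per pixel
        layer1[x] = min(layer1[x], z)
    for x, z in fragments:  # pass 2: nearest depth at least DELTA behind
        if z >= layer1[x] + DELTA:
            layer2[x] = min(layer2[x], z)
    return layer1, layer2

# Two fragments closer than DELTA collapse into the first layer; the next
# sufficiently separated fragment becomes the second layer.
l1, l2 = two_layer_depths([(0, 1.0), (0, 1.2), (0, 3.0)], width=4)
print(l1[0], l2[0])  # 1.0 3.0
```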

A Phenomenological Scattering Model for Order-Independent Transparency

Translucent objects such as fog, smoke, glass, ice, and liquids are pervasive in cinematic environments because they frame scenes in depth and create visually compelling shots. Unfortunately, they are hard to simulate in real time and have therefore historically been rendered at far lower quality than opaque surfaces in games.

This paper introduces the first model for a real-time rasterization algorithm that can simultaneously approximate the following transparency phenomena: wavelength-varying ("colored") transmission, translucent colored shadows, caustics, partial coverage, diffusion, and refraction.
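
Of these phenomena, wavelength-varying transmission is the easiest to illustrate in isolation. The Python sketch below composites transparent surfaces in an order-independent way: every term is a commutative sum or product, so surface order does not change the result. The coverage-weighted color average stands in for the full model's weighting, and diffusion, refraction, and caustics are omitted.

```python
import numpy as np

def composite(background, surfaces):
    """surfaces: iterable of (rgb_color, coverage_alpha, rgb_transmission)."""
    accum = np.zeros(3)
    alpha_sum = 0.0
    net_trans = np.ones(3)  # net colored transmittance toward the background
    for color, alpha, transmission in surfaces:  # any order
        accum += alpha * np.asarray(color, dtype=float)
        alpha_sum += alpha
        net_trans *= (1.0 - alpha) + alpha * np.asarray(transmission, dtype=float)
    if alpha_sum == 0.0:
        return np.asarray(background, dtype=float)
    avg = accum / alpha_sum
    return avg * (1.0 - net_trans) + np.asarray(background, dtype=float) * net_trans

# Red-tinted and blue-tinted panes over a white background, in either order.
panes = [([0.9, 0.2, 0.2], 0.4, [1.0, 0.4, 0.4]),
         ([0.2, 0.2, 0.9], 0.4, [0.4, 0.4, 1.0])]
print(composite([1.0, 1.0, 1.0], panes))
print(composite([1.0, 1.0, 1.0], panes[::-1]))  # identical result
```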

Production-Level Facial Performance Capture Using Deep Convolutional Neural Networks

We present a real-time deep learning framework for video-based facial performance capture—the dense 3D tracking of an actor's face given a monocular video. Our pipeline begins by accurately capturing a subject with a high-end production facial capture system based on multi-view stereo tracking and artist-enhanced animations. With 5–10 minutes of captured footage, we train a convolutional neural network to produce high-quality output, including self-occluded regions, from a monocular video sequence of that subject.
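
The regression setup can be sketched as follows in PyTorch: a small CNN maps a monocular face crop to a flat vector of 3D vertex positions, trained with an L2 loss against the vertices produced by the offline capture pipeline. The architecture, input size, and mesh resolution here are assumptions for illustration, not the paper's network.

```python
import torch
import torch.nn as nn

N_VERTICES = 5000  # hypothetical mesh resolution

class FaceRegressor(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(128, N_VERTICES * 3)

    def forward(self, x):
        f = self.features(x).flatten(1)
        return self.head(f).view(-1, N_VERTICES, 3)

model = FaceRegressor()
frames = torch.randn(8, 1, 240, 320)    # batch of monocular face crops
target = torch.randn(8, N_VERTICES, 3)  # per-frame captured vertex positions
loss = nn.MSELoss()(model(frames), target)
loss.backward()
```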

Stan Birchfield

Stan Birchfield is a Principal Research Scientist, exploring the intersection of computer vision and robotics. Prior to joining NVIDIA, he was a tenured faculty member at Clemson University, where he led research in computer vision, visual tracking, mobile robotics, and the perception of highly deformable objects. He remains an adjunct faculty member at Clemson. He also conducted research at Microsoft, was the principal architect of a commercial product at a startup company in the Bay Area, co-founded a startup with collaborators at Clemson, and consulted for various companies. He has authored or co-authored more than 70 publications as well as a textbook on image processing and analysis, and his open-source software has been used by researchers around the world. He regularly serves on the program committees and editorial boards of leading conferences and journals in computer vision and robotics. He received his Ph.D. in electrical engineering from Stanford University.


Page Placement Strategies for GPUs within Heterogeneous Memory Systems

Systems from smartphones to supercomputers are increasingly heterogeneous, being composed of both CPUs and GPUs. To maximize cost and energy efficiency, these systems will increasingly use globally-addressable heterogeneous memory systems, making choices about memory page placement critical to performance. In this work we show that current page placement policies are not sufficient to maximize GPU performance in these heterogeneous memory systems. We propose two new page placement policies that improve GPU performance: one application-agnostic and one that uses application profile information.
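
One way to picture an application-agnostic policy is a bandwidth-proportional placement rule, sketched below in Python. The rule and the bandwidth numbers are illustrative assumptions: pages land in GPU or CPU memory with probability proportional to each memory's bandwidth, so that aggregate bandwidth from both pools can be exploited rather than saturating one.

```python
import random

GPU_BW_GBS = 200.0  # e.g. GDDR-class memory (assumption)
CPU_BW_GBS = 80.0   # e.g. DDR-class memory (assumption)

def place_page():
    """Place a page with probability proportional to memory bandwidth."""
    p_gpu = GPU_BW_GBS / (GPU_BW_GBS + CPU_BW_GBS)
    return "GPU" if random.random() < p_gpu else "CPU"

placement = [place_page() for _ in range(10_000)]
# Splitting accesses across memories in proportion to their bandwidths
# uses both pools, instead of directing every access to GPU memory.
print(placement.count("GPU") / len(placement))  # ~0.71
```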

Unlocking Bandwidth for GPUs in CC-NUMA systems

Historically, GPU-based HPC applications have had a substantial memory bandwidth advantage over CPU-based workloads due to using GDDR rather than DDR memory. However, past GPUs imposed a restricted programming model in which the programmer had to allocate application data up front and explicitly copy it into GPU memory before launching a GPU kernel. Recently, GPUs have relaxed this requirement and can now employ on-demand software page migration between CPU and GPU memory to obviate explicit copying.
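
The on-demand mechanism can be sketched as a first-touch migration rule, shown below in Python. The class, names, and policy are illustrative assumptions: a GPU access to a CPU-resident page triggers a migration instead of requiring an up-front bulk copy.

```python
PAGE_SIZE = 4096

class MigratingMemory:
    def __init__(self):
        self.location = {}  # page number -> "CPU" or "GPU"
        self.migrations = 0

    def gpu_access(self, addr):
        page = addr // PAGE_SIZE
        if self.location.get(page, "CPU") != "GPU":
            # Fault: migrate the page to GPU memory on first touch, so
            # subsequent accesses hit high-bandwidth local memory.
            self.location[page] = "GPU"
            self.migrations += 1
        return self.location[page]

mem = MigratingMemory()
for addr in (0, 100, 5000, PAGE_SIZE * 2):
    mem.gpu_access(addr)
print(mem.migrations)  # 3 distinct pages touched -> 3 migrations
```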

