Phenomenological Transparency

Translucent objects such as fog, clouds, smoke, glass, ice, and liquids are pervasive in cinematic environments because they frame scenes in depth and create visually compelling shots.

Hashed Alpha Testing

Renderers apply alpha testing to mask out complex silhouettes using alpha textures on simple proxy geometry. While widely used, alpha testing has a long-standing problem that is underreported in the literature but observable in commercial games: geometry can entirely disappear as alpha-mapped polygons recede with distance.
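The disappearance described above can be illustrated with a small CPU-side sketch. It assumes a hypothetical one-dimensional alpha texture with 25% coverage and uses a box filter as a stand-in for mipmapping. The per-texel random threshold shown for comparison is only the general stochastic idea suggested by the paper's title; it is an assumption, not the paper's actual hash function.

```python
import random

def downsample(alpha, factor):
    """Average groups of `factor` texels (a simple stand-in for mipmapping)."""
    return [sum(alpha[i:i + factor]) / factor for i in range(0, len(alpha), factor)]

def coverage_fixed(alpha, threshold=0.5):
    """Fraction of texels that pass a conventional fixed-threshold alpha test."""
    return sum(a >= threshold for a in alpha) / len(alpha)

def coverage_stochastic(alpha, rng):
    """Fraction passing when each texel draws its own uniform threshold in [0, 1).
    Assumption: illustrates the general stochastic idea, not the paper's hash."""
    return sum(a > rng.random() for a in alpha) / len(alpha)

# Hypothetical texture: one opaque texel in every four (25% coverage).
base = [1.0, 0.0, 0.0, 0.0] * 4096
far = downsample(base, 4)  # seen from a distance, every texel averages to 0.25

rng = random.Random(0)
near_fixed = coverage_fixed(base)               # a quarter of the texels pass
far_fixed = coverage_fixed(far)                 # no texel reaches 0.5, so nothing is drawn
far_stochastic = coverage_stochastic(far, rng)  # roughly a quarter pass on average
```

With the fixed threshold, the filtered texture falls entirely below 0.5 and the surface vanishes, whereas the randomized threshold keeps the expected coverage near the original 25%.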

Real-Time Global Illumination using Precomputed Light Field Probes

We introduce a new data structure and algorithms that employ it to compute real-time global illumination from static environments. Light field probes encode a scene’s full light field and internal visibility. They extend current radiance and irradiance probe structures with per-texel visibility information similar to a G-buffer and variance shadow map. We apply ideas from screen-space and voxel cone tracing techniques to this data structure to efficiently sample radiance on world space rays, with correct visibility information, directly within pixel and compute shaders.

Multilayer and Multimodal Fusion of Deep Neural Networks for Video Classification

This paper presents a novel framework to combine multiple layers and modalities of deep neural networks for video classification. We first propose a multilayer strategy to simultaneously capture a variety of levels of abstraction and invariance in a network, where the convolutional and fully connected layers are effectively represented by the proposed feature aggregation methods. We further introduce a multimodal scheme that includes four highly complementary modalities to extract diverse static and dynamic cues at multiple temporal scales.

Deep G-Buffers for Stable Global Illumination Approximation

We introduce a new hardware-accelerated method for constructing Deep G-buffers that is 2x-8x faster than the previous depth peeling method and produces more stable results. We then build several high-performance shading algorithms atop our representation, including dynamic diffuse interreflection, ambient occlusion (AO), and mirror reflection effects.

Our construction method is order-independent, guarantees a minimum separation between layers, operates in a (small) bounded memory footprint, and does not require per-pixel sorting.
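The properties listed above can be made concrete with a minimal per-pixel sketch. It is a CPU-side illustration under stated assumptions, not the hardware-accelerated construction the abstract refers to: the function name `two_layer_depths` and the parameter `min_separation` are hypothetical, and the second layer is assumed to be the nearest fragment lying at least `min_separation` behind the first.

```python
def two_layer_depths(fragment_depths, min_separation):
    """Return (first, second) layer depths for one pixel.

    Uses two minimum reductions instead of a sort, so the result does not
    depend on the order in which fragments arrive. Only two values are kept.
    Either entry is None when no fragment qualifies.
    """
    if not fragment_depths:
        return None, None
    first = min(fragment_depths)
    # Assumption: the second layer must lie at least min_separation behind the first.
    behind = [z for z in fragment_depths if z >= first + min_separation]
    second = min(behind) if behind else None
    return first, second

# Hypothetical fragment depths for one pixel, in arbitrary submission order.
layers = two_layer_depths([5.0, 1.0, 1.25, 3.0], min_separation=0.5)
same_layers = two_layer_depths([1.25, 3.0, 5.0, 1.0], min_separation=0.5)
```

In this example the fragment at depth 1.25 is skipped because it sits too close to the first layer, which is the kind of near-duplicate surface a minimum separation is meant to exclude.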

A Phenomenological Scattering Model for Order-Independent Transparency

Translucent objects such as fog, smoke, glass, ice, and liquids are pervasive in cinematic environments because they frame scenes in depth and create visually compelling shots. Unfortunately, they are hard to simulate in real-time and have thus previously been rendered poorly compared to opaque surfaces in games.

This paper introduces the first model for a real-time rasterization algorithm that can simultaneously approximate the following transparency phenomena: wavelength-varying ("colored") transmission, translucent colored shadows, caustics, partial coverage, diffusion, and refraction.

Production-Level Facial Performance Capture Using Deep Convolutional Neural Networks

We present a real-time deep learning framework for video-based facial performance capture—the dense 3D tracking of an actor’s face given a monocular video. Our pipeline begins with accurately capturing a subject using a high-end production facial capture pipeline based on multi-view stereo tracking and artist-enhanced animations. With 5–10 minutes of captured footage, we train a convolutional neural network to produce high-quality output, including self-occluded regions, from a monocular video sequence of that subject.

Stan Birchfield

Stan Birchfield is a Principal Research Scientist exploring the intersection of computer vision and robotics. Prior to joining NVIDIA, he was a tenured faculty member at Clemson University, where he led research in computer vision, visual tracking, mobile robotics, and the perception of highly deformable objects. He remains an adjunct faculty member at Clemson. He also conducted research at Microsoft, was the principal architect of a commercial product at a startup company in the Bay Area, co-founded a startup with collaborators at Clemson, and consulted for various companies. He has authored or co-authored more than 70 publications as well as a textbook on image processing and analysis, and his open-source software has been used by researchers around the world. He regularly serves on the program committees and editorial boards of various leading conferences and journals in computer vision and robotics. He received his Ph.D. in electrical engineering from Stanford University in 1999.
