Graduate Fellowships Awarded for 2017-2018

Eleven Graduate Fellowship winners were announced at GTC 2017 on May 11, 2017. Each receives a grant of up to $50,000 toward PhD research involving GPU computing.

Reconstructing Intensity Images from Binary Spatial Gradient Cameras

Binary gradient cameras extract edge and temporal information directly on the sensor, allowing for low-power, low-bandwidth, and high-dynamic-range capabilities, which are all critical factors for the deployment of embedded computer vision systems. However, these types of images require specialized computer vision algorithms and are not easy to interpret by a human observer. In this paper we propose to recover an intensity image from a single binary spatial gradient image with a deep autoencoder.

A Lightweight Approach for On-the-Fly Reflectance Estimation

Estimating surface reflectance (BRDF) is one key component of complete 3D scene capture, with wide applications in virtual reality, augmented reality, and human-computer interaction. Prior work is either limited to controlled environments (e.g., gonioreflectometers, light stages, or multi-camera domes), or requires the joint optimization of shape, illumination, and reflectance, which is often computationally too expensive (e.g., hours of running time) for real-time applications. Moreover, most prior work requires HDR images as input, which further complicates the capture process.

Reinforcement Learning through Asynchronous Advantage Actor-Critic on a GPU

We introduce a hybrid CPU/GPU version of the Asynchronous Advantage Actor-Critic (A3C) algorithm, currently the state-of-the-art method in reinforcement learning for various gaming tasks. We analyze its computational traits and concentrate on aspects critical to leveraging the GPU's computational power. We introduce a system of queues and a dynamic scheduling strategy, potentially helpful for other asynchronous algorithms as well.
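The core of the queueing idea can be sketched in a few lines: many agents push observations into a shared queue, and a single predictor drains them in batches so the (notionally GPU-side) forward pass runs on many states at once. This is a hypothetical, single-threaded sketch of that batching pattern, not the paper's actual scheduler; the function name `drain_batch` and the tuple format are illustrative assumptions.

```python
from queue import Queue, Empty

def drain_batch(q, batch_size):
    """Collect at most batch_size queued requests without blocking.

    In the hybrid CPU/GPU setting, the batch would then be evaluated
    in a single forward pass instead of one pass per agent.
    """
    batch = []
    while len(batch) < batch_size:
        try:
            batch.append(q.get_nowait())
        except Empty:
            break  # queue drained early: predict with a partial batch
    return batch

q = Queue()
for agent_id in range(10):          # ten agents each submit one observation
    q.put((agent_id, f"state-{agent_id}"))

batch = drain_batch(q, batch_size=4)
print([agent for agent, _ in batch])  # [0, 1, 2, 3]
```

The trade-off the dynamic scheduling strategy would navigate is between batch size (GPU utilization) and latency (agents waiting for predictions).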

Computational Zoom: A Framework for Post-Capture Image Composition

Capturing a picture that "tells a story" requires the ability to create the right composition. The two most important parameters controlling composition are the camera position and the focal length of the lens. The traditional paradigm is for a photographer to mentally visualize the desired picture, select the capture parameters to produce it, and finally take the photograph, thus committing to a particular composition. We propose to change this paradigm.

Loss Functions for Image Restoration with Neural Networks

Neural networks are becoming central in several areas of computer vision and image processing and different architectures have been proposed to solve specific problems. The impact of the loss layer of neural networks, however, has not received much attention in the context of image processing: the default and virtually only choice is L2. In this paper, we bring attention to alternative choices for image restoration. In particular, we show the importance of perceptually-motivated losses when the resulting image is to be evaluated by a human observer.
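The contrast between L2 and its alternatives is easy to see numerically: L2 squares the residual, so one large outlier dominates the loss, while L1 (one of the simpler alternative losses) treats errors linearly. A toy NumPy sketch, not taken from the paper:

```python
import numpy as np

def l2_loss(pred, target):
    """Mean squared error: the default choice for restoration networks."""
    return np.mean((pred - target) ** 2)

def l1_loss(pred, target):
    """Mean absolute error: a common alternative that penalizes outliers less severely."""
    return np.mean(np.abs(pred - target))

# A flat 100-pixel patch compared against two kinds of error:
target = np.zeros(100)
outlier = np.zeros(100)
outlier[0] = 10.0                 # a single 10-unit error
diffuse = np.full(100, 0.5)       # a 0.5-unit error at every pixel

# L2 rates the single outlier as far worse than the diffuse error;
# L1 ranks them the other way around.
print(l2_loss(outlier, target), l2_loss(diffuse, target))  # 1.0 0.25
print(l1_loss(outlier, target), l1_loss(diffuse, target))  # 0.1 0.5
```

Perceptually-motivated losses (e.g., SSIM-based) go further, scoring errors by their visibility to a human observer rather than by raw pixel differences.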

Interactive Reconstruction of Monte Carlo Image Sequences using a Recurrent Denoising Autoencoder

We describe a machine learning technique for reconstructing image sequences rendered using Monte Carlo methods. Our primary focus is on reconstruction of global illumination with extremely low sampling budgets at interactive rates. Motivated by recent advances in image restoration with deep convolutional networks, we propose a variant of these networks better suited to the class of noise present in Monte Carlo rendering. We allow for much larger pixel neighborhoods to be taken into account, while also improving execution speed by an order of magnitude.

Fusing State Spaces for Markov Chain Monte Carlo Rendering

Rendering algorithms using Markov chain Monte Carlo (MCMC) currently build upon two different state spaces. One of them is the path space, where the algorithms operate on the vertices of actual transport paths. The other state space is the primary sample space, where the algorithms operate on sequences of numbers used for generating transport paths. While the two state spaces are related by the sampling procedure of transport paths, all existing MCMC rendering algorithms are designed to work within only one of the state spaces.

Parallel Depth-First Search for Directed Acyclic Graphs

Depth-First Search (DFS) is a pervasive algorithm, often used as a building block for topological sort, connectivity, and planarity testing, among many other applications. We propose a novel work-efficient parallel algorithm for the DFS traversal of a directed acyclic graph (DAG). The algorithm traverses the entire DAG in a BFS-like fashion no more than three times. As a result, it finds the DFS pre-order (discovery) and post-order (finish time) as well as the parent relationship associated with every node in a DAG. We analyze the runtime and work complexity of this novel parallel algorithm.
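For reference, the three quantities the parallel algorithm computes (pre-order, post-order, and DFS-tree parent) are defined by the classic sequential traversal. A minimal sequential sketch on a small DAG (this is the baseline the parallel algorithm must match, not the paper's parallel method):

```python
def dfs_orders(adj, roots):
    """Sequential reference DFS over a DAG: returns pre-order (discovery)
    and post-order (finish) timestamps plus the DFS-tree parent of each node."""
    pre, post, parent = {}, {}, {}
    clock = 0

    def visit(u):
        nonlocal clock
        pre[u] = clock
        clock += 1
        for v in adj.get(u, []):
            if v not in pre:        # first discovery: v joins the DFS tree
                parent[v] = u
                visit(v)
        post[u] = clock             # all descendants finished
        clock += 1

    for r in roots:
        if r not in pre:
            parent[r] = None
            visit(r)
    return pre, post, parent

# A small DAG with a shared sink: 0 -> 1 -> 3 and 0 -> 2 -> 3
adj = {0: [1, 2], 1: [3], 2: [3]}
pre, post, parent = dfs_orders(adj, roots=[0])
print(pre)     # {0: 0, 1: 1, 3: 2, 2: 5}
print(parent)  # {0: None, 1: 0, 3: 1, 2: 0}
```

Note that node 3 is discovered via node 1 and only revisited (not re-expanded) via node 2; reproducing exactly this tie-breaking in parallel is what makes parallel DFS hard.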

Polarimetric Multi-view Stereo

Multi-view stereo relies on feature correspondences for 3D reconstruction, and thus is fundamentally flawed in dealing with featureless scenes. In this paper, we propose polarimetric multi-view stereo, which combines per-pixel photometric information from polarization with epipolar constraints from multiple views for 3D reconstruction. Polarization reveals surface normal information, and is thus helpful for propagating depth to featureless regions.