Architecting an Energy-Efficient DRAM System for GPUs

This paper proposes an energy-efficient, high-throughput DRAM architecture for GPUs and throughput processors. In these systems, requests from thousands of concurrent threads compete for a limited number of DRAM row buffers. As a result, only a fraction of the data fetched into a row buffer is used, leading to significant energy overheads. Our proposed DRAM architecture exploits the hierarchical organization of a DRAM bank to reduce the minimum row activation granularity.

Pruning Convolutional Neural Networks for Resource Efficient Inference

We propose a new formulation for pruning convolutional kernels in neural networks to enable efficient inference. We interleave greedy criteria-based pruning with fine-tuning by backpropagation, a computationally efficient procedure that maintains good generalization in the pruned network. We propose a new criterion based on Taylor expansion that approximates the change in the cost function induced by pruning network parameters. We focus on transfer learning, where large pretrained networks are adapted to specialized tasks.
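The Taylor-expansion criterion scores a unit by the first-order approximation of the loss change its removal would cause, which reduces to the product of the unit's activation and the loss gradient with respect to it. A minimal NumPy sketch of that score (the function name and toy data are illustrative, not the paper's implementation):

```python
import numpy as np

def taylor_saliency(activation, gradient):
    """First-order Taylor saliency for one feature map.

    activation, gradient: arrays of shape (batch, H, W) holding a
    channel's activations and the loss gradient w.r.t. them.
    Returns a scalar score; channels with low scores are candidates
    for greedy pruning.
    """
    # |dC/dh * h| averaged over the batch and spatial positions
    return np.abs(activation * gradient).mean()

# Toy example: a near-dead channel scores lower than an active one.
rng = np.random.default_rng(0)
act_live = rng.normal(size=(4, 8, 8))
grad = rng.normal(size=(4, 8, 8))
act_dead = act_live * 1e-3          # same shape, tiny activations
print(taylor_saliency(act_live, grad) > taylor_saliency(act_dead, grad))  # True
```

In the full procedure, the lowest-scoring channels are removed and the network is fine-tuned by backpropagation before the next pruning step.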

Network Endpoint Congestion Control for Fine-Grained Communication

Endpoint congestion in HPC networks creates tree saturation that is detrimental to performance. Endpoint congestion can be alleviated by reducing the injection rate of traffic sources, but this requires fast reaction times to avoid congestion buildup. Congestion control becomes more challenging as application communication shifts from the traditional two-sided model to the potentially fine-grained, one-sided communication embodied by various global address space programming models.


Graduate Fellowships Awarded for 2017-2018

Thursday, May 11, 2017

Eleven Graduate Fellowship winners were announced at GTC 2017 on May 11, 2017. Each receives a grant of up to $50K toward PhD research that involves GPU computing.

Reconstructing Intensity Images from Binary Spatial Gradient Cameras

Binary gradient cameras extract edge and temporal information directly on the sensor, allowing for low-power, low-bandwidth, and high-dynamic-range capabilities, which are all critical factors for the deployment of embedded computer vision systems. However, these types of images require specialized computer vision algorithms and are not easy to interpret by a human observer. In this paper we propose to recover an intensity image from a single binary spatial gradient image with a deep autoencoder.
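To illustrate the kind of input the autoencoder must invert, here is a hedged sketch that simulates a binary spatial gradient image from an intensity image by thresholding local differences. The forward-difference scheme and threshold value are assumptions for illustration, not the actual sensor model:

```python
import numpy as np

def binary_spatial_gradient(img, threshold=0.05):
    """Simulate a binary spatial gradient image from a 2-D intensity
    array `img` with values in [0, 1].  Returns a uint8 array that is
    1 wherever the horizontal or vertical difference exceeds
    `threshold` -- a rough stand-in for the sensor's edge readout.
    """
    dx = np.abs(np.diff(img, axis=1, append=img[:, -1:]))  # horizontal diffs
    dy = np.abs(np.diff(img, axis=0, append=img[-1:, :]))  # vertical diffs
    return ((dx > threshold) | (dy > threshold)).astype(np.uint8)

# A vertical step edge responds only along the discontinuity.
img = np.zeros((4, 4))
img[:, 2:] = 1.0
print(binary_spatial_gradient(img))
```

Recovering the original intensities from such a binarized, information-lossy signal is what makes the reconstruction problem ill-posed and motivates a learned prior.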

A Lightweight Approach for On-the-Fly Reflectance Estimation

Estimating surface reflectance (BRDF) is one key component for complete 3D scene capture, with wide applications in virtual reality, augmented reality, and human-computer interaction. Prior work is either limited to controlled environments (e.g. gonioreflectometers, light stages, or multi-camera domes), or requires the joint optimization of shape, illumination, and reflectance, which is often computationally too expensive (e.g. hours of running time) for real-time applications. Moreover, most prior work requires HDR images as input, which further complicates the capture process.

Reinforcement Learning through Asynchronous Advantage Actor-Critic on a GPU

We introduce a hybrid CPU/GPU version of the Asynchronous Advantage Actor-Critic (A3C) algorithm, currently the state-of-the-art method in reinforcement learning for various gaming tasks. We analyze its computational traits and concentrate on aspects critical to leveraging the GPU’s computational power. We introduce a system of queues and a dynamic scheduling strategy, potentially helpful for other asynchronous algorithms as well.
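The core idea of routing many agents' requests through queues so the GPU sees batched work can be sketched with standard-library threads. The batch size, the stand-in "model", and all names below are illustrative assumptions, not the paper's implementation:

```python
import queue
import threading

def predictor(pred_queue, batch_size=4):
    """Drain the prediction queue, batch pending states, and answer
    each requester -- the pattern that keeps a GPU model busy while
    many environment threads wait for actions."""
    while True:
        batch = [pred_queue.get()]                     # block for the first request
        while len(batch) < batch_size:
            try:
                batch.append(pred_queue.get_nowait())  # opportunistically grow the batch
            except queue.Empty:
                break
        states = [s for s, _ in batch]
        actions = [sum(s) % 2 for s in states]         # stand-in for one GPU forward pass
        for (_, reply_q), a in zip(batch, actions):
            reply_q.put(a)                             # route each result back

pred_q = queue.Queue()
threading.Thread(target=predictor, args=(pred_q,), daemon=True).start()

# An environment thread submits its state and waits for an action.
reply_q = queue.Queue()
pred_q.put(([1, 2, 3], reply_q))
print(reply_q.get())  # prints 0  (sum = 6, 6 % 2 = 0)
```

In the full system a second queue carries completed experience to trainer threads, and a dynamic scheduler balances the number of predictors and trainers.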

Computational Zoom: A Framework for Post-Capture Image Composition

Capturing a picture that "tells a story" requires the ability to create the right composition. The two most important parameters controlling composition are the camera position and the focal length of the lens. The traditional paradigm is for a photographer to mentally visualize the desired picture, select the capture parameters to produce it, and finally take the photograph, thus committing to a particular composition. We propose to change this paradigm.

Loss Functions for Image Restoration with Neural Networks

Neural networks are becoming central in several areas of computer vision and image processing and different architectures have been proposed to solve specific problems. The impact of the loss layer of neural networks, however, has not received much attention in the context of image processing: the default and virtually only choice is L2. In this paper, we bring attention to alternative choices for image restoration. In particular, we show the importance of perceptually-motivated losses when the resulting image is to be evaluated by a human observer.
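The contrast between L2 and a simple alternative such as L1 can be shown in a few lines. This is only an illustration of pixel-wise losses, not the perceptually motivated losses the work studies; the toy data are assumptions:

```python
import numpy as np

def l2_loss(pred, target):
    """Mean squared error: the default choice; penalizes outliers heavily."""
    return np.mean((pred - target) ** 2)

def l1_loss(pred, target):
    """Mean absolute error: in restoration it often produces fewer
    splotchy artifacts than L2."""
    return np.mean(np.abs(pred - target))

# A single badly restored pixel dominates L2 far more than L1.
target = np.zeros(100)
pred = np.zeros(100)
pred[0] = 10.0
print(l2_loss(pred, target), l1_loss(pred, target))  # 1.0 0.1
```

Because L2 squares the residual, one outlier contributes as much as many small errors, which is one reason it correlates poorly with human judgments of restored images.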

Interactive Reconstruction of Monte Carlo Image Sequences using a Recurrent Denoising Autoencoder

We describe a machine learning technique for reconstructing image sequences rendered using Monte Carlo methods. Our primary focus is on reconstruction of global illumination with extremely low sampling budgets at interactive rates. Motivated by recent advances in image restoration with deep convolutional networks, we propose a variant of these networks better suited to the class of noise present in Monte Carlo rendering. We allow for much larger pixel neighborhoods to be taken into account, while also improving execution speed by an order of magnitude.
