Fast Parallel Construction of High-Quality Bounding Volume Hierarchies

We propose a new massively parallel algorithm for constructing high-quality bounding volume hierarchies (BVHs) for ray tracing. The algorithm is based on modifying an existing BVH to improve its quality, and executes in linear time at a rate of almost 40M triangles/sec on NVIDIA GTX Titan. We also propose an improved approach for parallel splitting of triangles prior to tree construction. Averaged over 20 test scenes, the resulting trees offer over 90% of the ray tracing performance of the best offline construction method (SBVH), while previous fast GPU algorithms offer only about 50%.
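
The abstract measures tree quality by ray tracing performance. A common proxy for comparing BVHs without tracing rays is the surface area heuristic (SAH) cost; the abstract does not say which metric the algorithm optimizes, so the following is only a minimal sketch of that standard measure. The `Node` layout and the cost constants `c_inner` and `c_tri` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Node:
    """Hypothetical BVH node: an axis-aligned box plus optional children."""
    lo: Vec3
    hi: Vec3
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    triangle_count: int = 0  # meaningful for leaves only


def surface_area(lo: Vec3, hi: Vec3) -> float:
    """Surface area of an axis-aligned bounding box."""
    dx, dy, dz = (hi[i] - lo[i] for i in range(3))
    return 2.0 * (dx * dy + dy * dz + dz * dx)


def sah_cost(root: Node, c_inner: float = 1.0, c_tri: float = 1.0) -> float:
    """SAH cost of a tree, normalized by the root's surface area.

    Inner nodes contribute c_inner * area; leaves contribute
    c_tri * area * triangle_count. Lower is better.
    """
    root_area = surface_area(root.lo, root.hi)
    if root_area <= 0.0:
        raise ValueError("root bounding box must have positive surface area")
    total = 0.0
    stack = [root]
    while stack:
        node = stack.pop()
        area = surface_area(node.lo, node.hi)
        if node.left is None and node.right is None:
            total += c_tri * area * node.triangle_count
        else:
            total += c_inner * area
            stack.extend(c for c in (node.left, node.right) if c is not None)
    return total / root_area


if __name__ == "__main__":
    left = Node((0.0, 0.0, 0.0), (1.0, 1.0, 1.0), triangle_count=2)
    right = Node((1.0, 0.0, 0.0), (2.0, 1.0, 1.0), triangle_count=1)
    root = Node((0.0, 0.0, 0.0), (2.0, 1.0, 1.0), left=left, right=right)
    print(sah_cost(root))
```

An optimizer that restructures an existing tree could call a function like this before and after a change to check that the cost went down.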

GPU Ray Tracing

The NVIDIA® OptiX™ ray tracing engine is a programmable system designed for NVIDIA GPUs and other highly parallel architectures. The OptiX engine builds on the key observation that most ray tracing algorithms can be implemented using a small set of programmable operations. Consequently, the core of OptiX is a domain-specific just-in-time compiler that generates custom ray tracing kernels by combining user-supplied programs for ray generation, material shading, object intersection, and scene traversal.

Toward Practical Real-Time Photon Mapping: Efficient GPU Density Estimation

We describe the design space for real-time photon density estimation, the key step of rendering global illumination (GI) via photon mapping. We then detail and analyze efficient GPU implementations of four best-of-breed algorithms. All produce reasonable results in real time on an NVIDIA GeForce 670 at 1920x1080 for complex scenes with multiple-bounce diffuse effects, caustics, and glossy reflection. Across the designs, we conclude that tiled, deferred photon gathering in a compute shader gives the best combination of performance and quality.
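
To make "photon density estimation" concrete, here is a minimal CPU sketch of the textbook fixed-radius estimator: sum the power of the photons that land within a disc around the shading point and divide by the disc area. It is not one of the four GPU algorithms the abstract refers to, which are not described here. The photon representation, the scalar (non-RGB) power, and the function name are assumptions for illustration.

```python
import math
from typing import Iterable, Tuple

Vec3 = Tuple[float, float, float]
Photon = Tuple[Vec3, float]  # (position, power); scalar power is a simplification


def estimate_irradiance(photons: Iterable[Photon], point: Vec3, radius: float) -> float:
    """Fixed-radius density estimate: gathered power / (pi * radius^2)."""
    if radius <= 0.0:
        raise ValueError("radius must be positive")
    r2 = radius * radius
    gathered = 0.0
    for position, power in photons:
        # Squared distance avoids a square root per photon.
        d2 = sum((a - b) ** 2 for a, b in zip(position, point))
        if d2 <= r2:
            gathered += power
    return gathered / (math.pi * r2)


if __name__ == "__main__":
    photons = [((0.0, 0.0, 0.0), 1.0), ((0.5, 0.0, 0.0), 2.0), ((3.0, 0.0, 0.0), 5.0)]
    print(estimate_irradiance(photons, (0.0, 0.0, 0.0), 1.0))
```

This loop visits every photon for every shading point. Real-time implementations need to avoid that brute-force search, which is why the choice of gathering strategy matters.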

HDR Deghosting: How to Deal with Saturation?

We present a novel method for aligning images in an HDR (high-dynamic-range) image stack to produce a new exposure stack where all the images are aligned and appear as if they were taken simultaneously, even in the case of highly dynamic scenes. Our method produces plausible results even where the image used as a reference is either too dark or too bright to allow for an accurate registration.

Octree-Based Sparse Voxelization Using The GPU Hardware Rasterizer

Discrete voxel representations are generating growing interest in a wide range of applications in computational sciences and particularly in computer graphics. In this chapter, we first describe an efficient OpenGL implementation of a simple surface voxelization algorithm that produces a regular 3D texture. This technique uses the GPU hardware rasterizer and the new image load/store interface exposed by OpenGL 4.2.

Exposure Stacks for Live Scenes with Hand-held Cameras

Many computational photography applications require the user to take multiple pictures of the same scene with different camera settings. While this captures more information about the scene than is possible with a single image, the approach is limited by the requirement that the images be perfectly registered. In a typical scenario the camera is hand-held and is therefore prone to moving during the capture of an image burst, while the scene is likely to contain moving objects.

Advanced Techniques for Realistic Real-Time Skin Rendering

GPU Gems 3 contains over 40 chapters and nearly 1000 pages full of the latest GPU programming techniques, and includes hundreds of full-color diagrams and pictures. GPU Gems 3 won Game Developer Magazine's 2007 Front Line Award.

The Visual Vulnerability Spectrum: Characterizing Architectural Vulnerability for Graphics Hardware

With shrinking process technology, the primary cause of transient faults in semiconductors shifts away from high-energy cosmic particle strikes and toward more mundane and pervasive causes: power fluctuations, crosstalk, and other random noise. Smaller transistor features require a lower critical charge to hold and change bits, which leads to faster microprocessors but also to higher transient fault rates. Current trends, expected to continue, show soft error rates increasing exponentially at a rate of 8% per technology generation.
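
As a rough illustration of what 8% exponential growth per generation implies, the sketch below compounds a baseline soft error rate over a number of technology generations. The baseline value and the function name are hypothetical; only the 8% figure comes from the text above.

```python
def projected_soft_error_rate(base_rate: float, generations: int, growth: float = 0.08) -> float:
    """Compound `base_rate` by `growth` once per technology generation."""
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return base_rate * (1.0 + growth) ** generations


if __name__ == "__main__":
    # Relative rate after n generations, with the baseline normalized to 1.0.
    for n in (0, 1, 5, 10):
        print(n, round(projected_soft_error_rate(1.0, n), 3))
```

Under this assumption the rate roughly doubles after nine to ten generations, since 1.08 raised to the 10th power is about 2.16.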

How GPUs Work

GPUs have moved away from the traditional fixed-function 3D graphics pipeline toward a flexible general-purpose computational engine. Today, GPUs can implement many parallel algorithms directly using graphics hardware. Well-suited algorithms that leverage all the underlying computational horsepower often achieve tremendous speedups. Truly, the GPU is the first widely deployed commodity desktop parallel computer.

A Survey of General-Purpose Computation on Graphics Hardware

The rapid increase in the performance of graphics hardware, coupled with recent improvements in its programmability, has made graphics hardware a compelling platform for computationally demanding tasks in a wide variety of application domains. In this report, we describe, summarize, and analyze the latest research in mapping general-purpose computation to graphics hardware.

We begin with the technical motivations that underlie general-purpose computation on graphics processors