High Performance Computing
Associated Publications
Scaling Implicit Parallelism via Dynamic Control Replication
Buddy Compression: Enabling Larger Memory for Deep Learning and HPC Workloads on GPUs
Near-Memory Data Transformation for Efficient Sparse Matrix Multi-Vector Multiplication
GPU-Accelerated Atari Emulation for Reinforcement Learning
GPU Snapshot: Checkpoint Offloading for GPU-Dense Systems
On the Trend of Resilience for GPU-Dense Systems
NVGaze: An Anatomically-Informed Dataset for Low-Latency, Near-Eye Gaze Estimation
A Fast and Robust Method for Avoiding Self-Intersection
Massively Parallel Path Space Filtering
Metaoptimization on a Distributed System for Deep Reinforcement Learning
Massively Parallel Construction of Radix Tree Forests for the Efficient Sampling of Discrete Probability Distributions
Evaluating and Accelerating High-Fidelity Error Injection for HPC
Exploiting Idle Resources in a High-Radix Switch for Supplemental Storage
Massively Parallel Stackless Ray Tracing of Catmull-Clark Subdivision Surfaces
Fast, High Precision Ray/Fiber Intersection using Tight, Disjoint Bounding Volumes
CRUM: Checkpoint-Restart Support for CUDA's Unified Memory
Phantom Ray-Hair Intersector
Hamartia: A Fast and Accurate Error Injection Framework
AdaBatch: Adaptive Batch Sizes for Training Deep Neural Networks
Near-eye Light Field Holographic Rendering with Spherical Waves for Wide Field of View Interactive 3D Computer Graphics
Parallel Jaccard and Related Graph Clustering Techniques
Low Communication FMM-Accelerated FFT on GPUs
Fine-Grained DRAM: Energy-Efficient DRAM for Extreme Bandwidth Systems
Feedforward and Recurrent Neural Networks Backward Propagation and Hessian in Matrix Form
Exploiting Budan-Fourier and Vincent’s Theorems for Ray Tracing 3D Bézier Curves
Parallel Modularity Clustering
Relaxations for High-Performance Message Passing on Massively Parallel SIMT Processors
The Iray Light Transport Simulation and Rendering System
SASSIFI: An Architecture-level Fault Injection Tool for GPU Application Resilience Evaluation
Parallel Depth-First Search for Directed Acyclic Graphs
Tensor Contractions with Extended BLAS Kernels on CPU and GPU
Approxilyzer: Towards A Systematic Framework for Instruction-Level Approximate Computing and its Application to Hardware Resiliency
S-Step and Communication-Avoiding Iterative Methods
A Case for Toggle-Aware Compression for GPU Systems
Parallel Spectral Graph Partitioning
Network Endpoint Congestion Control for Fine-Grained Communication
The Light Field Stereoscope
Parallel Graph Coloring with Applications to the Incomplete-LU Factorization on the GPU
Scaling the Power Wall: A Path to Exascale
Preconditioned Block-Iterative Methods on GPUs
Incomplete-LU and Cholesky Factorization in the Preconditioned Iterative Methods on the GPU
Efficient Parallel Merge Sort for Fixed and Variable Length Keys
Scalable GPU Graph Traversal
Allocation-oriented Algorithm Design with Application to GPU Computing, Ph.D. Dissertation
Thrust: A Productivity-Oriented Library for CUDA
High Performance and Scalable GPU Graph Traversal
Parallel Solution of Sparse Triangular Linear Systems in the Preconditioned Iterative Methods on the GPU
Sparse Matrix-Vector Multiplication on Multicore and Accelerators
Scalable Fluid Simulation using Anisotropic Turbulence Particles
Interactive Fluid-Particle Simulation using Translating Eulerian Grids
Implementing Sparse Matrix-Vector Multiplication on Throughput-Oriented Processors
A Fast Double Precision CFD Code Using CUDA
Low Viscosity Flow Simulations for Animation
A Survey of General-Purpose Computation on Graphics Hardware
Researchers
Aamer Jaleel
Benjamin Klenk
Charles Loop
Cris Cecka
David Nellans
Donghyuk Lee
Eiman Ebrahimi
Evgeny Bolotin
Hans Eberle
Isaac Gelado
Iuri Frosio
Josef Spjut
Matthias Blumrich
Michael Bauer
Michael Garland
Mike O'Connor
Nikolaus Binder
Niladrish Chatterjee
Oreste Villa
Samuli Laine
Saurav Muralidharan
Sean Treichler
Siva Hari
Steve Keckler
Steven Dalton
Ted Jiang
Timothy Tsai
Wen-mei Hwu
William Dally
Yaosheng Fu
Zander Majercik