Publications
Our publications provide insight into some of our leading-edge research.
101 results found
High Performance Computing
2017
Fine-Grained DRAM: Energy-Efficient DRAM for Extreme Bandwidth Systems
Mike O'Connor, Niladrish Chatterjee, Donghyuk Lee, John Wilson, Aditya Agrawal, Steve Keckler, William Dally
Feedforward and Recurrent Neural Networks Backward Propagation and Hessian in Matrix Form
Maxim Naumov
Exploiting Budan-Fourier and Vincent’s Theorems for Ray Tracing 3D Bézier Curves
Alexander Reshetov
Parallel Modularity Clustering
Alexandre Fender, Nahid Emad, Serge Petiton, Maxim Naumov
Relaxations for High-Performance Message Passing on Massively Parallel SIMT Processors
Benjamin Klenk, Holger Fröning, Hans Eberle, Larry Dennison
Best Paper Award
The Iray Light Transport Simulation and Rendering System
Alex Keller, Carsten Wächter, Matthias Raab, Daniel Seibert, Dietger van Antwerpen, Johann Korndörfer, Lutz Kettner
SASSIFI: An Architecture-level Fault Injection Tool for GPU Application Resilience Evaluation
Siva Hari, Timothy Tsai, Mark Stephenson, Steve Keckler, Joel Emer
Parallel Depth-First Search for Directed Acyclic Graphs
Maxim Naumov, Alysson Vrielink, Michael Garland
2016
Tensor Contractions with Extended BLAS Kernels on CPU and GPU
Yang Shi, U. N. Niranjan, Animashree Anandkumar, Cris Cecka
vDNN: Virtualized Deep Neural Networks for Scalable, Memory-Efficient Neural Network Design
Minsoo Rhu, Natalia Gimelshein, Jason Clemons, Arslan Zulfiqar, Steve Keckler
Approxilyzer: Towards A Systematic Framework for Instruction-Level Approximate Computing and its Application to Hardware Resiliency
Radha Venkatagiri, Abdulrahman Mahmoud, Siva Hari, Sarita Adve
All-Inclusive ECC: Thorough End-to-End Protection for Reliable Computer Memory
Jungrae Kim, Michael B. Sullivan, Sangkug Lym, Mattan Erez
S-Step and Communication-Avoiding Iterative Methods
Maxim Naumov
Selective GPU Caches to Eliminate CPU-GPU HW Cache Coherence
Neha Agarwal, David Nellans, Eiman Ebrahimi, Thomas F. Wenisch, John Danskin, Steve Keckler
Towards High Performance Paged Memory for GPUs
Tianhao Zheng, David Nellans, Arslan Zulfiqar, Mark Stephenson, Steve Keckler
A Case for Toggle-Aware Compression for GPU Systems
Gennady Pekhimenko, Evgeny Bolotin, Nandita Vijaykumar, Onur Mutlu, Todd C. Mowry, Steve Keckler
Parallel Spectral Graph Partitioning
Maxim Naumov, Timothy Moon
2015
Network Endpoint Congestion Control for Fine-Grained Communication
Ted Jiang, Larry Dennison, William Dally
The Light Field Stereoscope
Fu-Chung Huang, David Luebke, Gordon Wetzstein
Parallel Graph Coloring with Applications to the Incomplete-LU Factorization on the GPU
Maxim Naumov, Patrice Castonguay, Jonathan Cohen
In-Memory Graph Databases for Web-Scale Data
Vito Giovanni Castellana, Alessandro Morari, Jesse Weaver, Antonino Tumeo, David Haglin, Oreste Villa, John Feo
2014
Scaling the Power Wall: A Path to Exascale
Oreste Villa, Daniel Johnson, Mike O'Connor, Evgeny Bolotin, David Nellans, Justin Luitjens, Nikolai Sakharnykh, Peng Wang, Paulius Micikevicius, Anthony Scudiero, Steve Keckler, William Dally
2012
Preconditioned Block-Iterative Methods on GPUs
Maxim Naumov
Efficient Parallel Merge Sort for Fixed and Variable Length Keys
Andrew Davidson, David Tarjan, Michael Garland, John Owens
Incomplete-LU and Cholesky Factorization in the Preconditioned Iterative Methods on the GPU
Maxim Naumov
Scalable GPU Graph Traversal
Duane Merrill, Michael Garland, Andrew Grimshaw
2011
Allocation-oriented Algorithm Design with Application to GPU Computing, Ph.D. Dissertation
Duane Merrill
Thrust: A Productivity-Oriented Library for CUDA
Nathan Bell, Jared Hoberock
High Performance and Scalable GPU Graph Traversal
Duane Merrill, Michael Garland, Andrew Grimshaw
Parallel Solution of Sparse Triangular Linear Systems in the Preconditioned Iterative Methods on the GPU
Maxim Naumov
2010
Sparse Matrix-Vector Multiplication on Multicore and Accelerators
Sam Williams, Nathan Bell, Jee Whan Choi, Michael Garland, Leonid Oliker, Richard Vuduc
Scalable Fluid Simulation using Anisotropic Turbulence Particles
Tobias Pfaff, Nils Thuerey, Jonathan Cohen, Sarah Tariq, Markus Gross