Publications
Our publications provide insight into some of our leading-edge research.
24 results found
High Performance Computing
2021
GPS: A Global Publish-Subscribe Model for Multi-GPU Memory Management
Harini Muthukrishnan, Daniel Lustig, David Nellans, Thomas Wenisch
Best Paper nominee
IEEE Micro Top Picks in Computer Architecture (Honorable Mention)
EMOGI: Efficient Memory-access for Out-of-memory Graph-traversal in GPUs
Seung Won Min, Vikram Sharma Mailthody, Zaid Qureshi, Jinjun Xiong, Eiman Ebrahimi, Wen-mei Hwu
Large Graph Convolutional Network Training with GPU-Oriented Data Communication Architecture
Seung Won Min, Kun Wu, Sitao Huang, Mert Hidayetoglu, Jinjun Xiong, Eiman Ebrahimi, Deming Chen, Wen-mei Hwu
Suraksha: A Quantitative AV Safety Evaluation Framework to Analyze Safety Implications of Perception Design Choices
Hengyu Zhao, Siva Hari, Timothy Tsai, Michael B. Sullivan, Steve Keckler, Jishen Zhao
Efficient Multi-GPU Shared Memory via Automatic Optimization of Fine-Grained Transfers
Harini Muthukrishnan, David Nellans, Daniel Lustig, Jeffrey Fessler, Thomas Wenisch
Demystifying GPU Reliability: Comparing and Combining Beam Experiments, Fault Simulation, and Profiling
Fernando Fernandes dos Santos, Siva Hari, Pedro Martins Basso, Luigi Carro, Paolo Rech
Learning Sparse Matrix Row Permutations for Efficient SpMM on GPU Architectures
Atefeh Mehrabi, Donghyuk Lee, Niladrish Chatterjee, Daniel J. Sorin, Benjamin C. Lee, Mike O'Connor
Large Graph Convolutional Network Training with GPU-Oriented Data Communication Architecture
Seung Won Min, Kun Wu, Sitao Huang, Mert Hidayetoglu, Jinjun Xiong, Eiman Ebrahimi, Deming Chen, Wen-mei Hwu
Scaling Implicit Parallelism via Dynamic Control Replication
Michael Bauer, Wonchan Lee, Elliott Slaughter, Zhihao Jia, Mario Di Renzo, Manolis Papadakis, Galen Shipman, Patrick McCormick, Michael Garland, Alex Aiken
2017
Integrating External Resources with a Task-Based Programming Model
Zhihao Jia, Sean Treichler, Galen Shipman, Michael Bauer, Noah Watkins, Carlos Maltzahn, Patrick McCormick, Alex Aiken
AdaBatch: Adaptive Batch Sizes for Training Deep Neural Networks
Aditya Devarakonda, Maxim Naumov, Michael Garland
Near-eye Light Field Holographic Rendering with Spherical Waves for Wide Field of View Interactive 3D Computer Graphics
Liang Shi, Fu-Chung Huang, Ward Lopes, Wojciech Matusik, David Luebke
A Novel Shard-Based Approach for Asynchronous Many-Task Models for In Situ Analysis
Philippe P. Pébaÿ, Giulio Borghesi, Hemanth Kolla, Janine C. Bennett, Sean Treichler
Control Replication: Compiling Implicit Parallelism to Efficient SPMD with Logical Regions
Elliott Slaughter, Wonchan Lee, Sean Treichler, Wen Zhang, Michael Bauer, Galen Shipman, Patrick McCormick, Alex Aiken
Low Communication FMM-Accelerated FFT on GPUs
Cris Cecka
Parallel Jaccard and Related Graph Clustering Techniques
Alexandre Fender, Nahid Emad, Serge Petiton, Joe Eaton, Maxim Naumov
Fine-Grained DRAM: Energy-Efficient DRAM for Extreme Bandwidth Systems
Mike O'Connor, Niladrish Chatterjee, Donghyuk Lee, John Wilson, Aditya Agrawal, Steve Keckler, William Dally
Feedforward and Recurrent Neural Networks Backward Propagation and Hessian in Matrix Form
Maxim Naumov
Exploiting Budan-Fourier and Vincent’s Theorems for Ray Tracing 3D Bézier Curves
Alexander Reshetov
Parallel Modularity Clustering
Alexandre Fender, Nahid Emad, Serge Petiton, Maxim Naumov
Relaxations for High-Performance Message Passing on Massively Parallel SIMT Processors
Benjamin Klenk, Holger Fröning, Hans Eberle, Larry Dennison
Best Paper Award
The Iray Light Transport Simulation and Rendering System
Alex Keller, Carsten Wächter, Matthias Raab, Daniel Seibert, Dietger van Antwerpen, Johann Korndörfer, Lutz Kettner
SASSIFI: An Architecture-level Fault Injection Tool for GPU Application Resilience Evaluation
Siva Hari, Timothy Tsai, Mark Stephenson, Steve Keckler, Joel Emer
Parallel Depth-First Search for Directed Acyclic Graphs
Maxim Naumov, Alysson Vrielink, Michael Garland