Artificial Intelligence Computing Leadership from NVIDIA
Publications
Our publications provide insight into some of our leading-edge research.
110 results found
High Performance Computing
2020
Locality-Centric Data and Threadblock Management for Massive GPUs
Mahmoud Khairy, Vadim Nikiforov, David Nellans, Timothy G. Rogers
EMOGI: Efficient Memory-access for Out-of-memory Graph-traversal In GPUs
Seung Won Min, Vikram Sharma Mailthody, Zaid Qureshi, Jinjun Xiong, Eiman Ebrahimi, Wen-mei Hwu
Buddy Compression: Enabling Larger Memory for Deep Learning and HPC Workloads on GPUs
Esha Choukse, Michael B. Sullivan, Mike O'Connor, Mattan Erez, Jeff Pool, David Nellans, Steve Keckler
An In-Network Architecture for Accelerating Shared-Memory Multiprocessor Collectives
Benjamin Klenk, Ted Jiang, Greg Thorson, Larry Dennison
NWChem: Past, Present, and Future
Edoardo Aprà, Many others, Oreste Villa, Many others
2019
Near-Memory Data Transformation for Efficient Sparse Matrix Multi-Vector Multiplication
Daichi Fujiki, Niladrish Chatterjee, Donghyuk Lee, Mike O'Connor
Highly-scalable, Physics-informed GANs for Learning Solutions of Stochastic PDEs
Liu Yang, Sean Treichler, Thorsten Kurth, Keno Fischer, David Barajas-Solano, Josh Romero, Valentin Churavy, Alexandre Tartakovsky, Michael Houston, Prabhat, George Karniadakis
Exascale Deep Learning for Scientific Inverse Problems
Nouamane Laanait, Joshua Romero, Junqi Yin, M. Todd Young, Sean Treichler, Vitalii Starchenko, Albina Borisevich, Alex Sergeev, Michael Matheson
Task Bench: A Parameterized Benchmark for Evaluating Parallel Runtime Performance
Elliott Slaughter, Wei Wu, Yuankun Fu, Legend Brandenburg, Nicolai Garcia, Wilhem Kautz, Emily Marx, Kaleb S. Morris, Qinglei Cao, George Bosilca, Seema Mirchandaney, Wonchan Lee, Sean Treichler, Patrick McCormick, Alex Aiken
Optimizing Multi-GPU Parallelization Strategies for Deep Learning Training
Saptadeep Pal, Eiman Ebrahimi, Arslan Zulfiqar, Yaosheng Fu, Victor Zhang, Szymon Migacz, David Nellans, Puneet Gupta
GPU-Accelerated Atari Emulation for Reinforcement Learning
Steven Dalton, Iuri Frosio, Michael Garland
GPU Snapshot: Checkpoint Offloading for GPU-Dense Systems
Kyushick Lee, Michael B. Sullivan, Siva Hari, Timothy Tsai, Steve Keckler, Mattan Erez
On the Trend of Resilience for GPU-Dense Systems
Kyushick Lee, Michael B. Sullivan, Siva Hari, Timothy Tsai, Steve Keckler, Mattan Erez
Best of SELSE (Workshop on Silicon Errors in Logic - System Effects)
NVGaze: An Anatomically-Informed Dataset for Low-Latency, Near-Eye Gaze Estimation
Joohwan Kim, Michael Stengel, Alexander Majercik, Shalini De Mello, David Dunn, Samuli Laine, Morgan McGuire, David Luebke
Buddy Compression: Enabling Larger Memory for Deep Learning and HPC Workloads on GPUs
Esha Choukse, Michael B. Sullivan, Mike O'Connor, Mattan Erez, Jeff Pool, David Nellans, Stephen W. Keckler
DeLTA: GPU Performance Model for Deep Learning Applications with In-depth Memory System Traffic Analysis
Sangkug Lym, Donghyuk Lee, Mike O'Connor, Niladrish Chatterjee, Mattan Erez
A Fast and Robust Method for Avoiding Self-Intersection
Carsten Wächter, Nikolaus Binder
Massively Parallel Path Space Filtering
Nikolaus Binder, Sascha Fricke, Alex Keller
Metaoptimization on a Distributed System for Deep Reinforcement Learning
Greg Heinrich, Iuri Frosio
Massively Parallel Construction of Radix Tree Forests for the Efficient Sampling of Discrete Probability Distributions
Nikolaus Binder, Alex Keller
2018
Dynamic Tracing: Memoization of Task Graphs for Dynamic Task-based Runtimes
Wonchan Lee, Elliott Slaughter, Michael Bauer, Sean Treichler, Todd Warszawski, Michael Garland, Alex Aiken
Exascale Deep Learning for Climate Analytics
Thorsten Kurth, Sean Treichler, Joshua Romero, Mayur Mudigonda, Nathan Luehr, Everett Phillips, Ankur Mahesh, Michael Matheson, Jack Deslippe, Massimiliano Fatica, Prabhat, Michael Houston
Evaluating and Accelerating High-Fidelity Error Injection for HPC
Chun-Kai Chang, Sangkug Lym, Nicholas Kelly, Michael B. Sullivan, Mattan Erez
Exploiting Idle Resources in a High-Radix Switch for Supplemental Storage
Matthias Blumrich, Ted Jiang, Larry Dennison
Fast, High Precision Ray/Fiber Intersection using Tight, Disjoint Bounding Volumes
Nikolaus Binder, Alex Keller
Massively Parallel Stackless Ray Tracing of Catmull-Clark Subdivision Surfaces
Nikolaus Binder, Alex Keller
Exascale Deep Learning for Climate Analytics
Thorsten Kurth, Sean Treichler, Joshua Romero, Mayur Mudigonda, Nathan Luehr, Everett Phillips, Ankur Mahesh, Michael Matheson, Jack Deslippe, Massimiliano Fatica, Prabhat, Michael Houston
CRUM: Checkpoint-Restart Support for CUDA's Unified Memory
Rohan Garg, Apoorve Mohan, Michael B. Sullivan, Gene Cooperman
Phantom Ray-Hair Intersector
Alexander Reshetov, David Luebke
Hamartia: A Fast and Accurate Error Injection Framework
Chun-Kai Chang, Sangkug Lym, Nicholas Kelly, Michael B. Sullivan, Mattan Erez
Isometry: A Path-Based Distributed Data Transfer System
Zhihao Jia, Sean Treichler, Galen Shipman, Patrick McCormick, Alex Aiken
Structurally Sparsified Backward Propagation for Faster Long Short-Term Memory Training
Maohua Zhu, Jason Clemons, Jeff Pool, Minsoo Rhu, Steve Keckler, Yuan Xie