Publications
Our publications provide insight into some of our leading-edge research.
15 results found
Filters applied: High Performance Computing; 2020; 2016
2020
Accelerating Reinforcement Learning through GPU Atari Emulation
Iuri Frosio, Steven Dalton
Locality-Centric Data and Threadblock Management for Massive GPUs
Mahmoud Khairy, Vadim Nikiforov, David Nellans, Timothy G. Rogers
EMOGI: Efficient Memory-access for Out-of-memory Graph-traversal In GPUs
Seung Won Min, Vikram Sharma Mailthody, Zaid Qureshi, Jinjun Xiong, Eiman Ebrahimi, Wen-mei Hwu
Buddy Compression: Enabling Larger Memory for Deep Learning and HPC Workloads on GPUs
Esha Choukse, Michael B. Sullivan, Mike O'Connor, Mattan Erez, Jeff Pool, David Nellans, Steve Keckler
An In-Network Architecture for Accelerating Shared-Memory Multiprocessor Collectives
Benjamin Klenk, Ted Jiang, Greg Thorson, Larry Dennison
NWChem: Past, Present, and Future
Edoardo Aprà, Many others, Oreste Villa, Many others
2016
Tensor Contractions with Extended BLAS Kernels on CPU and GPU
Yang Shi, U. N. Niranjan, Animashree Anandkumar, Cris Cecka
vDNN: Virtualized Deep Neural Networks for Scalable, Memory-Efficient Neural Network Design
Minsoo Rhu, Natalia Gimelshein, Jason Clemons, Arslan Zulfiqar, Steve Keckler
Approxilyzer: Towards A Systematic Framework for Instruction-Level Approximate Computing and its Application to Hardware Resiliency
Radha Venkatagiri, Abdulrahman Mahmoud, Siva Hari, Sarita Adve
All-Inclusive ECC: Thorough End-to-End Protection for Reliable Computer Memory
Jungrae Kim, Michael B. Sullivan, Sangkug Lym, Mattan Erez
S-Step and Communication-Avoiding Iterative Methods
Maxim Naumov
Selective GPU Caches to Eliminate CPU-GPU HW Cache Coherence
Neha Agarwal, David Nellans, Eiman Ebrahimi, Thomas F. Wenisch, John Danskin, Steve Keckler
Towards High Performance Paged Memory for GPUs
Tianhao Zheng, David Nellans, Arslan Zulfiqar, Mark Stephenson, Steve Keckler
A Case for Toggle-Aware Compression for GPU Systems
Gennady Pekhimenko, Evgeny Bolotin, Nandita Vijaykumar, Onur Mutlu, Todd C. Mowry, Steve Keckler
Parallel Spectral Graph Partitioning
Maxim Naumov, Timothy Moon