Publications
Our publications provide insight into some of our leading-edge research.
31 results found (Computer Architecture, 2021)
GPU Domain Specialization via Composable On-Package Architecture
Yaosheng Fu, Evgeny Bolotin, Niladrish Chatterjee, David Nellans, Steve Keckler
Softermax: Hardware/Software Co-Design of an Efficient Softmax for Transformers
Jacob R. Stevens, Rangharajan Venkatesan, Steve Dai, Brucek Khailany, Anand Raghunathan
Evolution of the Graphics Processing Unit (GPU)
William Dally, Steve Keckler, David B. Kirk
Optimizing Selective Protection for CNN Resilience
Abdulrahman Mahmoud, Siva Hari, Christopher W. Fletcher, Sarita V. Adve, Charbel Sakr, Naresh Shanbhag, Pavlo Molchanov, Michael B. Sullivan, Timothy Tsai, Steve Keckler
Suraksha: A Framework to Analyze the Safety Implications of Perception Design Choices in AVs
Hengyu Zhao, Siva Hari, Timothy Tsai, Michael B. Sullivan, Steve Keckler, Jishen Zhao
GPS: A Global Publish-Subscribe Model for Multi-GPU Memory Management
Harini Muthukrishnan, Daniel Lustig, David Nellans, Thomas Wenisch
Best Paper nominee
IEEE Micro Top Picks in Computer Architecture (Honorable Mention)
Characterizing and Mitigating Soft Errors in GPU DRAM
Michael B. Sullivan, Nirmal Saxena, Mike O'Connor, Donghyuk Lee, Paul Racunas, Saurabh Hukerikar, Timothy Tsai, Siva Hari, Steve Keckler
IEEE Micro Top Picks in Computer Architecture
Union: A Unified HW-SW Co-Design Ecosystem in MLIR for Evaluating Tensor Operations on Spatial Accelerators
Geonhwa Jeong, Gokcen Kestor, Prasanth Chatarasi, Angshuman Parashar, Po-An Tsai, Sivasankaran Rajamanickam, Roberto Gioiosa, Tushar Krishna
EMOGI: Efficient Memory-access for Out-of-memory Graph-traversal in GPUs
Seung Won Min, Vikram Sharma Mailthody, Zaid Qureshi, Jinjun Xiong, Eiman Ebrahimi, Wen-mei Hwu
Large Graph Convolutional Network Training with GPU-Oriented Data Communication Architecture
Seung Won Min, Kun Wu, Sitao Huang, Mert Hidayetoglu, Jinjun Xiong, Eiman Ebrahimi, Deming Chen, Wen-mei Hwu
NVBitFI: Dynamic Fault Injection for GPUs
Timothy Tsai, Siva Hari, Michael B. Sullivan, Oreste Villa, Steve Keckler
SpZip: Architectural Support for Effective Data Compression in Irregular Applications
Yifan Yang, Joel Emer, Daniel Sanchez
Efficient Multi-GPU Shared Memory via Automatic Optimization of Fine-Grained Transfers
Harini Muthukrishnan, David Nellans, Daniel Lustig, Jeffrey Fessler, Thomas Wenisch
Simba: scaling deep-learning inference with chiplet-based architecture
Yakun Sophia Shao, Jason Clemons, Rangharajan Venkatesan, Brian Zimmer, Matt Fojtik, Ted Jiang, Ben Keller, Alicia Klinefelter, Nathaniel Pinckney, Priyanka Raina, Stephen Tell, Yanqing Zhang, William Dally, Joel Emer, Tom Gray, Brucek Khailany, Steve Keckler
ACM Research Highlight
Demystifying GPU Reliability: Comparing and Combining Beam Experiments, Fault Simulation, and Profiling
Fernando Fernandes dos Santos, Siva Hari, Pedro Martins Basso, Luigi Carro, Paolo Rech
Sparseloop: An Analytical, Energy-Focused Design Space Exploration Methodology for Sparse Tensor Accelerators
Yannan Nellie Wu, Po-An Tsai, Angshuman Parashar, Vivienne Sze, Joel Emer
GAMMA: Exploiting Gustavson’s Algorithm to Accelerate Sparse Matrix Multiplication
Guowei Zhang, Nithya Attaluri, Joel Emer, Daniel Sanchez
Mind Mappings: Enabling Efficient Algorithm-Accelerator Mapping Space Search
Kartik Hegde, Po-An Tsai, Sitao Huang, Vikas Chandra, Angshuman Parashar, Christopher W. Fletcher
GPU Domain Specialization via Composable On-Package Architecture
Yaosheng Fu, Evgeny Bolotin, Niladrish Chatterjee, David Nellans, Steve Keckler
VS-QUANT: Per-Vector Scaled Quantization for Accurate Low-Precision Neural Network Inference
Steve Dai, Rangharajan Venkatesan, Haoxing (Mark) Ren, Brian Zimmer, William Dally, Brucek Khailany
Learning Sparse Matrix Row Permutations for Efficient SpMM on GPU Architectures
Atefeh Mehrabi, Donghyuk Lee, Niladrish Chatterjee, Daniel J. Sorin, Benjamin C. Lee, Mike O'Connor
Large Graph Convolutional Network Training with GPU-Oriented Data Communication Architecture
Seung Won Min, Kun Wu, Sitao Huang, Mert Hidayetoglu, Jinjun Xiong, Eiman Ebrahimi, Deming Chen, Wen-mei Hwu
PGZ: Automatic Zero-Value Code Specialization
Mark Stephenson, Ram Rangan
Making Convolutions Resilient via Algorithm-Based Error Detection Techniques
Siva Hari, Michael B. Sullivan, Timothy Tsai, Steve Keckler
Reduced Precision DWC: An Efficient Hardening Strategy for Mixed-Precision Architectures
Fernando F. dos Santos, Marcelo Brandalero, Michael B. Sullivan, Pedro M. Basso, Michael Hubner, Luigi Carro, Paolo Rech
P-OPT: Practical Optimal Cache Replacement for Graph Analytics
Vignesh Balaji, Neal Crago, Aamer Jaleel, Brandon Lucia
Best Paper nominee
Need for Speed: Experiences Building a Trustworthy System-Level GPU Simulator
Oreste Villa, Daniel Lustig, Zi Yan, Evgeny Bolotin, Yaosheng Fu, Niladrish Chatterjee, Ted Jiang, David Nellans
Heterogeneous Dataflow Accelerators for Multi-DNN Workloads
Hyoukjun Kwon, Liangzhen Lai, Michael Pellauer, Tushar Krishna, Yu-Hsin Chen, Vikas Chandra
SNAP: An Efficient Sparse Neural Acceleration Processor for Unstructured Sparse Deep Neural Network Inference
Jie-Fang Zhang, Ching-En Lee, Chester Liu, Yakun Sophia Shao, Steve Keckler, Zhengya Zhang
Hardware Abstractions for Targeting EDDO Architectures with the Polyhedral Model
Angshuman Parashar, Prasanth Chatarasi, Po-An Tsai
Flexion: A Quantitative Metric for Flexibility in DNN Accelerators
Hyoukjun Kwon, Michael Pellauer, Angshuman Parashar, Tushar Krishna