Artificial Intelligence Computing Leadership from NVIDIA
Publications
Our publications provide insight into some of our leading-edge research.
39 results found (Research Area: Computer Architecture; Publication Year: 2023, 2021)
2023
Unity ECC: Unified Memory Protection Against Bit and Chip Errors
Dongwhee Kim, Jaeyoon Lee, Wonyeong Jung, Michael B. Sullivan, Jungrae Kim
VaPr: Variable-Precision Tensors to Accelerate Robot Motion Planning
Yu-Shun Hsiao, Siva Hari, Balakumar Sundaralingam, Jason Yik, Thierry Tambe, Charbel Sakr, Steve Keckler, Vijay Janapa Reddi
IROS
Efficient Transformer Inference with Statically Structured Sparse Attention
Steve Dai, Hasan Genc, Rangharajan Venkatesan, Brucek Khailany
Implicit Memory Tagging: No-Overhead Memory Safety Using Alias-Free Tagged ECC
Michael B. Sullivan, Mohamed Tarek Ibn Ziad, Aamer Jaleel, Stephen W. Keckler
cuCatch: A Debugging Tool for Efficiently Catching Memory Safety Violations in CUDA Applications
Mohamed Tarek Ibn Ziad, Sana Damani, Aamer Jaleel, Stephen W. Keckler, Mark Stephenson
PLDI
CuRobo: Parallelized Collision-Free Robot Motion Generation
Balakumar Sundaralingam, Siva Hari, Adam Fishman, Caelan Garrett, Karl Van Wyk, Valts Blukis, Alexander Millane, Helen Oleynikova, Ankur Handa, Fabio Ramos, Nathan Ratliff, Dieter Fox
Parsimony: Enabling SIMD/Vector Programming in Standard Compiler Flows
Vijay Kandiah, Daniel Lustig, Oreste Villa, David Nellans, Nikos Hardavellas
A 95.6-TOPS/W Deep Learning Inference Accelerator With Per-Vector Scaled 4-bit Quantization in 5 nm
Ben Keller, Rangharajan Venkatesan, Steve Dai, Stephen Tell, Brian Zimmer, Charbel Sakr, William Dally, Tom Gray, Brucek Khailany
2021
GPU Domain Specialization via Composable On-Package Architecture
Yaosheng Fu, Evgeny Bolotin, Niladrish Chatterjee, David Nellans, Steve Keckler
Softermax: Hardware/Software Co-Design of an Efficient Softmax for Transformers
Jacob R. Stevens, Rangharajan Venkatesan, Steve Dai, Brucek Khailany, Anand Raghunathan
Evolution of the Graphics Processing Unit (GPU)
William Dally, Steve Keckler, David B. Kirk
Optimizing Selective Protection for CNN Resilience
Abdulrahman Mahmoud, Siva Hari, Christopher W. Fletcher, Sarita V. Adve, Charbel Sakr, Naresh Shanbhag, Pavlo Molchanov, Michael B. Sullivan, Timothy Tsai, Steve Keckler
Suraksha: A Framework to Analyze the Safety Implications of Perception Design Choices in AVs
Hengyu Zhao, Siva Hari, Timothy Tsai, Michael B. Sullivan, Steve Keckler, Jishen Zhao
GPS: A Global Publish-Subscribe Model for Multi-GPU Memory Management
Harini Muthukrishnan, Daniel Lustig, David Nellans, Thomas Wenisch
Best Paper nominee
IEEE Micro Top Picks in Computer Architecture (Honorable Mention)
Characterizing and Mitigating Soft Errors in GPU DRAM
Michael B. Sullivan, Nirmal Saxena, Mike O'Connor, Donghyuk Lee, Paul Racunas, Saurabh Hukerikar, Timothy Tsai, Siva Hari, Steve Keckler
IEEE Micro Top Picks in Computer Architecture
Union: A Unified HW-SW Co-Design Ecosystem in MLIR for Evaluating Tensor Operations on Spatial Accelerators
Geonhwa Jeong, Gokcen Kestor, Prasanth Chatarasi, Angshuman Parashar, Po-An Tsai, Sivasankaran Rajamanickam, Roberto Gioiosa, Tushar Krishna
EMOGI: Efficient Memory-access for Out-of-memory Graph-traversal in GPUs
Seung Won Min, Vikram Sharma Mailthody, Zaid Qureshi, Jinjun Xiong, Eiman Ebrahimi, Wen-mei Hwu
Large Graph Convolutional Network Training with GPU-Oriented Data Communication Architecture
Seung Won Min, Kun Wu, Sitao Huang, Mert Hidayetoglu, Jinjun Xiong, Eiman Ebrahimi, Deming Chen, Wen-mei Hwu
NVBitFI: Dynamic Fault Injection for GPUs
Timothy Tsai, Siva Hari, Michael B. Sullivan, Oreste Villa, Steve Keckler
SpZip: Architectural Support for Effective Data Compression in Irregular Applications
Yifan Yang, Joel Emer, Daniel Sanchez
Efficient Multi-GPU Shared Memory via Automatic Optimization of Fine-Grained Transfers
Harini Muthukrishnan, David Nellans, Daniel Lustig, Jeffrey Fessler, Thomas Wenisch
Simba: scaling deep-learning inference with chiplet-based architecture
Yakun Sophia Shao, Jason Clemons, Rangharajan Venkatesan, Brian Zimmer, Matt Fojtik, Ted Jiang, Ben Keller, Alicia Klinefelter, Nathaniel Pinckney, Priyanka Raina, Stephen Tell, Yanqing Zhang, William Dally, Joel Emer, Tom Gray, Brucek Khailany, Steve Keckler
ACM Research Highlight
Demystifying GPU Reliability: Comparing and Combining Beam Experiments, Fault Simulation, and Profiling
Fernando Fernandes dos Santos, Siva Hari, Pedro Martins Basso, Luigi Carro, Paolo Rech
Sparseloop: An Analytical, Energy-Focused Design Space Exploration Methodology for Sparse Tensor Accelerators
Yannan Nellie Wu, Po-An Tsai, Angshuman Parashar, Vivienne Sze, Joel Emer
GAMMA: Exploiting Gustavson’s Algorithm to Accelerate Sparse Matrix Multiplication
Guowei Zhang, Nithya Attaluri, Joel Emer, Daniel Sanchez
Mind Mappings: Enabling Efficient Algorithm-Accelerator Mapping Space Search
Kartik Hegde, Po-An Tsai, Sitao Huang, Vikas Chandra, Angshuman Parashar, Christopher W. Fletcher
VS-QUANT: Per-Vector Scaled Quantization for Accurate Low-Precision Neural Network Inference
Steve Dai, Rangharajan Venkatesan, Haoxing (Mark) Ren, Brian Zimmer, William Dally, Brucek Khailany
Learning Sparse Matrix Row Permutations for Efficient SpMM on GPU Architectures
Atefeh Mehrabi, Donghyuk Lee, Niladrish Chatterjee, Daniel J. Sorin, Benjamin C. Lee, Mike O'Connor
PGZ: Automatic Zero-Value Code Specialization
Mark Stephenson, Ram Rangan
Making Convolutions Resilient via Algorithm-Based Error Detection Techniques
Siva Hari, Michael B. Sullivan, Timothy Tsai, Steve Keckler