Publications
Our publications provide insight into some of our leading-edge research.
21 results found
Research areas: Computer Architecture; Programming Languages, Systems and Tools
2023
cuCatch: A Debugging Tool for Efficiently Catching Memory Safety Violations in CUDA Applications
Mohamed Tarek Ibn Ziad, Sana Damani, Aamer Jaleel, Stephen W. Keckler, Mark Stephenson
PLDI
Parsimony: Enabling SIMD/Vector Programming in Standard Compiler Flows
Vijay Kandiah, Daniel Lustig, Oreste Villa, David Nellans, Nikos Hardavellas
2022
Demystifying Map Space Exploration for NPUs
Sheng-Chun Kao, Angshuman Parashar, Po-An Tsai, Tushar Krishna
Marvel: A Data-Centric Approach for Mapping Deep Learning Operators on Spatial Accelerators
Prasanth Chatarasi, Hyoukjun Kwon, Angshuman Parashar, Michael Pellauer, Tushar Krishna, Vivek Sarkar
2021
Union: A Unified HW-SW Co-Design Ecosystem in MLIR for Evaluating Tensor Operations on Spatial Accelerators
Geonhwa Jeong, Gokcen Kestor, Prasanth Chatarasi, Angshuman Parashar, Po-An Tsai, Sivasankaran Rajamanickam, Roberto Gioiosa, Tushar Krishna
Efficient Multi-GPU Shared Memory via Automatic Optimization of Fine-Grained Transfers
Harini Muthukrishnan, David Nellans, Daniel Lustig, Jeffrey Fessler, Thomas Wenisch
PGZ: Automatic Zero-Value Code Specialization
Mark Stephenson, Ram Rangan
Hardware Abstractions for Targeting EDDO Architectures with the Polyhedral Model
Angshuman Parashar, Prasanth Chatarasi, Po-An Tsai
2020
Locality-Centric Data and Threadblock Management for Massive GPUs
Mahmoud Khairy, Vadim Nikiforov, David Nellans, Timothy G. Rogers
There’s Plenty of Room at the Top: What Will Drive Computer Performance after Moore’s Law?
Charles E. Leiserson, Neil C. Thompson, Joel Emer, Bradley C. Kuszmaul, Butler W. Lampson, Daniel Sanchez, Tao B. Schardl
Speculative Reconvergence for Improved SIMT Efficiency
Sana Damani, Daniel Johnson, Mark Stephenson, Eddie Yan, Olivier Giroux, Michael McKeown, Steve Keckler
2019
NVBit: A Dynamic Binary Instrumentation Framework for NVIDIA GPUs
Oreste Villa, Mark Stephenson, David Nellans, Steve Keckler
Timeloop: A Systematic Approach to DNN Accelerator Evaluation
Angshuman Parashar, Priyanka Raina, Yakun Sophia Shao, Yu-Hsin Chen, Victor A. Ying, Anurag Mukkara, Rangharajan Venkatesan, Brucek Khailany, Steve Keckler, Joel Emer
2018
Optimizing Software-Directed Instruction Replication for GPU Error Detection
Abdulrahman Mahmoud, Siva Hari, Michael B. Sullivan, Timothy Tsai, Steve Keckler
2017
Automated Synthesis of Comprehensive Memory Model Litmus Test Suites
Daniel Lustig, Andrew Wright, Alexandros Papakonstantinou, Olivier Giroux
TriCheck: Memory Model Verification at the Trisection of Software, Hardware, and ISA
Caroline Trippel, Yatin A. Manerkar, Daniel Lustig, Michael Pellauer, Margaret Martonosi
IEEE Micro Top Picks in Computer Architecture
2015
MemcachedGPU: Scaling-up Scale-out Key-value Stores
Tayler Hetherington, Mike O'Connor, Tor Aamodt
Flexible Software Profiling of GPU Architectures
Mark Stephenson, Siva Hari, Yunsup Lee, Eiman Ebrahimi, Daniel Johnson, David Nellans, Mike O'Connor, Steve Keckler
2014
Exploring the Design Space of SPMD Divergence Management on Data-Parallel Architectures
Yunsup Lee, Vinod Grover, Ronny Krashinsky, Mark Stephenson, Steve Keckler, Krste Asanovic
Scaling Irregular Applications through Data Aggregation and Software Multithreading
Alessandro Morari, Antonino Tumeo, Daniel Chavarria-Miranda, Oreste Villa, Mateo Valero
2013
Convergence and Scalarization for Data-Parallel Architectures
Yunsup Lee, Ronny Krashinsky, Vinod Grover, Steve Keckler, Krste Asanovic