Artificial Intelligence Computing Leadership from NVIDIA
Publications
Our publications provide insight into some of our leading-edge research.
17 results found
High Performance Computing
Programming Languages, Systems and Tools
2025
Composing Distributed Computations Through Task and Kernel Fusion
Rohan Yadav, Shiv Sundrum, Wonchan Lee, Michael Garland, Michael Bauer, Alex Aiken, Fredrik Kjolstad
Automatic Tracing in Task-Based Runtime Systems
Rohan Yadav, Michael Bauer, David Broman, Michael Garland, Alex Aiken, Fredrik Kjolstad
2023
Legate Sparse: Distributed Sparse Computing in Python
Rohan Yadav, Wonchan Lee, Melih Elibol, Taylor Patti, Manolis Papadakis, Michael Garland, Alex Aiken, Fredrik Kjolstad, Michael Bauer
Visibility Algorithms for Dynamic Dependence Analysis and Distributed Coherence
Michael Bauer, Elliott Slaughter, Sean Treichler, Wonchan Lee, Michael Garland, Alex Aiken
Parsimony: Enabling SIMD/Vector Programming in Standard Compiler Flows
Vijay Kandiah, Daniel Lustig, Oreste Villa, David Nellans, Nikos Hardavellas
2021
Efficient Multi-GPU Shared Memory via Automatic Optimization of Fine-Grained Transfers
Harini Muthukrishnan, David Nellans, Daniel Lustig, Jeffrey Fessler, Thomas Wenisch
Scaling Implicit Parallelism via Dynamic Control Replication
Michael Bauer, Wonchan Lee, Elliott Slaughter, Zhihao Jia, Mario Di Renzo, Manolis Papadakis, Galen Shipman, Patrick McCormick, Michael Garland, Alex Aiken
2020
Locality-Centric Data and Threadblock Management for Massive GPUs
Mahmoud Khairy, Vadim Nikiforov, David Nellans, Timothy G. Rogers
2019
Task Bench: A Parameterized Benchmark for Evaluating Parallel Runtime Performance
Elliott Slaughter, Wei Wu, Yuankun Fu, Legend Brandenburg, Nicolai Garcia, Wilhem Kautz, Emily Marx, Kaleb S. Morris, Qinglei Cao, George Bosilca, Seema Mirchandaney, Wonchan Lee, Sean Treichler, Patrick McCormick, Alex Aiken
2018
Dynamic Tracing: Memoization of Task Graphs for Dynamic Task-based Runtimes
Wonchan Lee, Elliott Slaughter, Michael Bauer, Sean Treichler, Todd Warszawski, Michael Garland, Alex Aiken
Isometry: A Path-Based Distributed Data Transfer System
Zhihao Jia, Sean Treichler, Galen Shipman, Patrick McCormick, Alex Aiken
Scalable Collectives for Distributed Asynchronous Many-Task Runtimes
Matthew Whitlock, Hemanth Kolla, Sean Treichler, Philippe Pebay, Janine C. Bennett
BabelFlow: An Embedded Domain Specific Language for Parallel Analysis and Visualization
Steve Petruzza, Sean Treichler, Valerio Pascucci, Peer-Timo Bremer
2017
Integrating External Resources with a Task-Based Programming Model
Zhihao Jia, Sean Treichler, Galen Shipman, Michael Bauer, Noah Watkins, Carlos Maltzahn, Patrick McCormick, Alex Aiken
A Novel Shard-Based Approach for Asynchronous Many-Task Models for In Situ Analysis
Philippe P. Pébaÿ, Giulio Borghesi, Hemanth Kolla, Janine C. Bennett, Sean Treichler
Control Replication: Compiling Implicit Parallelism to Efficient SPMD with Logical Regions
Elliott Slaughter, Wonchan Lee, Sean Treichler, Wen Zhang, Michael Bauer, Galen Shipman, Patrick McCormick, Alex Aiken
Relaxations for High-Performance Message Passing on Massively Parallel SIMT Processors
Benjamin Klenk, Holger Fröning, Hans Eberle, Larry Dennison
Best Paper Award