Publications
Our publications provide insight into some of our leading-edge research.
21 results found
High Performance Computing
2025
FourCastNet 3: A geometric approach to probabilistic machine-learning weather forecasting at scale
Boris Bonev, Thorsten Kurth, Ankur Mahesh, Mauro Bisson, Jean Kossaifi, Karthik Kashinath, Anima Anandkumar, William D. Collins, Mike Pritchard, Alex Keller
Task-Based Tensor Computations on Modern GPUs
Rohan Yadav, Michael Garland, Alex Aiken, Michael Bauer
PLDI
Beyond the Buzz: A Pragmatic Take on Inference Disaggregation
Tiyasa Mitra, Ritika Borkar, Nidhi Bhatia, Ramon Matas, Shivam Raj, Dheevatsa Mudigere, Ritchie Zhao, Maximilian Golub, Arpan Dutta, Sailaja Madduri, Dharmesh Jani, Brian Pharris, Bita Darvish Rouhani
SLIM: One-shot Quantization and Sparsity with Low-rank Approximation for LLM Weight Compression
Mohammad Mozaffari, Amir Yazdanbakhsh, Maryam Mehri Dehnavi
ICML
Adaptive Algebraic Reuse of Reordering in Cholesky Factorizations with Dynamic Sparsity Patterns
Behrooz Zarebavani, Danny Kaufman, David Levin, Maryam Mehri Dehnavi
SIGGRAPH
Composing Distributed Computations Through Task and Kernel Fusion
Rohan Yadav, Shiv Sundrum, Wonchan Lee, Michael Garland, Michael Bauer, Alex Aiken, Fredrik Kjolstad
Automatic Tracing in Task-Based Runtime Systems
Rohan Yadav, Michael Bauer, David Broman, Michael Garland, Alex Aiken, Fredrik Kjolstad
2018
Dynamic Tracing: Memoization of Task Graphs for Dynamic Task-based Runtimes
Wonchan Lee, Elliott Slaughter, Michael Bauer, Sean Treichler, Todd Warszawski, Michael Garland, Alex Aiken
Exascale Deep Learning for Climate Analytics
Thorsten Kurth, Sean Treichler, Joshua Romero, Mayur Mudigonda, Nathan Luehr, Everett Phillips, Ankur Mahesh, Michael Matheson, Jack Deslippe, Massimiliano Fatica, Prabhat, Michael Houston
Evaluating and Accelerating High-Fidelity Error Injection for HPC
Chun-Kai Chang, Sangkug Lym, Nicholas Kelly, Michael B. Sullivan, Mattan Erez
Exploiting Idle Resources in a High-Radix Switch for Supplemental Storage
Matthias Blumrich, Ted Jiang, Larry Dennison
Fast, High Precision Ray/Fiber Intersection using Tight, Disjoint Bounding Volumes
Nikolaus Binder, Alex Keller
Massively Parallel Stackless Ray Tracing of Catmull-Clark Subdivision Surfaces
Nikolaus Binder, Alex Keller
Exascale Deep Learning for Climate Analytics
Thorsten Kurth, Sean Treichler, Joshua Romero, Mayur Mudigonda, Nathan Luehr, Everett Phillips, Ankur Mahesh, Michael Matheson, Jack Deslippe, Massimiliano Fatica, Prabhat, Michael Houston
CRUM: Checkpoint-Restart Support for CUDA's Unified Memory
Rohan Garg, Apoorve Mohan, Michael B. Sullivan, Gene Cooperman
Phantom Ray-Hair Intersector
Alexander Reshetov, David Luebke
Hamartia: A Fast and Accurate Error Injection Framework
Chun-Kai Chang, Sangkug Lym, Nicholas Kelly, Michael B. Sullivan, Mattan Erez
Isometry: A Path-Based Distributed Data Transfer System
Zhihao Jia, Sean Treichler, Galen Shipman, Patrick McCormick, Alex Aiken
Structurally Sparsified Backward Propagation for Faster Long Short-Term Memory Training
Maohua Zhu, Jason Clemons, Jeff Pool, Minsoo Rhu, Steve Keckler, Yuan Xie
Scalable Collectives for Distributed Asynchronous Many-Task Runtimes
Matthew Whitlock, Hemanth Kolla, Sean Treichler, Philippe Pebay, Janine C. Bennett
BabelFlow: An Embedded Domain Specific Language for Parallel Analysis and Visualization
Steve Petruzza, Sean Treichler, Valerio Pascucci, Peer-Timo Bremer