Publications
Our publications provide insight into some of our leading-edge research.
16 results found
Filtered by: High Performance Computing; 2025, 2016
2025
FourCastNet 3: A geometric approach to probabilistic machine-learning weather forecasting at scale
Boris Bonev, Thorsten Kurth, Ankur Mahesh, Mauro Bisson, Jean Kossaifi, Karthik Kashinath, Anima Anandkumar, William D. Collins, Mike Pritchard, Alex Keller
Task-Based Tensor Computations on Modern GPUs
Rohan Yadav, Michael Garland, Alex Aiken, Michael Bauer
PLDI
Beyond the Buzz: A Pragmatic Take on Inference Disaggregation
Tiyasa Mitra, Ritika Borkar, Nidhi Bhatia, Ramon Matas, Shivam Raj, Dheevatsa Mudigere, Ritchie Zhao, Maximilian Golub, Arpan Dutta, Sailaja Madduri, Dharmesh Jani, Brian Pharris, Bita Darvish Rouhani
SLIM: One-shot Quantization and Sparsity with Low-rank Approximation for LLM Weight Compression
Mohammad Mozaffari, Amir Yazdanbakhsh, Maryam Mehri Dehnavi
ICML
Adaptive Algebraic Reuse of Reordering in Cholesky Factorizations with Dynamic Sparsity Patterns
Behrooz Zarebavani, Danny Kaufman, David Levin, Maryam Mehri Dehnavi
SIGGRAPH
Composing Distributed Computations Through Task and Kernel Fusion
Rohan Yadav, Shiv Sundrum, Wonchan Lee, Michael Garland, Michael Bauer, Alex Aiken, Fredrik Kjolstad
Automatic Tracing in Task-Based Runtime Systems
Rohan Yadav, Michael Bauer, David Broman, Michael Garland, Alex Aiken, Fredrik Kjolstad
2016
Tensor Contractions with Extended BLAS Kernels on CPU and GPU
Yang Shi, U. N. Niranjan, Animashree Anandkumar, Cris Cecka
vDNN: Virtualized Deep Neural Networks for Scalable, Memory-Efficient Neural Network Design
Minsoo Rhu, Natalia Gimelshein, Jason Clemons, Arslan Zulfiqar, Steve Keckler
Approxilyzer: Towards A Systematic Framework for Instruction-Level Approximate Computing and its Application to Hardware Resiliency
Radha Venkatagiri, Abdulrahman Mahmoud, Siva Hari, Sarita Adve
All-Inclusive ECC: Thorough End-to-End Protection for Reliable Computer Memory
Jungrae Kim, Michael B. Sullivan, Sangkug Lym, Mattan Erez
S-Step and Communication-Avoiding Iterative Methods
Maxim Naumov
Selective GPU Caches to Eliminate CPU-GPU HW Cache Coherence
Neha Agarwal, David Nellans, Eiman Ebrahimi, Thomas F. Wenisch, John Danskin, Steve Keckler
Towards High Performance Paged Memory for GPUs
Tianhao Zheng, David Nellans, Arslan Zulfiqar, Mark Stephenson, Steve Keckler
A Case for Toggle-Aware Compression for GPU Systems
Gennady Pekhimenko, Evgeny Bolotin, Nandita Vijaykumar, Onur Mutlu, Todd C. Mowry, Steve Keckler
Parallel Spectral Graph Partitioning
Maxim Naumov, Timothy Moon