Artificial Intelligence Computing Leadership from NVIDIA
Publications
Our publications provide insight into some of our leading-edge research.
16 results found
High Performance Computing
2025
FourCastNet 3: A geometric approach to probabilistic machine-learning weather forecasting at scale
Boris Bonev, Thorsten Kurth, Ankur Mahesh, Mauro Bisson, Jean Kossaifi, Karthik Kashinath, Anima Anandkumar, William D. Collins, Mike Pritchard, Alex Keller
Task-Based Tensor Computations on Modern GPUs
Rohan Yadav, Michael Garland, Alex Aiken, Michael Bauer
PLDI
Beyond the Buzz: A Pragmatic Take on Inference Disaggregation
Tiyasa Mitra, Ritika Borkar, Nidhi Bhatia, Ramon Matas, Shivam Raj, Dheevatsa Mudigere, Ritchie Zhao, Maximilian Golub, Arpan Dutta, Sailaja Madduri, Dharmesh Jani, Brian Pharris, Bita Darvish Rouhani
SLIM: One-shot Quantization and Sparsity with Low-rank Approximation for LLM Weight Compression
Mohammad Mozaffari, Amir Yazdanbakhsh, Maryam Mehri Dehnavi
ICML
Adaptive Algebraic Reuse of Reordering in Cholesky Factorizations with Dynamic Sparsity Patterns
Behrooz Zarebavani, Danny Kaufman, David Levin, Maryam Mehri Dehnavi
SIGGRAPH
Composing Distributed Computations Through Task and Kernel Fusion
Rohan Yadav, Shiv Sundrum, Wonchan Lee, Michael Garland, Michael Bauer, Alex Aiken, Fredrik Kjolstad
Automatic Tracing in Task-Based Runtime Systems
Rohan Yadav, Michael Bauer, David Broman, Michael Garland, Alex Aiken, Fredrik Kjolstad
2021
GPS: A Global Publish-Subscribe Model for Multi-GPU Memory Management
Harini Muthukrishnan, Daniel Lustig, David Nellans, Thomas Wenisch
Best Paper nominee
IEEE Micro Top Picks in Computer Architecture (Honorable Mention)
EMOGI: Efficient Memory-access for Out-of-memory Graph-traversal in GPUs
Seung Won Min, Vikram Sharma Mailthody, Zaid Qureshi, Jinjun Xiong, Eiman Ebrahimi, Wen-mei Hwu
Large Graph Convolutional Network Training with GPU-Oriented Data Communication Architecture
Seung Won Min, Kun Wu, Sitao Huang, Mert Hidayetoglu, Jinjun Xiong, Eiman Ebrahimi, Deming Chen, Wen-mei Hwu
Suraksha: A Quantitative AV Safety Evaluation Framework to Analyze Safety Implications of Perception Design Choices
Hengyu Zhao, Siva Hari, Timothy Tsai, Michael B. Sullivan, Steve Keckler, Jishen Zhao
Efficient Multi-GPU Shared Memory via Automatic Optimization of Fine-Grained Transfers
Harini Muthukrishnan, David Nellans, Daniel Lustig, Jeffrey Fessler, Thomas Wenisch
Demystifying GPU Reliability: Comparing and Combining Beam Experiments, Fault Simulation, and Profiling
Fernando Fernandes dos Santos, Siva Hari, Pedro Martins Basso, Luigi Carro, Paolo Rech
Learning Sparse Matrix Row Permutations for Efficient SpMM on GPU Architectures
Atefeh Mehrabi, Donghyuk Lee, Niladrish Chatterjee, Daniel J. Sorin, Benjamin C. Lee, Mike O'Connor
Large Graph Convolutional Network Training with GPU-Oriented Data Communication Architecture
Seung Won Min, Kun Wu, Sitao Huang, Mert Hidayetoglu, Jinjun Xiong, Eiman Ebrahimi, Deming Chen, Wen-mei Hwu
Scaling Implicit Parallelism via Dynamic Control Replication
Michael Bauer, Wonchan Lee, Elliott Slaughter, Zhihao Jia, Mario Di Renzo, Manolis Papadakis, Galen Shipman, Patrick McCormick, Michael Garland, Alex Aiken