  Michael Garland  

 



  ![](/sites/default/files/person/Garland_3x4_0.jpg)

  

Michael Garland is the Senior Director of Programming Systems and Applications research at NVIDIA. He completed his Ph.D. at Carnegie Mellon University, and was previously on the faculty of the Department of Computer Science of the University of Illinois at Urbana-Champaign. He joined NVIDIA in 2006 as one of the first members of NVIDIA Research, and has been working to develop effective parallel programming systems ever since. His research goal is to develop tools and techniques that will equip programmers to realize the full potential of modern, massively parallel computing systems.



   Research Area(s)

[Algorithms and Numerical Methods](/index.php/research-area/algorithms)

[High Performance Computing](/index.php/research-area/high-performance-computing)

[Programming Languages, Systems and Tools](/index.php/research-area/programming-languages-systems)

 

 

  

 Main Field of Interest

[Programming Languages, Systems and Tools](/index.php/research-area/programming-languages-systems)

 

  

 Google Scholar

<https://scholar.google.com/citations?user=6oB5ju0AAAAJ>

 

  

 

 

 



 ### Publications

 

### 2025 

[Task-Based Tensor Computations on Modern GPUs](/index.php/publication/2025-06_task-based-tensor-computations-modern-gpus)

Rohan Yadav, [Michael Garland](/index.php/person/michael-garland), Alex Aiken, [Michael Bauer](/index.php/person/mike-bauer)



[PLDI](https://pldi25.sigplan.org/)









[Composing Distributed Computations Through Task and Kernel Fusion](/index.php/publication/2025-03_composing-distributed-computations-through-task-and-kernel-fusion)

Rohan Yadav, Shiv Sundrum, Wonchan Lee, [Michael Garland](/index.php/person/michael-garland), [Michael Bauer](/index.php/person/mike-bauer), Alex Aiken, Fredrik Kjolstad



[ASPLOS](https://www.asplos-conference.org/asplos2025/)









[Automatic Tracing in Task-Based Runtime Systems](/index.php/publication/2025-03_automatic-tracing-task-based-runtime-systems)

Rohan Yadav, [Michael Bauer](/index.php/person/mike-bauer), David Broman, [Michael Garland](/index.php/person/michael-garland), Alex Aiken, Fredrik Kjolstad



[ASPLOS](https://www.asplos-conference.org/asplos2025/)









### 2023 

[Legate Sparse: Distributed Sparse Computing in Python](/publication/2023-11_legate-sparse-distributed-sparse-computing-python)

Rohan Yadav, Wonchan Lee, [Melih Elibol](/person/melih-elibol), [Taylor Patti](/person/taylor-patti), Manolis Papadakis, [Michael Garland](/person/michael-garland), Alex Aiken, Fredrik Kjolstad, [Michael Bauer](/person/mike-bauer)



[Supercomputing](https://sc23.supercomputing.org/presentation/?id=pap119&sess=sess172)









[Visibility Algorithms for Dynamic Dependence Analysis and Distributed Coherence](/index.php/publication/2023-02_visibility-algorithms-dynamic-dependence-analysis-and-distributed-coherence)

[Michael Bauer](/index.php/person/mike-bauer), Elliott Slaughter, Sean Treichler, Wonchan Lee, [Michael Garland](/index.php/person/michael-garland), Alex Aiken



[PPoPP](https://conf.researchr.org/home/ppopp-2023)









### 2021 

[Scaling Implicit Parallelism via Dynamic Control Replication](/index.php/publication/2021-02_scaling-implicit-parallelism-dynamic-control-replication)

[Michael Bauer](/index.php/person/mike-bauer), Wonchan Lee, Elliott Slaughter, Zhihao Jia, Mario Di Renzo, Manolis Papadakis, Galen Shipman, Patrick McCormick, [Michael Garland](/index.php/person/michael-garland), Alex Aiken



[Principles and Practices of Parallel Programming (PPoPP)](https://ppopp21.sigplan.org/)









### 2020 

[A Programmable Approach to Neural Network Compression](/index.php/publication/2020-10_programmable-approach-neural-network-compression)

Vinu Joseph, Ganesh L. Gopalakrishnan, [Saurav Muralidharan](/index.php/person/saurav-muralidharan), [Michael Garland](/index.php/person/michael-garland), Animesh Garg



[IEEE Micro: Special Issue on Machine Learning for Systems](https://ieeexplore.ieee.org/document/9151283)









### 2019 

[Legate NumPy: Accelerated and Distributed Array Computing](/publication/2019-11_legate-numpy-accelerated-and-distributed-array-computing)

[Michael Bauer](/person/mike-bauer), [Michael Garland](/person/michael-garland)



[The International Conference for High Performance Computing, Networking, Storag…](https://sc19.supercomputing.org/presentation/?id=pap271&sess=sess163)









[GPU-Accelerated Atari Emulation for Reinforcement Learning](/publication/2019-07_gpu-accelerated-atari-emulation-reinforcement-learning)

[Steven Dalton](/person/steven-dalton), [Iuri Frosio](/person/iuri-frosio), [Michael Garland](/person/michael-garland)



[arXiv](https://arxiv.org/abs/1907.08467)









[Throughput-oriented GPU memory allocation](/publication/2019-02_throughput-oriented-gpu-memory-allocation)

[Isaac Gelado](/person/isaac-gelado), [Michael Garland](/person/michael-garland)



[Proceedings of the 24th Symposium on Principles and Practice of Parallel Progra…](https://dl.acm.org/citation.cfm?id=3295727)









### 2018 

[Dynamic Tracing: Memoization of Task Graphs for Dynamic Task-based Runtimes](/index.php/publication/2018-11_dynamic-tracing-memoization-task-graphs-dynamic-task-based-runtimes)

Wonchan Lee, Elliott Slaughter, [Michael Bauer](/index.php/person/mike-bauer), Sean Treichler, Todd Warszawski, [Michael Garland](/index.php/person/michael-garland), Alex Aiken



[International Conference for High Performance Computing and Communications (SC'…](https://dl.acm.org/doi/10.5555/3291656.3291702)









### 2017 

[AdaBatch: Adaptive Batch Sizes for Training Deep Neural Networks](/publication/2017-12_adabatch-adaptive-batch-sizes-training-deep-neural-networks)

Aditya Devarakonda, Maxim Naumov, [Michael Garland](/person/michael-garland)



[arXiv:1712.02029 \[cs.LG\]](https://arxiv.org/abs/1712.02029)









[Parallel Depth-First Search for Directed Acyclic Graphs](/publication/2017-03_parallel-depth-first-search-directed-acyclic-graphs)

Maxim Naumov, Alysson Vrielink, [Michael Garland](/person/michael-garland)



Technical Report NVR-2017-001









### 2016 

[Single-pass Parallel Prefix Scan with Decoupled Look-back](/publication/2016-03_single-pass-parallel-prefix-scan-decoupled-look-back)

[Duane Merrill](/person/duane-merrill%2520iii), [Michael Garland](/person/michael-garland)













### 2014 

[A Decomposition for In-place Array Transposition](/index.php/publication/2014-02_decomposition-place-array-transposition)

Bryan Catanzaro, [Alex Keller](/index.php/person/alex-keller), [Michael Garland](/index.php/person/michael-garland)



[PPoPP 2014](http://dx.doi.org/10.1145/2555243.2555253)









### 2012 

[Efficient Parallel Merge Sort for Fixed and Variable Length Keys](/publication/2012-05_efficient-parallel-merge-sort-fixed-and-variable-length-keys)

Andrew Davidson, David Tarjan, [Michael Garland](/person/michael-garland), John Owens



[Proc. Innovative Parallel Computing](http://innovativeparallel.org/)









[Policy-based Tuning for Performance Portability and Library Co-optimization](/publication/2012-05_policy-based-tuning-performance-portability-and-library-co-optimization)

[Duane Merrill](/person/duane-merrill%2520iii), [Michael Garland](/person/michael-garland), Andrew Grimshaw



[Proc. Innovative Parallel Computing](http://innovativeparallel.org/)









[Scalable GPU Graph Traversal](/publication/2012-02_scalable-gpu-graph-traversal)

[Duane Merrill](/person/duane-merrill%2520iii), [Michael Garland](/person/michael-garland), Andrew Grimshaw



[17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (…](http://dynopt.org/ppopp-2012/)









### 2011 

[GPUs and the Future of Parallel Computing](/index.php/publication/2011-09_gpus-and-future-parallel-computing)

[Steve Keckler](/index.php/person/stephen-keckler), [William Dally](/index.php/person/william-dally), [Brucek Khailany](/index.php/person/brucek-khailany), [Michael Garland](/index.php/person/michael-garland), David Glasco



[IEEE Micro](http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6045685&tag=1)









[High Performance and Scalable GPU Graph Traversal](/publication/2011-08_high-performance-and-scalable-gpu-graph-traversal)

Duane Merrill, [Michael Garland](/person/michael-garland), Andrew Grimshaw



[Technical Report CS-2011-05, Department of Computer Science, University of Virg…](http://www.cs.virginia.edu)









[Copperhead: Compiling an Embedded Data Parallel Language](/publication/2011-02_copperhead-compiling-embedded-data-parallel-language)

Bryan Catanzaro, [Michael Garland](/person/michael-garland), Kurt Keutzer



[16th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (…](http://ppopp11.ac.uma.es)









### 2010 

[Sparse Matrix-Vector Multiplication on Multicore and Accelerators](/publication/2010-12_sparse-matrix-vector-multiplication-multicore-and-accelerators)

Sam Williams, Nathan Bell, Jee Whan Choi, [Michael Garland](/person/michael-garland), Leonid Oliker, Richard Vuduc



[Scientific Computing on Multicore and Accelerators](http://www.crcpress.com/product/isbn/9781439825365)









### 2009 

[Implementing Sparse Matrix-Vector Multiplication on Throughput-Oriented Processors](/publication/2009-11_implementing-sparse-matrix-vector-multiplication-throughput-oriented-processors)

Nathan Bell, [Michael Garland](/person/michael-garland)



[Proc. Supercomputing '09](http://sc09.supercomputing.org/)









[Solving Computational Problems with GPU Computing](/publication/2009-09_solving-computational-problems-gpu-computing)

Jonathan Cohen, [Michael Garland](/person/michael-garland)



[Computing in Science and Engineering](http://www.computer.org/portal/web/cise/home)









[Designing Efficient Sorting Algorithms for Manycore GPUs](/publication/2009-05_designing-efficient-sorting-algorithms-manycore-gpus)

Nadathur Satish, Mark Harris, [Michael Garland](/person/michael-garland)



[Proc. IEEE International Symposium on Parallel & Distributed Processing](http://www.ipdps.org/ipdps2009/2009_advance_program.html)









[Fast BVH Construction on GPUs](/publication/2009-03_fast-bvh-construction-gpus)

Christian Lauterbach, [Michael Garland](/person/michael-garland), Shubhabrata Sengupta, [David Luebke](/person/david-luebke), Dinesh Manocha



[Proc. Eurographics 2009](http://www.eurographics2009.de/)









### 2008 

[Efficient Parallel Scan Algorithms for GPUs](/publication/2008-12_efficient-parallel-scan-algorithms-gpus)

Shubhabrata Sengupta, Mark Harris, [Michael Garland](/person/michael-garland)



[NVIDIA Technical Report NVR-2008-003](http://research.nvidia.com/publication/efficient-parallel-scan-algorithms-gpus)









[Efficient Sparse Matrix-Vector Multiplication on CUDA](/publication/2008-12_efficient-sparse-matrix-vector-multiplication-cuda)

Nathan Bell, [Michael Garland](/person/michael-garland)



[NVIDIA Technical Report NVR-2008-004](http://research.nvidia.com/publication/efficient-sparse-matrix-vector-multiplication-cuda)









[On the Visualization of Social and Other Scale-Free Networks](/publication/2008-11_visualization-social-and-other-scale-free-networks)

Yuntao Jia, Jared Hoberock, [Michael Garland](/person/michael-garland), John C. Hart



[Proc. Infovis 2008](http://vis.computer.org/VisWeek2008/infovis/)









[Rapid Multipole Graph Drawing on the GPU](/publication/2008-09_rapid-multipole-graph-drawing-gpu)

Apeksha Godiyal, Jared Hoberock, [Michael Garland](/person/michael-garland), John C. Hart



[Proc. Graph Drawing 2008](http://www.ics.forth.gr/gd2008/)









[Parallel Computing Experiences with CUDA](/publication/2008-08_parallel-computing-experiences-cuda)

[Michael Garland](/person/michael-garland), Scott Le Grand, John Nickolls, Joshua Anderson, Jim Hardwick, Scott Morton, Everett Phillips, Yao Zhang, Vasily Volkov



[IEEE Micro](http://www.computer.org/micro)









[Free-form Motion Processing](/publication/2008-04_free-form-motion-processing)

Scott Kircher, [Michael Garland](/person/michael-garland)



[ACM Trans. on Graphics](http://tog.acm.org/)









[Scalable Parallel Programming with CUDA](/publication/2008-03_scalable-parallel-programming-cuda)

John Nickolls, Ian Buck, [Michael Garland](/person/michael-garland), Kevin Skadron



[Queue](http://queue.acm.org/)









### 2007 

[Iterative Methods for Improving Mesh Parameterizations](/publication/2007-06_iterative-methods-improving-mesh-parameterizations)

Shen Dong, [Michael Garland](/person/michael-garland)



[IEEE Shape Modeling International 2007](http://smi07.liris.cnrs.fr/)