  Yaosheng Fu  

 



  ![](/sites/default/files/person/visa%20%282%29_1.jpg)

  

 Yaosheng Fu joined NVIDIA in September 2017 as a member of the architecture research team. His current interests include computer architecture, memory systems, and parallel computing. Yaosheng received his Ph.D. in Electrical Engineering from Princeton University, NJ, in 2017 and his B.S. in Electronic Engineering from Tsinghua University, China, in 2010.



   Research Area(s)

[Computer Architecture](/index.php/research-area/computer-architecture)

[High Performance Computing](/index.php/research-area/high-performance-computing)

[Networking](/index.php/research-area/networking)

[Programming Languages, Systems and Tools](/index.php/research-area/programming-languages-systems)

 

 

  

 

 

 



 ### Publications

 

### 2021 

[GPU Domain Specialization via Composable On-Package Architecture](/publication/2021-12_gpu-domain-specialization-composable-package-architecture)

[Yaosheng Fu](/person/yaosheng-fu), Evgeny Bolotin, [Niladrish Chatterjee](/person/niladrish-chatterjee), [David Nellans](/person/david-nellans), [Steve Keckler](/person/stephen-keckler)



[ACM Transactions on Architecture and Code Optimization (TACO)](https://dl.acm.org/doi/full/10.1145/3484505)









[GPU Domain Specialization via Composable On-Package Architecture](/index.php/publication/2021-04_gpu-domain-specialization-composable-package-architecture)

[Yaosheng Fu](/index.php/person/yaosheng-fu), Evgeny Bolotin, [Niladrish Chatterjee](/index.php/person/niladrish-chatterjee), [David Nellans](/index.php/person/david-nellans), [Steve Keckler](/index.php/person/stephen-keckler)



[arXiv](https://arxiv.org/abs/2104.02188)









[Need for Speed: Experiences Building a Trustworthy System-Level GPU Simulator](/publication/2021-02_need-speed-experiences-building-trustworthy-system-level-gpu-simulator)

Oreste Villa, [Daniel Lustig](/person/daniel-lustig), [Zi Yan](/person/zi-yan), Evgeny Bolotin, [Yaosheng Fu](/person/yaosheng-fu), [Niladrish Chatterjee](/person/niladrish-chatterjee), [Ted Jiang](/person/ted-jiang), [David Nellans](/person/david-nellans)



[International Symposium on High Performance Computer Architecture (HPCA)](https://doi.org/10.1109/HPCA51647.2021.00077)









### 2020 

[The Architectural Implications of Distributed Reinforcement Learning on CPU-GPU Systems](/index.php/publication/2020-12_architectural-implications-distributed-reinforcement-learning-cpu-gpu-systems)

Ahmet Inci, Evgeny Bolotin, [Yaosheng Fu](/index.php/person/yaosheng-fu), [Gal Dalal](/index.php/person/gal-dalal), [Shie Mannor](/index.php/person/shie-mannor), [David Nellans](/index.php/person/david-nellans), Diana Marculescu



[Workshop on Energy Efficient Machine Learning and Cognitive Computing (EMC2)](https://www.emc2-ai.org/virtual-20)









[BYOC: A "Bring Your Own Core" Framework for Heterogeneous-ISA Research](/publication/2020-03_byoc-bring-your-own-core-framework-heterogeneous-isa-research)

Jonathan Balkind, Katie Lim, Michael Schaffner, Fei Gao, Grigory Chirkov, Ang Li, Alexey Lavrov, Tri M. Nguyen, [Yaosheng Fu](/person/yaosheng-fu), Florian Zaruba, Kunal Gulati, Luca Benini, David Wentzlaff



[International Conference on Architectural Support for Programming Languages and…](https://dl.acm.org/doi/10.1145/3373376.3378479)









### 2019 

[Optimizing Multi-GPU Parallelization Strategies for Deep Learning Training](/publication/2019-08_optimizing-multi-gpu-parallelization-strategies-deep-learning-training)

Saptadeep Pal, Eiman Ebrahimi, Arslan Zulfiqar, [Yaosheng Fu](/person/yaosheng-fu), Victor Zhang, Szymon Migacz, [David Nellans](/person/david-nellans), Puneet Gupta



[IEEE MICRO: Special Edition on Machine Learning Acceleration](https://ieeexplore.ieee.org/document/8805338)









[Optimizing Multi-GPU Parallelization Strategies for Deep Learning Training](/publication/2019-07_optimizing-multi-gpu-parallelization-strategies-deep-learning-training)

Saptadeep Pal, Eiman Ebrahimi, Arslan Zulfiqar, [Yaosheng Fu](/person/yaosheng-fu), Victor Zhang, Szymon Migacz, [David Nellans](/person/david-nellans), Puneet Gupta 



[arXiv](https://arxiv.org/abs/1907.13257)