 
 # Composing Distributed Computations Through Task and Kernel Fusion


We introduce Diffuse, a system that dynamically performs task and kernel fusion in distributed, task-based runtime systems. The key component of Diffuse is an intermediate representation of distributed computation that allows the analyses needed to fuse distributed tasks to be performed scalably. We pair task fusion with a JIT compiler that fuses the kernels within fused tasks. We show empirically that Diffuse’s intermediate representation is general enough to be a target for two real-world, task-based libraries (cuPyNumeric and Legate Sparse), letting Diffuse find optimization opportunities across function and library boundaries. Diffuse accelerates unmodified applications developed by composing task-based libraries by 1.86x on average (geo-mean), with speedups ranging from 0.93x to 10.7x on up to 128 GPUs. Diffuse also finds optimization opportunities missed by the original application developers, enabling high-level Python programs to match or exceed the performance of an explicitly parallel MPI library.
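As a rough illustration of what kernel fusion means in general, the sketch below composes a chain of element-wise operations into a single pass over the data, so no intermediate arrays are materialized between steps. It is a single-process toy in plain Python and does not reflect Diffuse's intermediate representation, its distributed analyses, or its JIT compiler; the names `run_unfused`, `fuse`, and `run_fused` are hypothetical and chosen only for this example.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical type for this sketch: an element-wise operation on one value.
ElementwiseOp = Callable[[float], float]


def run_unfused(ops: Sequence[ElementwiseOp],
                data: List[float]) -> Tuple[List[float], int]:
    """Run each op as a separate pass, building one intermediate list per op."""
    passes = 0
    current = data
    for op in ops:
        current = [op(x) for x in current]  # intermediate result materialized
        passes += 1
    return current, passes


def fuse(ops: Sequence[ElementwiseOp]) -> ElementwiseOp:
    """Compose a chain of element-wise ops into a single per-element kernel."""
    def fused(x: float) -> float:
        for op in ops:
            x = op(x)
        return x
    return fused


def run_fused(ops: Sequence[ElementwiseOp],
              data: List[float]) -> Tuple[List[float], int]:
    """Apply the fused kernel in one pass over the data."""
    kernel = fuse(ops)
    return [kernel(x) for x in data], 1


# Three element-wise steps: scale, shift, square.
ops = [lambda x: x * 2.0, lambda x: x + 1.0, lambda x: x * x]
data = [0.0, 1.0, 2.0]

unfused_result, unfused_passes = run_unfused(ops, data)
fused_result, fused_passes = run_fused(ops, data)
```

Both versions compute the same values, but the unfused version makes one pass over the data per operation while the fused version makes a single pass. On real hardware the benefit of fusion typically comes from reduced memory traffic and fewer kernel launches, which this toy does not measure.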



 ## Authors



Rohan Yadav (Stanford University)

Shiv Sundrum (Stanford University)

Wonchan Lee (NVIDIA)

[Michael Garland](/person/michael-garland)

[Michael Bauer](/person/mike-bauer)

Alex Aiken (Stanford University)

Fredrik Kjolstad (Stanford University)

 

 

 ## Publication Date



Sunday, March 30, 2025

 

 ## Published in



[ASPLOS](https://www.asplos-conference.org/asplos2025/)

 

 ## Research Area



[High Performance Computing](/research-area/high-performance-computing)

[Programming Languages, Systems and Tools](/research-area/programming-languages-systems)

 

 

 ## Uploaded Files



[Legate\_Kernel\_Fusion\_\_\_ASPLOS\_2025.pdf](https://d1qx31qr3h6wln.cloudfront.net/publications/Legate_Kernel_Fusion___ASPLOS_2025.pdf) (828.02 KB)