 # Beyond the Buzz: A Pragmatic Take on Inference Disaggregation


 As inference scales to multi-node deployments, disaggregation (splitting inference into distinct phases) offers a promising path to improving the throughput-interactivity Pareto frontier. Despite growing enthusiasm and a surge of open-source efforts, practical deployment of disaggregated serving remains limited because of the complexity of the optimization search space and of system-level coordination. In this paper, we present the first systematic study of disaggregated inference at scale, evaluating hundreds of thousands of design points across diverse workloads and hardware configurations. We find that disaggregation is most effective for prefill-heavy traffic patterns and larger models. Our results highlight the critical role of dynamic rate matching and elastic scaling in achieving Pareto-optimal performance. These findings offer actionable guidance for deploying disaggregated serving efficiently and for navigating the trade-off between system throughput and interactivity.



 ## Authors



Tiyasa Mitra

Ritika Borkar

Nidhi Bhatia

Ramon Matas

 Shivam Raj

Dheevatsa Mudigere

Ritchie Zhao

Maximilian Golub

Arpan Dutta

Sailaja Madduri

Dharmesh Jani

Brian Pharris

Bita Darvish Rouhani 

 

 

 ## Publication Date



Friday, June 6, 2025

 

 ## Published in



[arXiv](https://arxiv.org/abs/2506.05508)

 

 ## Research Area



[Artificial Intelligence and Machine Learning](/research-area/machine-learning-artificial-intelligence)

[Generative AI](/research-area/generative-ai)

[High Performance Computing](/research-area/high-performance-computing)

 

 

 ## Uploaded Files



[Beyond the Buzz A Pragmatic Take on Inference Disaggregation.pdf](https://d1qx31qr3h6wln.cloudfront.net/publications/Beyond%20the%20Buzz%20A%20Pragmatic%20Take%20on%20Inference%20Disaggregation_0.pdf) (1.28 MB)