 # Score Distillation Sampling for Audio: Source Separation, Synthesis, and Beyond

![Audio-SDS overview](/sites/default/files/styles/wide/public/publications/Audio-SDS%20TLDR%20Overview.jpg?itok=zRHqAMV5)

We introduce Audio-SDS, a generalization of Score Distillation Sampling (SDS) to text-conditioned audio diffusion models. While SDS was initially designed for text-to-3D generation using image diffusion, its core idea of distilling a powerful generative prior into a separate parametric representation extends to the audio domain. Leveraging a single pretrained model, Audio-SDS enables a broad range of tasks without requiring specialized datasets. In particular, we demonstrate how Audio-SDS can guide physically informed impact sound simulations, calibrate FM-synthesis parameters, and perform prompt-specified source separation. Our findings illustrate the versatility of distillation-based methods across modalities and establish a robust foundation for future work using generative priors in audio tasks.

Accepted to ICML 2025.
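To make the distillation idea in the abstract concrete, here is a minimal toy sketch of a generic SDS-style update. It is not the Audio-SDS implementation. The assumptions are loud ones: the pretrained text-conditioned audio diffusion model is replaced by a hypothetical closed-form noise predictor for a 1-D Gaussian prior, the parametric representation is a single scalar `theta` with an identity renderer, and the weighting is fixed at `w(t) = 1`. All function names are illustrative.

```python
import math
import random


def toy_eps_prediction(z_t, alpha_bar, prior_mean, prior_std):
    """Exact noise prediction for a data prior N(prior_mean, prior_std^2).

    Hypothetical stand-in for a pretrained diffusion model's noise predictor.
    """
    a = math.sqrt(alpha_bar)
    b = math.sqrt(1.0 - alpha_bar)
    marginal_var = (a * prior_std) ** 2 + b ** 2  # variance of the noised sample z_t
    return b * (z_t - a * prior_mean) / marginal_var


def sds_step(theta, lr, rng, prior_mean=2.0, prior_std=0.5):
    """One SDS-style update with an identity renderer x = theta (dx/dtheta = 1)."""
    x = theta
    alpha_bar = rng.uniform(0.05, 0.95)  # sample a random noise level
    eps = rng.gauss(0.0, 1.0)            # sample the injected noise
    z_t = math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * eps
    eps_hat = toy_eps_prediction(z_t, alpha_bar, prior_mean, prior_std)
    # SDS gradient: (predicted noise - injected noise) * dx/dtheta, with w(t) = 1.
    grad = (eps_hat - eps) * 1.0
    return theta - lr * grad


def run_toy_sds(steps=5000, lr=0.01, seed=0):
    """Optimize theta from a poor initial value using only the toy prior."""
    rng = random.Random(seed)
    theta = -1.0
    for _ in range(steps):
        theta = sds_step(theta, lr, rng)
    return theta
```

In this toy setting the expected update pulls `theta` toward the prior mean, which mirrors how a generative prior can steer the parameters of a separate representation (for example, a synthesizer or a separation mask) without a task-specific dataset.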


 ## Authors


Jessie Richter-Powell (NVIDIA, MIT)

Antonio Torralba (NVIDIA)

Jonathan Lorraine (NVIDIA)

 
 ## Publication Date


Wednesday, May 7, 2025

 
 ## Published in


[arXiv](https://arxiv.org/pdf/2505.04621)

 
 ## Research Area


[Artificial Intelligence and Machine Learning](/research-area/machine-learning-artificial-intelligence)

[Generative AI](/research-area/generative-ai)

 
 ## External Links


[Project Page](https://research.nvidia.com/labs/toronto-ai/Audio-SDS/)

[Overview Video](https://youtu.be/NCk-d1FTcsc?si=LNpHzpSM5IYNgPpH)