
Sreyan Ghosh’s research focuses on enhancing how AI models process, understand, and reason about audio, including speech, sounds, and music. Audio understanding faces unique challenges, such as limited data and the complexity of real-world audio signals. To address these issues, his work introduces scalable strategies for generating synthetic training data, improving audio representation learning, and designing efficient architectures for integrating audio with Large Language Models. By leveraging these approaches, he aims to push the boundaries of expert-level audio understanding and reasoning, including over long-form audio, thereby enabling AI systems to better comprehend and interact with the world around them.
Sreyan Ghosh is a third-year PhD student in Computer Science at the University of Maryland, College Park, advised by Professors Dinesh Manocha and Ramani Duraiswami. His research focuses on enhancing AI models’ ability to understand and reason about audio, with an emphasis on multimodal and low-resource learning. He earned his Master’s degree from the University of Maryland and his Bachelor’s degree from Christ University in India. He has interned at Adobe, Microsoft, and NVIDIA and has been recognized with the Outstanding Graduate Assistant Award.