We propose a unified and latency-flexible framework for multichannel speech enhancement in CHiME-9 Task 2 (ECHI), built upon a decoupled Shell-Core architecture. A latency-aware DenseNet-style Shell performs local spectral-spatial modeling across …
Recent Mamba-based models have shown promise in speech enhancement by efficiently modeling long-range temporal dependencies. However, models such as Speech Enhancement Mamba (SEMamba) remain limited to single-speaker scenarios and struggle in complex …
The transformative capabilities of language models (LMs) have intensified the demand for their deployment on everyday devices, necessitating efficient processing for on-device language tasks. To address this, we propose Hymba, a new family of small …