We propose a unified and latency-flexible framework for multichannel speech enhancement in CHiME-9 Task 2 (ECHI), built upon a decoupled Shell-Core architecture. A latency-aware DenseNet-style Shell performs local spectral-spatial modeling across …
Recent Mamba-based models have shown promise in speech enhancement by efficiently modeling long-range temporal dependencies. However, models like Speech Enhancement Mamba (SEMamba) remain limited to single-speaker scenarios and struggle in complex …
Speech quality estimation has recently undergone a paradigm shift from human-hearing expert designs to machine-learning models. However, current models rely mainly on supervised learning, for which label collection is time-consuming and expensive. …