Mamba

A Unified Latency-Flexible Framework with a Causal Mamba Core for Multichannel Speech Enhancement

We propose a unified and latency-flexible framework for multichannel speech enhancement in CHiME-9 Task 2 (ECHI), built upon a decoupled Shell-Core architecture. A latency-aware DenseNet-style Shell performs local spectral-spatial modeling across …

Leveraging Mamba with Full-Face Vision for Audio-Visual Speech Enhancement

Recent Mamba-based models have shown promise in speech enhancement by efficiently modeling long-range temporal dependencies. However, models like Speech Enhancement Mamba (SEMamba) remain limited to single-speaker scenarios and struggle in complex …

Hymba: A Hybrid-head Architecture for Small Language Models

The transformative capabilities of language models (LMs) have intensified the demand for their deployment on everyday devices, necessitating efficient processing for on-device language tasks. To address this, we propose Hymba, a new family of small …