NVIDIA Research
Hymba: A Hybrid-head Architecture for Small Language Models
Xin Dong, Yonggan Fu, Shizhe Diao, Wonmin Byeon, Zijia Chen, Ameya Sunil Mahabaleshwarkar, Shih-Yang Liu, Matthijs Van Keirsbilck, Min-Hung Chen, Yoshi Suhara, Yingyan Celine Lin, Jan Kautz, Pavlo Molchanov
April 2025
Type: Conference paper
Publication: International Conference on Learning Representations (ICLR)
Related
LongMamba: Enhancing Mamba's Long-Context Capabilities via Training-Free Receptive Field Enlargement