CLIMB: Clustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training
Shizhe Diao, Yu Yang, Yonggan Fu, Xin Dong, Dan Su, Markus Kliegl, Zijia Chen, Peter Belcak, Yoshi Suhara, Hongxu (Danny) Yin, Mostofa Patwary, Yingyan Celine Lin, Jan Kautz, Pavlo Molchanov
December 2025
Type: Conference paper
Publication: Advances in Neural Information Processing Systems (NeurIPS)
Team Leader: Pavlo Molchanov
Related
Hymba: A Hybrid-head Architecture for Small Language Models
Fast-SLM: Towards Latency-Optimal Hybrid Small Language Models
LaCache: Ladder-Shaped KV Caching for Efficient Long-Context Modeling of Large Language Models
LongMamba: Enhancing Mamba's Long-Context Capabilities via Training-Free Receptive Field Enlargement
Efficient Hybrid Language Model Compression through Group-Aware SSM Pruning