NVIDIA Research
Yonggan Fu
Latest
CLIMB: Clustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training
Fast-SLM: Towards Latency-Optimal Hybrid Small Language Models
LaCache: Ladder-Shaped KV Caching for Efficient Long-Context Modeling of Large Language Models
Hymba: A Hybrid-head Architecture for Small Language Models
LongMamba: Enhancing Mamba's Long-Context Capabilities via Training-Free Receptive Field Enlargement