Puzzle: Distillation-Based NAS for Inference-Optimized LLMs

Publication
International Conference on Machine Learning (ICML)