LLM Pruning and Distillation in Practice: The Minitron Approach

Authors

Sharath Turuvekere Sreenivas

Saurav Muralidharan

Raviraj Joshi

Marcin Chochowski

Mostofa Patwary (NVIDIA)

Mohammad Shoeybi (NVIDIA)

Bryan Catanzaro (NVIDIA)

Pavlo Molchanov

Publication Date

Wednesday, August 21, 2024

Research Area

Artificial Intelligence and Machine Learning

Uploaded Files

LLM Pruning and Distillation in Practice: The Minitron Approach (2.26 MB)