NVIDIA Research Taiwan
NVIDIA Research Taiwan
Home
News
Members
Research
Publications
Contact
Light
Dark
Automatic
Efficient DL
EoRA: Fine-tuning-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation
While post-training compression techniques effectively reduce the memory footprint, latency, and power consumption of Large Language Models (LLMs), they often result in noticeable accuracy degradation and remain limited by hardware and kernel …
Cite
×