Efficient AI
ICLR2025
SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
Diffusion models have been proven highly effective at generating high-quality images. However, as these models grow larger, they …
Muyang Li, Yujun Lin, Zhekai Zhang, Tianle Cai, Xiuyu Li, Junxian Guo, Enze Xie, Chenlin Meng, Jun-Yan Zhu, Song Han
PDF
Cite
Code
Project
Video
Demo
Blog
COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training
FP8 training has emerged as a promising method for improving training efficiency. Existing frameworks accelerate training by applying …
Haocheng Xi, Han Cai, Ligeng Zhu, Yao (Jason) Lu, Kurt Keutzer, Jianfei Chen, Song Han
PDF
Cite
Code
Project
Demo
Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models
We present Deep Compression Autoencoder (DC-AE), a new family of autoencoder models for accelerating high-resolution diffusion models. …
Junyu Chen, Han Cai, Junsong Chen, Enze Xie, Shang Yang, Haotian Tang, Muyang Li, Yao (Jason) Lu, Song Han
PDF
Cite
Code
Project
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads
Deploying long-context large language models (LLMs) is essential but poses significant computational and memory challenges. Caching all …
Guangxuan Xiao, Jiaming Tang, Jingwei Zuo, Junxian Guo, Shang Yang, Haotian Tang, Yao Fu, Song Han
PDF
Cite
Code
Project
Sana: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
We introduce Sana, a text-to-image framework that can efficiently generate images up to 4096x4096 resolution. Sana can synthesize …
Enze Xie, Junsong Chen, Junyu Chen, Han Cai, Haotian Tang, Yujun Lin, Zhekai Zhang, Muyang Li, Ligeng Zhu, Yao (Jason) Lu, Song Han
PDF
Cite
Code
Project
MIT Project
Demo