Efficient AI
ICLR2026
ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference
Post-training quantization (PTQ) compresses the weights and activations of large language models (LLMs) into low-precision …
Yesheng Liang, Haisheng Chen, Song Han, Zhijian Liu
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM
Advancing machine intelligence requires developing the ability to perceive across multiple modalities, much as humans sense the world. …
Hanrong Ye, Chao-Han Huck Yang, Arushi Goel, Wei Huang, Zhen Wan, Jinchuan Tian, An-Chieh Cheng, Ligeng Zhu, Yuanhang Su, Yuming Lou, Yong-Xiang Lin, Dong Yang, Sreyan Ghosh, Zhijian Liu, Yukang Chen, Ehsan Jahangiri, Ambrish Dantrey, Daguang Xu, Ehsan Hosseini-Asl, Seyed Danial Mohseni Taheri, Vidya Nariyambut Murali, Sifei Liu, Yao (Jason) Lu, Oluwatobi Olabiyi, Yu-Chiang Frank Wang, Rafael Valle, Bryan Catanzaro, Andrew Tao, Song Han, Jan Kautz, Hongxu Yin, Pavlo Molchanov
LongLive: Real-time Interactive Long Video Generation
We present LongLive, a frame-level autoregressive (AR) framework for real-time and interactive long video generation. Long video …
Shuai Yang, Wei Huang, Ruihang Chu, Yicheng Xiao, Yuyang Zhao, Xianbang Wang, Muyang Li, Enze Xie, Ying-Cong Chen, Yao (Jason) Lu, Song Han, Yukang Chen
QeRL: Beyond Efficiency - Quantization-enhanced Reinforcement Learning for LLMs
We propose QeRL, a Quantization-enhanced Reinforcement Learning framework for large language models (LLMs). While RL is essential for …
Wei Huang, Yi Ge, Shuai Yang, Yicheng Xiao, Huizi Mao, Yujun Lin, Hanrong Ye, Sifei Liu, Ka Chun Cheung, Hongxu Yin, Yao (Jason) Lu, Xiaojuan Qi, Song Han, Yukang Chen
StreamingVLM: Real-Time Understanding for Infinite Video Streams
Vision-language models (VLMs) could power real-time assistants and autonomous agents, but they face a critical challenge: understanding …
Ruyi Xu, Guangxuan Xiao, Yukang Chen, Liuning He, Kelly Peng, Yao (Jason) Lu, Song Han
Fast-dLLM v2: Efficient Block-Diffusion LLM
Autoregressive (AR) large language models (LLMs) have achieved remarkable performance across a wide range of natural language tasks, …
Chengyue Wu, Hao Zhang, Shuchen Xue, Shizhe Diao, Yonggan Fu, Zhijian Liu, Pavlo Molchanov, Ping Luo, Song Han, Enze Xie
SANA-Video: Efficient Video Generation with Block Linear Diffusion Transformer
We introduce SANA-Video, a small diffusion model that can efficiently generate videos up to 720×1280 resolution and minute-length …
Junsong Chen, Yuyang Zhao, Jincheng YU, Ruihang Chu, Junyu Chen, Shuai Yang, Xianbang Wang, Yicheng Pan, Daquan Zhou, Huan Ling, Haozhe Liu, Hongwei Yi, Hao Zhang, Muyang Li, Yukang Chen, Han Cai, Sanja Fidler, Ping Luo, Song Han, Enze Xie
3D Aware Region Prompted Vision Language Model
We present Spatial Region 3D (SR-3D) aware vision-language model that connects single-view 2D images and multi-view 3D data through a …
An-Chieh Cheng, Yang Fu, Yukang Chen, Zhijian Liu, Xiaolong Li, Subhashree Radhakrishnan, Song Han, Yao (Jason) Lu, Jan Kautz, Pavlo Molchanov, Hongxu Yin, Xiaolong Wang, Sifei Liu
Locality-aware Parallel Decoding for Efficient Autoregressive Image Generation
We present Locality-aware Parallel Decoding (LPD) to accelerate autoregressive image generation. Traditional autoregressive image …
Zhuoyang Zhang, Luke J. Huang, Chengyue Wu, Shang Yang, Kelly Peng, Yao (Jason) Lu, Song Han
Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding
Diffusion-based large language models (Diffusion LLMs) have shown promise for non-autoregressive text generation. However, the …
Chengyue Wu, Hao Zhang, Shuchen Xue, Zhijian Liu, Shizhe Diao, Ligeng Zhu, Ping Luo, Song Han, Enze Xie