Efficient AI
Guangxuan Xiao
Latest
XAttention: Block Sparse Attention with Antidiagonal Scoring
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads