Junxian Guo

Latest

Taming the Long-Tail: Efficient Reasoning RL Training with Adaptive Drafter
SparseLoRA: Accelerating LLM Fine-Tuning with Contextual Sparsity
XAttention: Block Sparse Attention with Antidiagonal Scoring
SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads

© 2026 NVIDIA. This work is licensed under CC BY-NC-ND 4.0.

