Efficient AI
Shang Yang
Latest
Taming the Long-Tail: Efficient Reasoning RL Training with Adaptive Drafter
Jet-Nemotron: Efficient Language Model with Post Neural Architecture Search
Locality-aware Parallel Decoding for Efficient Autoregressive Image Generation
NVILA: Efficient Frontier Visual Language Models
LongVILA: Scaling Long-Context Visual Language Models for Long Videos
Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer
FlatFormer: Flattened Window Attention for Efficient Point Cloud Transformer