Projects

NVIDIA Nemotron 3 Ultra

Published:

We are releasing NVIDIA Nemotron 3 Ultra, our largest and most capable model yet. Nemotron 3 Ultra is a Mixture-of-Experts hybrid Mamba-Transformer model with 55B active and 550B total parameters that leverages Latent MoE, includes MTP Layers, and was pre-trained in NVFP4.

Accelerating RL Post-Training with Speculative Decoding in NeMo RL

Published:

We integrate speculative decoding into NeMo RL with a vLLM backend to accelerate rollout generation while preserving verifier-side training semantics. On 8B reasoning workloads, this yields up to 1.8x faster rollout generation and up to 1.4x faster RL steps, with projected gains of roughly 2.5x at 235B scale.
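For readers unfamiliar with the technique, here is a minimal sketch of greedy speculative decoding. It is not the NeMo RL or vLLM implementation: the names `toy_target`, `toy_draft`, `greedy_decode`, and `speculative_decode` are hypothetical, the "models" are toy arithmetic functions, and decoding is greedy, whereas RL rollouts typically sample and rely on a rejection-sampling variant. The sketch only illustrates why the output can match what the target model would have produced alone while needing fewer target passes.

```python
from typing import Callable, List, Tuple

# A "model" here is just a function mapping a token sequence to its greedy next token.
NextToken = Callable[[List[int]], int]


def toy_target(seq: List[int]) -> int:
    """Hypothetical stand-in for the large (verifier) model."""
    return (seq[-1] * 3 + 1) % 17


def toy_draft(seq: List[int]) -> int:
    """Hypothetical cheap draft model: agrees with the target except after multiples of 5."""
    t = toy_target(seq)
    return t if seq[-1] % 5 else (t + 1) % 17


def greedy_decode(target: NextToken, prompt: List[int], max_new: int) -> List[int]:
    """Baseline: one target call per generated token."""
    seq = list(prompt)
    for _ in range(max_new):
        seq.append(target(seq))
    return seq[len(prompt):]


def speculative_decode(
    target: NextToken, draft: NextToken, prompt: List[int], max_new: int, k: int = 4
) -> Tuple[List[int], int]:
    """Draft proposes k tokens; the target verifies them and keeps the agreeing prefix."""
    seq = list(prompt)
    verify_passes = 0  # each pass stands for one batched target forward in a real system
    while len(seq) - len(prompt) < max_new:
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(seq + proposal))

        verify_passes += 1
        accepted: List[int] = []
        for tok in proposal:
            expected = target(seq + accepted)
            accepted.append(expected)  # always keep the target's own token
            if expected != tok:
                break  # first mismatch: discard the rest of the proposal
        else:
            # Every drafted token matched, so the same pass yields one extra token.
            accepted.append(target(seq + accepted))
        seq.extend(accepted)
    return seq[len(prompt):len(prompt) + max_new], verify_passes


reference = greedy_decode(toy_target, [2], 20)
output, passes = speculative_decode(toy_target, toy_draft, [2], 20)
print(output == reference, passes)
```

Because every kept token is the target's own choice, the result is identical to plain greedy decoding; any speedup comes from verifying several drafted tokens per target pass, so it depends on how often the draft agrees with the target.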

Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation

Published:

We introduce Nemotron-Cascade 2, an open 30B MoE model with 3B activated parameters that delivers best-in-class reasoning and strong agentic capabilities. It is the second open-weight LLM, after DeepSeek-V3.2-Speciale-671B-A37B, to achieve Gold Medal-level 🏅 performance at the 2025 International Mathematical Olympiad (IMO), the International Olympiad in Informatics (IOI), and the ICPC World Finals.

NVIDIA Nemotron 3 Super

Published:

We are releasing NVIDIA Nemotron 3 Super, a Mixture-of-Experts hybrid Mamba-Transformer model with 12B active and 120B total parameters. Nemotron 3 Super is the first model in the Nemotron 3 series that leverages Latent MoE, includes MTP Layers, and was pre-trained in NVFP4.