Projects

Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation

Published: March 16, 2026

We introduce Nemotron-Cascade 2, an open 30B MoE model with 3B activated parameters that delivers best-in-class reasoning and strong agentic capabilities. It is the second open-weight LLM, after DeepSeek-V3.2-Speciale-671B-A37B, to achieve Gold Medal-level 🏅 performance in the 2025 International Mathematical Olympiad (IMO), the International Olympiad in Informatics (IOI), and the ICPC World Finals.

NVIDIA Nemotron 3 Super

Published: March 10, 2026

We are releasing NVIDIA Nemotron 3 Super, a hybrid Mamba-Transformer Mixture-of-Experts model with 120B total and 12B active parameters. Nemotron 3 Super is the first model in the Nemotron 3 series that uses Latent MoE, includes MTP layers, and was pre-trained in NVFP4.

Enable NVFP4 Inference for Nemotron with Quantization-Aware Distillation

Published: January 28, 2026

QAD Tech Report

Nemotron-3-Nano-30B-A3B-NVFP4 Model Card

Think Smart About Sparse Compute: LatentMoE for Higher Accuracy per FLOP and per Parameter

Published: January 27, 2026

Read the Paper

Nemotron-Cascade: Scaling Cascaded Reinforcement Learning for General-Purpose Reasoning Models

Published: December 15, 2025

We scale up cascaded reinforcement learning (Cascade RL) to develop general-purpose reasoning models, Nemotron-Cascade, that can operate in both instruct and deep-thinking modes. Our 14B model can outperform its SFT teacher and achieves silver-medal performance at IOI 2025.

NVIDIA Nemotron 3 Family of Models

Published: December 15, 2025