NVIDIA Nemotron 3 Ultra
Published:

We present our most capable model yet: Nemotron 3 Ultra, with 550 billion total and 55 billion active parameters. Nemotron 3 Ultra is the final and best model in the Nemotron 3 family.
Key Features
- Employs a Mixture-of-Experts hybrid Mamba-Attention architecture.
- Leverages LatentMoE for improved accuracy.
- Includes MTP layers for faster inference through native speculative decoding.
- Supports inference-time reasoning budget control.
- Pretrained in NVFP4.
- Post-trained with an enhanced pipeline involving Supervised Fine-Tuning (SFT), Reinforcement Learning (RL), and Multi-teacher On-Policy Distillation (MOPD) for improved model accuracy.
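To illustrate the general idea behind speculative decoding mentioned in the feature list, here is a minimal toy sketch in Python. It is not Nemotron 3 Ultra's MTP implementation: `draft`, `target`, `speculative_step`, and `generate` are hypothetical names, the "models" are simple functions over integer tokens, and decoding is assumed to be greedy. A cheap drafter proposes several tokens, and the main model keeps only the prefix it agrees with.

```python
from typing import Callable, List

Token = int
NextToken = Callable[[List[Token]], Token]  # greedy next-token function


def speculative_step(context: List[Token], draft: NextToken,
                     target: NextToken, k: int) -> List[Token]:
    """Return the tokens produced by one draft-then-verify round."""
    # 1. The cheap drafter proposes k tokens autoregressively.
    proposed: List[Token] = []
    for _ in range(k):
        proposed.append(draft(context + proposed))

    # 2. The target checks each proposal. A real system scores all k
    #    positions in one batched forward pass; here we call it in a loop.
    accepted: List[Token] = []
    for tok in proposed:
        expected = target(context + accepted)
        if tok != expected:
            accepted.append(expected)  # replace the first mismatch and stop
            return accepted
        accepted.append(tok)

    # 3. Every draft token matched, so the target adds one extra token.
    accepted.append(target(context + accepted))
    return accepted


def generate(prompt: List[Token], draft: NextToken, target: NextToken,
             k: int, max_new_tokens: int) -> List[Token]:
    out = list(prompt)
    while len(out) - len(prompt) < max_new_tokens:
        out.extend(speculative_step(out, draft, target, k))
    return out[: len(prompt) + max_new_tokens]


def toy_target(ctx: List[Token]) -> Token:
    return (ctx[-1] + 1) % 10  # toy "large model": count upward modulo 10


def toy_draft(ctx: List[Token]) -> Token:
    # Toy "drafter": agrees with the target except after tokens 3 and 7.
    return 0 if ctx[-1] % 4 == 3 else (ctx[-1] + 1) % 10


def greedy(prompt: List[Token], target: NextToken, n: int) -> List[Token]:
    out = list(prompt)
    for _ in range(n):
        out.append(target(out))
    return out


spec_out = generate([0], toy_draft, toy_target, k=3, max_new_tokens=12)
base_out = greedy([0], toy_target, 12)
```

Under greedy decoding the speculative output matches what the target model would have produced alone; the potential gain is that several tokens can be accepted per verification round.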
Key Highlights

- Nemotron 3 Ultra achieves 5.9x, 4.8x, and 1.6x higher inference throughput than GLM-5.1-754B-A40B, Kimi-K2.6-1T-A32B, and Qwen-3.5-397B-A17B, respectively, in the 8k-token input / 64k-token output setting.
- Nemotron 3 Ultra achieves accuracy on par with other state-of-the-art open LLMs across a diverse set of benchmarks.
- Supports a context length of up to 1M tokens while outperforming state-of-the-art open LLMs on RULER at 1M context length.
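As a rough aid for reading the model names in the throughput comparison, the sketch below computes the fraction of parameters active per token for each model. The parameter counts are read from the names quoted above, assuming the usual "total-Aactive" convention (so the Qwen model is taken to have 17B active parameters, and 1T is taken as 1000B); `MODELS` and `active_fraction` are illustrative names, not part of any release.

```python
# (total, active) parameter counts in billions, as read from the model names.
MODELS = {
    "Nemotron 3 Ultra": (550, 55),
    "GLM-5.1": (754, 40),
    "Kimi-K2.6": (1000, 32),  # assumption: 1T treated as 1000B
    "Qwen-3.5": (397, 17),    # assumption: trailing 17B is the active count
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Fraction of the model's parameters used for each token."""
    return active_b / total_b


for name, (total_b, active_b) in MODELS.items():
    frac = active_fraction(total_b, active_b)
    print(f"{name}: {active_b}B of {total_b}B active ({frac:.1%})")
```

Active-parameter count is only one input to throughput, so these fractions should not be read as an explanation of the reported speedups.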
Open Source
We are releasing the pre-trained, post-trained, and quantized checkpoints along with the datasets used for training.
Checkpoints:
- Nemotron 3 Ultra 550B-A55B NVFP4: post-trained and NVFP4 quantized model
- Nemotron 3 Ultra 550B-A55B BF16: post-trained model
- Nemotron 3 Ultra 550B-A55B Base BF16: base model
- Nemotron 3 Ultra 550B-A55B GenRM: generative reward model (GenRM) used for RLHF
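For a back-of-the-envelope sense of how the BF16 and NVFP4 checkpoints differ in size, the sketch below multiplies the 550B parameter count by a nominal bit width. This is an estimate under loud assumptions, not a published figure: it treats every parameter as stored at the nominal width (16 bits for BF16, 4 bits for NVFP4) and ignores quantization scale factors, any layers kept in higher precision, and file-format overhead. `weight_gigabytes` is a hypothetical helper.

```python
TOTAL_PARAMS = 550_000_000_000  # 550B total parameters, from the model name


def weight_gigabytes(num_params: int, bits_per_param: int) -> float:
    """Nominal weight storage in GB (10^9 bytes), ignoring all overheads."""
    total_bytes = num_params * bits_per_param // 8
    return total_bytes / 1e9


bf16_gb = weight_gigabytes(TOTAL_PARAMS, 16)  # nominal BF16 size: 1100.0 GB
nvfp4_gb = weight_gigabytes(TOTAL_PARAMS, 4)  # nominal 4-bit size: 275.0 GB
print(f"BF16: ~{bf16_gb:.0f} GB, NVFP4 (nominal): ~{nvfp4_gb:.0f} GB")
```

Actual checkpoint sizes will differ from these nominal values, typically upward for the quantized model because of the overheads listed above.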
Data:
- Nemotron-Pretraining-Code-v3: 173B tokens of fresh code data from GitHub through September 30, 2025.
- Nemotron-Pretraining-Legal-v1: A collection of synthetic datasets intended to improve the legal capabilities of LLMs.
- Nemotron-Pretraining-Specialized-v1.2: A collection of synthetic datasets aimed at improving LLM capabilities on factual recall, moral scenarios, and diverse generative and multiple-choice questions.
- Nemotron-Posttraining-v3: A collection of post-training datasets for improving agentic, reasoning, and general model capabilities during SFT and RL.
Model Recipes:
Tech Report
More technical details are available in the Tech Report.