Demystifying Data-Driven Probabilistic Medium-Range Weather Forecasting

The recent revolution in data-driven methods for weather forecasting has lead to a fragmented landscape of complex, bespoke architectures and training strategies, obscuring the fundamental drivers of forecast accuracy. Here, we demonstrate that state-of-the-art probabilistic skill requires neither intricate architectural constraints nor specialized training heuristics. We introduce a scalable framework for learning multi-scale atmospheric dynamics by combining a directly downsampled latent space with a history-conditioned local projector that resolves high-resolution physics. We find that this simple design is robust to the choice of probabilistic estimator, seamlessly supporting stochastic interpolants, diffusion models, and CRPS-based ensemble training. Validated against the Integrated Forecasting System and the deep learning probabilistic model GenCast, our framework achieves statistically significant improvements on most of the variables. These results suggest that for medium-range prediction, scaling a unified, general-purpose model can be more effective than relying on domain-specific complexity or probabilistic estimation framework.

Authors

Suman Ravuri (NVIDIA)
Ira Shokar (NVIDIA)
Edoardo Calvello (NVIDIA)
Mohammad Shoaib Abbas (NVIDIA)
Peter Harrington (NVIDIA)
Ashay Subramaniam (NVIDIA)

Publication Date

Uploaded Files