CoDiCast: Conditional Diffusion Model for Weather Prediction with Uncertainty Quantification
Jimeng Shi, Bowen Jin, Jiawei Han, Giri Narasimhan
arXiv - PHYS - Atmospheric and Oceanic Physics, 2024-09-09 (arXiv:2409.05975)
Citations: 0
Abstract
Accurate weather forecasting is critical for science and society. Yet existing
methods have not simultaneously achieved high accuracy, low uncertainty, and
high computational efficiency. On the one hand, ensemble forecasting (i.e.,
generating a set of diverse predictions) is often employed to quantify the
uncertainty in weather predictions.
However, traditional ensemble numerical weather prediction (NWP) is
computationally intensive. On the other hand, most existing machine
learning-based weather prediction (MLWP) approaches are efficient and accurate.
Nevertheless, they are deterministic and cannot capture the uncertainty of
weather forecasting. In this work, we propose CoDiCast, a conditional diffusion
model to generate accurate global weather prediction, while achieving
uncertainty quantification with ensemble forecasts and modest computational
cost. The key idea is to simulate a conditional version of the reverse
denoising process in diffusion models, which starts from pure Gaussian noise to
generate realistic weather scenarios for a future time point. Each denoising
step is conditioned on observations from the recent past. Ensemble forecasts
are obtained by repeating this sampling from independent Gaussian noise draws,
and the resulting ensemble is used to quantify uncertainty. CoDiCast is trained
on a decade of ERA5 reanalysis
data from the European Centre for Medium-Range Weather Forecasts (ECMWF).
Experimental results demonstrate that our approach outperforms several existing
data-driven methods in accuracy. CoDiCast can generate 3-day global weather
forecasts, at 6-hour steps and $5.625^\circ$ latitude-longitude resolution, for
more than 5 variables, in about 12 minutes on a commodity A100 GPU machine with
80 GB of memory. The open-source code is provided at
https://github.com/JimengShi/CoDiCast.
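
The sampling idea described in the abstract (start from pure Gaussian noise, denoise step by step while conditioning on recent observations, and repeat with fresh noise to form an ensemble) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the linear noise schedule, the number of diffusion steps `T`, the function names, and especially `predict_noise` (a toy stand-in for a trained conditional network) are assumptions made for illustration. Only the grid size follows from the abstract, since $5.625^\circ$ resolution implies 32 latitudes by 64 longitudes.

```python
import numpy as np

# Grid implied by 5.625-degree resolution: 180/5.625 = 32, 360/5.625 = 64.
N_LAT, N_LON = int(180 / 5.625), int(360 / 5.625)
N_VARS = 5   # assumption: the abstract only says "over 5 variables"
T = 50       # assumption: number of diffusion steps (not given in the abstract)

# Assumed linear noise schedule, as in a standard DDPM setup.
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)


def predict_noise(x_t, t, past):
    """Hypothetical stand-in for a trained conditional denoiser.

    A real model would be a neural network that sees the noisy state x_t,
    the step index t, and the recent observations `past`. This toy version
    guesses the clean state as a blend of the latest observation and x_t,
    then converts that guess into a noise estimate.
    """
    x0_guess = 0.9 * past[-1] + 0.1 * x_t
    return (x_t - np.sqrt(alpha_bars[t]) * x0_guess) / np.sqrt(1.0 - alpha_bars[t])


def sample_forecast(past, rng):
    """Run one conditional reverse-denoising chain to get one forecast."""
    x = rng.standard_normal(past[-1].shape)  # start from pure Gaussian noise
    for t in reversed(range(T)):
        eps_hat = predict_noise(x, t, past)  # every step is conditioned on `past`
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        # No noise is added at the final step (t == 0).
        noise = rng.standard_normal(x.shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x


def ensemble_forecast(past, n_members, seed=0):
    """Repeat the sampling with fresh noise to build an ensemble."""
    rng = np.random.default_rng(seed)
    return np.stack([sample_forecast(past, rng) for _ in range(n_members)])


# Two synthetic "recent past" states standing in for real observations.
past = np.random.default_rng(1).standard_normal((2, N_VARS, N_LAT, N_LON))
members = ensemble_forecast(past, n_members=4)
ens_mean = members.mean(axis=0)      # point forecast
ens_spread = members.std(axis=0)     # per-grid-cell uncertainty estimate
print(members.shape, float(ens_spread.mean()))
```

In this sketch each call to `sample_forecast` yields one plausible state for a single future time point, and the per-cell standard deviation across members serves as a simple uncertainty measure. A 3-day forecast at 6-hour steps spans 12 lead times; how those are chained together is not stated in the abstract, so the sketch covers one lead time only.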