The impact of internal variability on benchmarking deep learning climate emulators
Björn Lütjens, Raffaele Ferrari, Duncan Watson-Parris, Noelle Selin
arXiv - CS - Computational Engineering, Finance, and Science, 2024-08-09
DOI: arxiv-2408.05288 (https://doi.org/arxiv-2408.05288)
Abstract
Full-complexity Earth system models (ESMs) are computationally very
expensive, limiting their use in exploring the climate outcomes of multiple
emission pathways. More efficient emulators that approximate ESMs can directly
map emissions onto climate outcomes, and benchmarks are being used to evaluate
their accuracy on standardized tasks and datasets. We investigate a popular
benchmark in data-driven climate emulation, ClimateBench, on which deep
learning-based emulators are currently achieving the best performance. We
implement a linear regression-based emulator, akin to pattern scaling, and find
that it outperforms the incumbent 100M-parameter deep learning foundation
model, ClimaX, on 3 out of 4 regionally-resolved surface-level climate
variables. While the emulation of surface temperature is expected to be predominantly
linear, this result is surprising for precipitation. We find that
this outcome results from high levels of internal variability in the
benchmark targets. To address internal variability, we update the benchmark
targets with ensemble averages from the MPI-ESM1.2-LR model, computed over 50
instead of 3 climate simulations per emission pathway. Using the new targets,
we show that linear pattern scaling continues to be more accurate on
temperature, but can be outperformed by a deep learning-based model for
emulating precipitation. We publish our code, data, and an interactive tutorial
at github.com/blutjens/climate-emulator.
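
As an illustration of the two ideas above, the following is a minimal sketch on synthetic data, not the authors' implementation. It assumes the common pattern-scaling form in which each grid cell is regressed linearly on global-mean temperature, and it models internal variability as independent Gaussian noise per ensemble member. The function names, grid size, and noise level are hypothetical choices made for this example.

```python
import numpy as np


def fit_pattern_scaling(global_t, local_field):
    """Fit local = slope * global_t + intercept independently per grid cell.

    global_t:    (n_time,) global-mean temperature (assumed predictor)
    local_field: (n_time, n_lat, n_lon) target variable on a grid
    """
    n_time = global_t.shape[0]
    design = np.stack([global_t, np.ones(n_time)], axis=1)  # (n_time, 2)
    flat = local_field.reshape(n_time, -1)                  # (n_time, n_cells)
    coef, *_ = np.linalg.lstsq(design, flat, rcond=None)    # (2, n_cells)
    grid = local_field.shape[1:]
    return coef[0].reshape(grid), coef[1].reshape(grid)


def predict_pattern_scaling(global_t, slope, intercept):
    """Evaluate the fitted per-cell linear maps for each time step."""
    return global_t[:, None, None] * slope + intercept


def toy_benchmark(n_members, seed=0):
    """Score pattern scaling against an n_members ensemble-mean target.

    Returns (rmse against the noisy target, max abs error of fitted slopes).
    All data here are synthetic; the noise model is an assumption.
    """
    rng = np.random.default_rng(seed)
    n_time, n_lat, n_lon = 80, 4, 8
    global_t = np.linspace(0.0, 3.0, n_time)
    true_slope = rng.uniform(0.5, 2.0, size=(n_lat, n_lon))
    forced = global_t[:, None, None] * true_slope  # noise-free forced response
    noise = rng.normal(0.0, 1.0, size=(n_members, n_time, n_lat, n_lon))
    target = (forced[None] + noise).mean(axis=0)   # ensemble-mean target

    slope, intercept = fit_pattern_scaling(global_t, target)
    pred = predict_pattern_scaling(global_t, slope, intercept)
    rmse = float(np.sqrt(np.mean((pred - target) ** 2)))
    slope_err = float(np.max(np.abs(slope - true_slope)))
    return rmse, slope_err


if __name__ == "__main__":
    for n in (3, 50):
        rmse, slope_err = toy_benchmark(n_members=n)
        print(f"members={n:2d}  rmse={rmse:.3f}  max slope error={slope_err:.3f}")
```

In this toy setting the score against a 3-member mean is dominated by residual noise in the target rather than by emulator error, while the 50-member mean leaves much less noise in the target, which is the effect the updated benchmark targets are meant to reduce.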