The impact of internal variability on benchmarking deep learning climate emulators

arXiv - CS - Computational Engineering, Finance, and Science. Pub Date: 2024-08-09. DOI: arxiv-2408.05288

Björn Lütjens, Raffaele Ferrari, Duncan Watson-Parris, Noelle Selin

Abstract

Full-complexity Earth system models (ESMs) are computationally very expensive, limiting their use in exploring the climate outcomes of multiple emission pathways. More efficient emulators that approximate ESMs can directly map emissions onto climate outcomes, and benchmarks are being used to evaluate their accuracy on standardized tasks and datasets. We investigate a popular benchmark in data-driven climate emulation, ClimateBench, on which deep learning-based emulators are currently achieving the best performance. We implement a linear regression-based emulator, akin to pattern scaling, and find that it outperforms the incumbent 100M-parameter deep learning foundation model, ClimaX, on 3 out of 4 regionally-resolved surface-level climate variables. While emulating surface temperature is expected to be predominantly linear, this result is surprising for emulating precipitation. We identify that this outcome is a result of high levels of internal variability in the benchmark targets. To address internal variability, we update the benchmark targets with ensemble averages from the MPI-ESM1.2-LR model that contain 50 instead of 3 climate simulations per emission pathway. Using the new targets, we show that linear pattern scaling continues to be more accurate on temperature, but can be outperformed by a deep learning-based model for emulating precipitation. We publish our code, data, and an interactive tutorial at github.com/blutjens/climate-emulator.
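The two ideas in the abstract, a per-grid-cell linear emulator and targets built from ensemble averages, can be illustrated with a small synthetic sketch. This is not the authors' code (that is published at the repository named above). The data are random, the choice of global-mean temperature as the single predictor is an assumption based on the conventional meaning of pattern scaling, and all function names are hypothetical. The forced response is linear by construction, so the only thing that changes between the two runs is how much internal variability remains in the targets.

```python
import numpy as np


def fit_pattern_scaling(global_t, local_field):
    """Least-squares fit of local_field[t, cell] ~ slope[cell] * global_t[t] + intercept[cell]."""
    design = np.column_stack([global_t, np.ones_like(global_t)])  # shape (time, 2)
    coef, *_ = np.linalg.lstsq(design, local_field, rcond=None)   # shape (2, cells)
    return coef[0], coef[1]


def predict_pattern_scaling(global_t, slope, intercept):
    """Apply the fitted per-cell line to a global-mean temperature series."""
    return np.outer(global_t, slope) + intercept


def simulate_ensemble_mean(rng, signal, noise_std, n_members):
    """Average n_members synthetic realizations: shared signal plus independent noise."""
    noise = rng.normal(0.0, noise_std, size=(n_members,) + signal.shape)
    return (signal[None] + noise).mean(axis=0)


def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))


# Synthetic setup (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n_years, n_cells = 80, 200
global_t = np.linspace(0.0, 4.0, n_years)      # global-mean warming in K
true_slope = rng.uniform(0.5, 2.0, n_cells)    # local response per K of global warming
true_signal = np.outer(global_t, true_slope)   # forced response, linear by construction
noise_std = 1.0                                # stand-in for internal variability

results = {}
for n_members in (3, 50):
    # Training and evaluation targets are both ensemble means of n_members runs.
    train = simulate_ensemble_mean(rng, true_signal, noise_std, n_members)
    test = simulate_ensemble_mean(rng, true_signal, noise_std, n_members)
    slope, intercept = fit_pattern_scaling(global_t, train)
    pred = predict_pattern_scaling(global_t, slope, intercept)
    results[n_members] = rmse(pred, test)

print(results)
```

In this toy setting the same linear emulator scores a noticeably larger RMSE against 3-member targets than against 50-member targets, because the residual noise in an ensemble mean shrinks roughly as the noise standard deviation divided by the square root of the member count. The sketch says nothing about whether a deep learning model would do better on precipitation; it only shows why noisy targets can mask differences between emulators.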
