Explaining heatwaves with machine learning

Impact factor: 3.0 | CAS Zone 3 (Earth Sciences) | JCR Q2 (Meteorology & Atmospheric Sciences)
Sebastian Buschow, Jan Keller, Sabrina Wahl
Quarterly Journal of the Royal Meteorological Society. Published 2023-12-20. DOI: 10.1002/qj.4642 (https://doi.org/10.1002/qj.4642)
Citations: 0

Abstract

Heatwaves are known to arise from the interplay between large-scale climate variability, synoptic weather patterns, and regional- to local-scale surface processes. While recent research has made important progress on each individual contributing factor, ways to incorporate several or all of them in a unified analysis are still lacking. In this study, we consider a wide range of possible predictor variables from the ERA5 reanalysis and ask how much information on heatwave occurrence in Europe can be learned from each of them. To simplify the problem, we first adapt the recently developed logistic principal component analysis to the task of compressing large binary heatwave fields into a small number of interpretable principal components. The relationships between heatwaves and various climate variables can then be learned by a neural network. Starting from the simple notion that the importance of a variable is given by its impact on the performance of our statistical model, we arrive naturally at the definition of Shapley values. Classic results of game theory show that this is the only fair way of distributing the overall success of a model among its inputs. We find a nonlinear model that explains 70% of the reduced heatwave variability. The largest individual contribution (27% of the 70%) comes from upper-level geopotential; top-level soil moisture is in second place (15%). Beyond this decomposition, Shapley interaction values enable us to quantify overlapping information and positive synergies between all pairs of predictors.
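The attribution idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a hypothetical table `TOY_SCORES` giving the skill of a model trained on each subset of three made-up predictors; the numbers are invented for illustration and are not results from the paper. The pairwise function uses the Shapley interaction index in its standard game-theoretic form, whose normalization may differ from the interaction values used in the paper.

```python
from itertools import combinations
from math import factorial

# Hypothetical predictor names and invented skill scores (NOT from the paper).
PLAYERS = ["geopotential", "soil_moisture", "sst"]
TOY_SCORES = {
    frozenset(): 0.00,
    frozenset({"geopotential"}): 0.40,
    frozenset({"soil_moisture"}): 0.30,
    frozenset({"sst"}): 0.10,
    frozenset({"geopotential", "soil_moisture"}): 0.60,
    frozenset({"geopotential", "sst"}): 0.45,
    frozenset({"soil_moisture", "sst"}): 0.35,
    frozenset(PLAYERS): 0.70,
}


def value(subset):
    """Skill of the (hypothetical) model that sees only `subset`."""
    return TOY_SCORES[frozenset(subset)]


def shapley_values(players, v):
    """Exact Shapley values by enumerating all subsets of the other players."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that p joins.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (v(s | {p}) - v(s))
        phi[p] = total
    return phi


def shapley_interaction(players, v, i, j):
    """Pairwise Shapley interaction index for players i and j.

    Negative values indicate overlapping information, positive values synergy.
    """
    n = len(players)
    others = [q for q in players if q not in (i, j)]
    total = 0.0
    for k in range(n - 1):
        weight = factorial(k) * factorial(n - k - 2) / factorial(n - 1)
        for subset in combinations(others, k):
            s = frozenset(subset)
            # Joint gain minus the two individual gains, given coalition s.
            delta = v(s | {i, j}) - v(s | {i}) - v(s | {j}) + v(s)
            total += weight * delta
    return total


phi = shapley_values(PLAYERS, value)
inter = shapley_interaction(PLAYERS, value, "geopotential", "soil_moisture")

for name, contribution in phi.items():
    print(f"{name}: {contribution:.3f}")
print(f"interaction(geopotential, soil_moisture): {inter:.3f}")
```

By construction, the Shapley values sum to the skill of the full model minus that of the empty model, which is what allows the total explained variability to be split among the predictors. Exact enumeration grows exponentially with the number of predictors, so for a realistic predictor set some form of sampling or approximation would likely be needed.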
Source journal: Quarterly Journal of the Royal Meteorological Society
CiteScore: 16.80
Self-citation rate: 4.50%
Articles published: 163
Review time: 3-8 weeks
About the journal: The Quarterly Journal of the Royal Meteorological Society is published by the Royal Meteorological Society. It aims to communicate and document new research in the atmospheric sciences and related fields, and is considered one of the leading publications in meteorology worldwide. It accepts articles, comprehensive review articles, and comments on published papers. It is published eight times a year, with additional special issues, and has a wide readership of scientists in the atmospheric and related fields. It is indexed and abstracted in various databases, including Advanced Polymers Abstracts, Agricultural Engineering Abstracts, CAB Abstracts, CABDirect, COMPENDEX, CSA Civil Engineering Abstracts, Earthquake Engineering Abstracts, Engineered Materials Abstracts, Science Citation Index, SCOPUS, and Web of Science.