Unveiling flood-generating mechanisms using circular statistics-based machine learning approach without the need for discharge data during inference

IF 2.6; Zone 4, Environmental Science and Ecology; JCR Q2, WATER RESOURCES
Zhi Zhang, Dagang Wang, Xinxin Wu, Yiwen Mei, Jianxiu Qiu, Jinxin Zhu
Journal: Hydrology Research. Published: 2023-09-20. DOI: 10.2166/nh.2023.058 (https://doi.org/10.2166/nh.2023.058)

Abstract

Understanding the drivers of flooding is essential for flood disaster prevention. However, conventional flood prediction methods are hindered by their reliance on local discharge data, which can be constrained by limited spatial resolution. To address this limitation, we present a machine learning model that can categorize floods without requiring discharge data during inference. We first use circular statistics to calculate the relative importance of three candidate flood-generating mechanisms. Global land areas are then classified into three primary categories and eight sub-categories based on the proportions of relative importance. A random forest (RF) model is then applied to identify the flood types under the assumption that discharge data are unavailable. The circular statistics show that, globally, soil moisture excess is the most influential driver of floods, followed by extreme precipitation and snowmelt, with average relative importances of 0.535, 0.387, and 0.078, respectively. The RF model reproduces the three primary flood categories well, with an accuracy of 0.701 and an F1-score of 0.692 in 10-fold cross-validation. The trained grid-based model offers a fast and efficient way to analyze flood mechanisms, even where discharge data are limited.
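The circular statistics mentioned in the abstract treat the day of year on which an event occurs as an angle on a circle, so that late-December and early-January events count as neighbours. The following is a minimal sketch of that idea, not the authors' implementation: the function names, the 365.25-day year, and the rule of labelling a grid cell by its largest relative importance are all assumptions made for illustration. The abstract does not state how the three primary categories are actually delimited.

```python
import math

DAYS_PER_YEAR = 365.25  # assumed year length for converting days to angles


def circular_stats(days_of_year):
    """Return (mean_day, concentration) for a list of event days.

    mean_day is the circular mean date; concentration is the mean
    resultant length, from 0 (events spread evenly over the year)
    to 1 (all events on the same day).
    """
    angles = [2.0 * math.pi * d / DAYS_PER_YEAR for d in days_of_year]
    x = sum(math.cos(a) for a in angles) / len(angles)
    y = sum(math.sin(a) for a in angles) / len(angles)
    mean_angle = math.atan2(y, x) % (2.0 * math.pi)
    mean_day = mean_angle * DAYS_PER_YEAR / (2.0 * math.pi)
    concentration = math.hypot(x, y)
    return mean_day, concentration


def dominant_mechanism(importance):
    """Hypothetical helper: pick the mechanism with the largest share.

    This "largest share wins" rule is an assumption, not the paper's
    classification scheme.
    """
    return max(importance, key=importance.get)


# Floods clustered around New Year: the arithmetic mean of these days
# would be 125, while the circular mean stays close to the start of the year.
mean_day, concentration = circular_stats([360, 5, 10])

# Global average relative importances reported in the abstract.
global_average = {
    "soil moisture excess": 0.535,
    "extreme precipitation": 0.387,
    "snowmelt": 0.078,
}
label = dominant_mechanism(global_average)
```

A concentration near 1 indicates strongly seasonal flooding, which is what makes a comparison between flood dates and the dates of candidate drivers informative.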
Source journal: Hydrology Research (WATER RESOURCES)
CiteScore: 5.00
Self-citation rate: 7.40%
Articles published: 0
Review time: 3.8 months
About the journal: Hydrology Research provides international coverage on all aspects of hydrology in its widest sense, and welcomes the submission of papers from across the subject. While emphasis is placed on studies of the hydrological cycle, the Journal also covers the physics and chemistry of water. Hydrology Research is intended to be a link between basic hydrological research and the practical application of scientific results within the broad field of water management.