Prediction of flood sensitivity based on Logistic Regression, eXtreme Gradient Boosting, and Random Forest modeling methods

Ying Wu, Zhiming Zhang, Xiaotian Qi, Wenhan Hu, Shuai Si
Water Science & Technology. Published 2024-05-07. DOI: 10.2166/wst.2024.146 (https://doi.org/10.2166/wst.2024.146)

Abstract

Floods are among the most destructive disasters, causing loss of life and property worldwide every year. This study aimed to identify the best-performing model for flood sensitivity assessment and to analyze the key characteristic factors. The spatial pattern of flood sensitivity was evaluated using three machine learning (ML) models: Logistic Regression (LR), eXtreme Gradient Boosting (XGBoost), and Random Forest (RF). Suqian City in Jiangsu Province was selected as the study area, and a random sample dataset of historical flood points was constructed. Fifteen meteorological, hydrological, and geographical spatial variables were considered in the flood sensitivity assessment, of which 12 were selected based on a multicollinearity analysis. Of the three models compared, RF achieved the highest AUC value, the highest accuracy, and the best comprehensive evaluation performance, indicating that it is a reliable and effective model for flood risk assessment. The main output of this study is a flood sensitivity map divided into five categories, ranging from very low to very high sensitivity. According to the RF model (the most accurate of the three), the high-risk area covers about 44% of the study area and is mainly concentrated in the central, eastern, and southern parts of the old city area.
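The two evaluation steps named in the abstract, ranking models by AUC and binning predicted sensitivity into five categories, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names are hypothetical, and the equal-interval breaks (0.2, 0.4, 0.6, 0.8) are an assumption, since the abstract does not say how the five classes were delimited.

```python
from bisect import bisect_right

# The five categories named in the abstract, from lowest to highest.
LABELS = ["very low", "low", "moderate", "high", "very high"]


def auc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    Equals the fraction of (flooded, non-flooded) pairs in which the
    flooded point receives the higher score; ties count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_class(prob, breaks=(0.2, 0.4, 0.6, 0.8)):
    """Map a predicted flood probability to one of five categories.

    The equal-interval breaks are an assumption for illustration only.
    """
    return LABELS[bisect_right(breaks, prob)]


if __name__ == "__main__":
    # Toy data (invented): 1 = historical flood point, 0 = non-flood point.
    y_true = [1, 1, 1, 0, 0, 0]
    toy_scores = {
        "model_a": [0.9, 0.8, 0.4, 0.5, 0.3, 0.1],
        "model_b": [0.6, 0.4, 0.7, 0.5, 0.65, 0.2],
    }
    for name, scores in toy_scores.items():
        print(name, round(auc(y_true, scores), 3))
    print([sensitivity_class(p) for p in toy_scores["model_a"]])
```

In practice the scores would come from fitted LR, XGBoost, and RF classifiers evaluated on held-out flood and non-flood points, and the class breaks would follow whatever scheme the full paper specifies.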