Landslide Modeling in a Tropical Mountain Basin Using Machine Learning Algorithms and Shapley Additive Explanations

IF 3.5 Q2 ENVIRONMENTAL SCIENCES
J. Vega, F. H. Sepúlveda-Murillo, M. Parra
DOI: 10.1177/11786221231195824 (https://doi.org/10.1177/11786221231195824)
Journal: Air Soil and Water Research
Publication date: 2023-01-01
Publication type: Journal Article
Citations: 0

Abstract

Landslides are a geological hazard commonly induced by rainfall, earthquakes, deforestation, or human activity. They cause loss of human life every year, especially on highlands and mountain slopes, with serious impacts that threaten communities and their infrastructure. The incidence and recurrence of landslides are conditioned by several factors related to soil properties, geological structure, climatic conditions, soil cover, and water flow. Colombia is one of the countries most affected by this type of natural hazard, as well as by floods; together they are the natural phenomena that pose the most severe risks to communities. In this work, we combined a statistical analysis of the landslide conditioning factors, Machine Learning Algorithms (MLA), and Geographic Information Systems (GIS) to evaluate a flexible and agile methodology for estimating landslide susceptibility and delineating areas prone to landslide occurrence. The MLA were validated in a case study of the “La Liboriana” River basin, located in the Municipality of Salgar in the Colombian Andes, where Landslide Susceptibility Maps (LSMs) were obtained. The MLA results show potential for regional landslide mapping and can support strategies aimed at minimizing impacts on human lives, infrastructure, and the natural environment. Building on these findings, proactive measures can be devised to safeguard vulnerable areas, mitigate risks, and protect communities. Seven supervised MLA were employed: two regression algorithms (Logistic) and five decision tree algorithms (Recursive Partitioning and Regression Trees [RPART], Conditional Inference Trees [CTREE], Random Forest [RF], Ranger, and Extreme Gradient Boosting Algorithm [XGBoost]). An LSM was produced for each MLA. Considering different performance metrics, the RF model yields the best classification performance, with an area under the receiver operating characteristic (ROC) curve of 95% and an accuracy of 90%, providing the most representative results. Finally, the contribution of each landslide conditioning factor to the predictions of the RF model is explained using the SHAP method.
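To make the last step concrete, the following is a minimal sketch of the idea behind SHAP: it computes exact Shapley values for a single map cell by enumerating every coalition of conditioning factors. It is not the authors' code. The toy scoring function, the factor names (`slope_deg`, `rain_mm`, `forest_cover`), and all numeric values are hypothetical, and the study's RF model would normally be explained with an approximation tailored to tree ensembles rather than brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample, by enumerating all coalitions.

    Features outside a coalition are held at their baseline value.
    Cost grows as 2^n, so this is only practical for a handful of features.
    """
    names = list(x)
    n = len(names)
    phi = {}
    for name in names:
        others = [f for f in names if f != name]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            # Shapley weight for a coalition of size k.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                without = {f: (x[f] if f in coalition else baseline[f])
                           for f in names}
                with_f = dict(without)
                with_f[name] = x[name]
                # Marginal contribution of `name` when added to this coalition.
                total += weight * (predict(with_f) - predict(without))
        phi[name] = total
    return phi


def toy_susceptibility(cell):
    """Hypothetical score standing in for a trained model's prediction."""
    score = 0.02 * cell["slope_deg"] + 0.001 * cell["rain_mm"]
    # Interaction term: steep, wet, sparsely forested cells score higher.
    score += 0.0002 * cell["slope_deg"] * cell["rain_mm"] * (1.0 - cell["forest_cover"])
    return score


# Hypothetical cell to explain and a hypothetical reference (baseline) cell.
cell = {"slope_deg": 35.0, "rain_mm": 200.0, "forest_cover": 0.2}
baseline = {"slope_deg": 10.0, "rain_mm": 100.0, "forest_cover": 0.8}

phi = shapley_values(toy_susceptibility, cell, baseline)
for name, value in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:>12}: {value:+.4f}")
```

The per-factor contributions sum to the difference between the score for the explained cell and the score for the baseline cell, which is the additivity property that lets SHAP attribute a prediction to individual conditioning factors.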
Source journal
Air Soil and Water Research (Environmental Sciences)
CiteScore: 7.80
Self-citation rate: 5.30%
Articles published: 27
Review time: 8 weeks
Journal description: Air, Soil & Water Research is an open access, peer-reviewed international journal covering all areas of research into soil, air, and water. The journal looks at each aspect individually, as well as how they interact with each other and with different components of the environment. This includes properties (physical, chemical, biochemical, and biological), analysis, microbiology, chemicals and pollution, consequences for plants and crops, soil hydrology, changes and consequences of change, social issues, and more. The journal welcomes readers from all fields, but aims to be particularly useful to analytical and water chemists and geologists, as well as chemical, environmental, petrochemical, water treatment, geophysics, and geological engineers. The journal has a multi-disciplinary approach and includes research, results, theory, models, analysis, applications, and reviews. Lab and field work are both in scope. Of particular interest are manuscripts relating to environmental concerns. Other possible topics include, but are not limited to: properties and analysis covering all areas of research into soil, air, and water, individually and in interaction with each other and with different components of the environment; soil hydrology and microbiology; changes and consequences of environmental change; chemicals and pollution.