Landslide susceptibility assessment in Eastern Himalayas, India: a comprehensive exploration of four novel hybrid ensemble data driven techniques integrating explainable artificial intelligence approach

IF 2.8 | Zone 4 (Environmental Science and Ecology) | Q3 ENVIRONMENTAL SCIENCES
Sumon Dey, Swarup Das, Sujit Kumar Roy
Journal: Environmental Earth Sciences, 83(22)
Published: 2024-11-13
DOI: 10.1007/s12665-024-11945-z
URL: https://link.springer.com/article/10.1007/s12665-024-11945-z
Citations: 0

Abstract

Data-driven methods have advanced landslide susceptibility modelling considerably. However, model performance depends on the geo-environmental factors used, and the appropriate set of factors varies from one location to another, which leaves a persistent gap that this study addresses. The study assesses landslide susceptibility for the Darjeeling hills in the Eastern Himalayan region using sixteen causative geo-environmental factors. The causal factors were selected through a two-stage procedure, denoted PCC-BA, that combines Pearson's correlation coefficient (PCC) and the Boruta algorithm. The dataset was split randomly in a 70:30 ratio into training and test data, and 30% of the training data was taken as a validation dataset. Four data-driven models, namely K-nearest neighbour (KNN), Boosted Tree (BT), Gradient Boosting Machines (GBM) and an ensembled neural network with Principal Component Analysis (PCA-NN), were adopted, and four novel ensembles, namely KNN-BT, PCA-NN-BT, GBM-KNN and GBM-PCA-NN, were constructed. The susceptibility maps were grouped into five classes: very low (VL), low (L), medium (M), high (H) and very high (VH) susceptibility. The models were evaluated with the area under the receiver operating characteristic curve (AUC) on the training, testing and validation datasets: KNN-BT attained 0.943, 0.889 and 0.944, respectively; PCA-NN-BT attained 0.934, 0.876 and 0.943; GBM-KNN attained 0.959, 0.897 and 0.957; and GBM-PCA-NN attained 0.956, 0.889 and 0.962. An explainable artificial intelligence (ex-AI) method, the partial dependence profile (PDP), was used to quantify the effect of the causal factors on all four ensemble models. The study aims to support better disaster mitigation policy by bridging the gap between contemporary machine learning approaches and geospatial applications, thereby helping to enhance the resilience of inhabitants in landslide-prone areas of the hilly portion of Darjeeling district.
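The mechanics named in the abstract (the 70:30 split with a 30% validation subset, AUC scoring, and a partial dependence profile) can be illustrated with a minimal, standard-library Python sketch. This is not the authors' code. It assumes the validation subset is carved out of the training portion, which the abstract does not specify. The function names, the toy model and the factor names `slope_deg` and `near_road` are hypothetical, since the abstract does not list the sixteen factors or describe how the ensembles combine their members.

```python
import random

def split_indices(n_samples, test_frac=0.30, val_frac=0.30, seed=0):
    """Shuffle indices, split 70:30 into train/test, then carve
    val_frac of the training part out as validation (an assumption)."""
    rng = random.Random(seed)
    idx = list(range(n_samples))
    rng.shuffle(idx)
    n_test = round(n_samples * test_frac)
    test, train_full = idx[:n_test], idx[n_test:]
    n_val = round(len(train_full) * val_frac)
    val, train = train_full[:n_val], train_full[n_val:]
    return train, val, test

def roc_auc(labels, scores):
    """AUC as the probability that a random positive sample scores
    higher than a random negative one (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def partial_dependence(predict, rows, feature, grid):
    """For each grid value, overwrite one feature in every row and
    average the model's predictions over all rows."""
    profile = []
    for value in grid:
        preds = [predict({**row, feature: value}) for row in rows]
        profile.append(sum(preds) / len(preds))
    return profile

def toy_model(row):
    # Hypothetical stand-in for a fitted susceptibility model.
    return 0.01 * row["slope_deg"] + 0.2 * row["near_road"]

train, val, test = split_indices(1000)
print(len(train), len(val), len(test))               # 490 210 300
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75

rows = [{"slope_deg": 10, "near_road": 0},
        {"slope_deg": 30, "near_road": 1}]
pdp = partial_dependence(toy_model, rows, "slope_deg", [0, 20, 40])
print(pdp)  # average prediction rises as slope_deg increases
```

The pairwise AUC is quadratic in the sample count, which is fine for illustration. A rank-based formulation would be the usual choice for a full landslide inventory.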

Source journal
Environmental Earth Sciences (Environmental Sciences: Geosciences, Multidisciplinary)
CiteScore: 5.10
Self-citation rate: 3.60%
Articles published: 494
Review time: 8.3 months
About the journal: Environmental Earth Sciences is an international multidisciplinary journal concerned with all aspects of interaction between humans, natural resources, ecosystems, special climates or unique geographic zones, and the earth. Topics include: water and soil contamination caused by waste management and disposal practices; environmental problems associated with transportation by land, air, or water; geological processes that may impact biosystems or humans; man-made or naturally occurring geological or hydrological hazards; environmental problems associated with the recovery of materials from the earth; environmental problems caused by extraction of minerals, coal, and ores, as well as oil and gas, water and alternative energy sources; environmental impacts of exploration and recultivation; environmental impacts of hazardous materials; management of environmental data and information in data banks and information systems; and dissemination of knowledge on techniques, methods, approaches and experiences to improve and remediate the environment. In pursuit of these topics, the geoscientific disciplines are invited to contribute their knowledge and experience. Major disciplines include: hydrogeology, hydrochemistry, geochemistry, geophysics, engineering geology, remediation science, natural resources management, environmental climatology and biota, environmental geography, soil science and geomicrobiology.