Enhancing landslide susceptibility mapping in the Himalayas: geospatial and machine learning with explainable AI (XAI)

IF 7.2 | Zone 1, Earth Sciences | Q1 GEOSCIENCES, MULTIDISCIPLINARY

Gondwana Research Pub Date : 2025-09-06 DOI:10.1016/j.gr.2025.08.003

Manas Utthasini, Idhayachandhiran Ilampooranan, Suraj Kumar Singh, Shruti Kanga, Pankaj Kumar, Krishnagopal Halder, Biswajeet Pradhan, Amit Kumar Srivastava, Ranit Sundar Chatterjee, Rabin Chakrabortty, Tarig Ali, Gowhar Meraj

Gondwana Research, Volume 149, Pages 262-290. Article URL: https://www.sciencedirect.com/science/article/pii/S1342937X25002679

Citation count: 0

Abstract

Landslides present a critical hazard in the Himalayas, where steep topography, intense rainfall, and tectonic activity converge to destabilize slopes. Accurate delineation of high-susceptibility zones is essential to safeguard lives, infrastructure, and ecosystems. Here, we construct a comprehensive Landslide Susceptibility Map (LSM) for Uttarakhand, a landslide-prone state in northern India, by integrating advanced ensemble machine learning (ML) with explainable AI. Our analysis comprises 35 geo-environmental variables, ranging from historical landslide inventories and remote sensing data to GIS-based geomorphological, hydrological, and anthropogenic layers. We evaluate six ML models (Logistic Regression, Support Vector Machine, Random Forest, Extra Trees, Gradient Boosting, and eXtreme Gradient Boosting) before consolidating them into a stacking ensemble (SE), achieving an Area Under the Curve (AUC) of 0.987 on the training set and 0.979 on the test set. Across models, false-negative rates were low; Extra Trees minimized missed events (FNR = 3.5 %) but with a high false-positive rate (23.6 %), whereas XGBoost and the SE achieved a better sensitivity–specificity balance (FNR = 5.6 and 5.5 %, respectively) with comparatively lower false positives, favoring operational use. Spatial transferability to Sikkim was strong (Uttarakhand test accuracies 0.864–0.917; Sikkim 0.905–0.971), with XGBoost yielding the highest Sikkim test accuracy (0.971) and ensemble approaches (GB, XGBoost, SE) all exceeding 0.96, highlighting robust generalization across different Himalayan regions. Our ensemble model surpasses all individual models and classifies the study area into five susceptibility zones (very low to very high), with 18.20 % of Uttarakhand, particularly in Pithoragarh, Chamoli, and Rudraprayag districts, falling under high-susceptibility zones. 
Further interpretability is provided by SHapley Additive exPlanations (SHAP), which highlight key drivers of slope failure, including slope angle, fault proximity, and rainfall. Our findings highlight the value of combining robust ML techniques with geoscientific data, thereby enhancing hazard assessments and informing disaster risk reduction across the Himalayas and similarly vulnerable terrains worldwide.
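
The trade-off the abstract describes between missed events and false alarms can be made concrete with a minimal sketch of how the two rates are computed from a binary confusion matrix. The class `ConfusionCounts`, the two helper functions, and all counts below are illustrative assumptions; the counts are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts; 'positive' means a landslide cell."""
    tp: int  # landslide cells correctly flagged
    fn: int  # landslide cells missed
    fp: int  # stable cells wrongly flagged
    tn: int  # stable cells correctly cleared


def false_negative_rate(c: ConfusionCounts) -> float:
    """FNR = FN / (FN + TP), equivalently 1 - sensitivity."""
    return c.fn / (c.fn + c.tp)


def false_positive_rate(c: ConfusionCounts) -> float:
    """FPR = FP / (FP + TN), equivalently 1 - specificity."""
    return c.fp / (c.fp + c.tn)


# Hypothetical counts for illustration only (NOT the paper's data).
# 'aggressive' flags more cells: fewer misses, more false alarms.
# 'balanced' accepts a few more misses for far fewer false alarms.
aggressive = ConfusionCounts(tp=193, fn=7, fp=47, tn=153)
balanced = ConfusionCounts(tp=189, fn=11, fp=20, tn=180)

for name, counts in (("aggressive", aggressive), ("balanced", balanced)):
    fnr = false_negative_rate(counts)
    fpr = false_positive_rate(counts)
    print(f"{name}: FNR = {fnr:.1%}, FPR = {fpr:.1%}")
```

In hazard mapping a missed landslide is usually costlier than a false alarm, but a very high false-positive rate inflates the mapped high-susceptibility area. This is why a model with a slightly higher FNR and a much lower FPR can be preferable for operational use.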

Source journal: Gondwana Research (Geology: Geosciences, Multidisciplinary)

CiteScore: 12.90

Self-citation rate: 6.60%

Articles published: 298

Review time: 65 days

Journal description: Gondwana Research (GR) is an international journal that aims to promote high-quality research publications on all topics related to the solid Earth, particularly with reference to the origin and evolution of continents, continental assemblies, and their resources. GR is an "all earth science" journal with no restrictions on geological time, terrane, or theme. It covers a wide spectrum of topics in the geosciences, such as geology, geomorphology, palaeontology, structure, petrology, geochemistry, stable isotopes, geochronology, economic geology, exploration geology, engineering geology, geophysics, and environmental geology, among other themes, and provides a forum to integrate studies from different disciplines and different terrains. In addition to regular articles and thematic issues, the journal invites high-profile, state-of-the-art reviews on thrust area topics for its column, "GR FOCUS". Focus articles include short biographies and photographs of the authors. Short articles (within ten printed pages) for rapid publication, reporting important discoveries or innovative models of global interest, are considered under the category "GR LETTERS".