Evaluating Water‐Induced Soil Erosion Using Machine Learning: XGBoost as the Most Effective Model

Impact factor 3.7 · CAS Zone 2 (Agricultural and Forestry Sciences) · JCR Q2 (Environmental Sciences)
Ihtisham Khan, Kashif Khan, Kazimierz Bęcek, Muhammad Fahad Bilal
Journal: Land Degradation & Development (Journal Article)
Published: 2025-08-28
DOI: 10.1002/ldr.70152 (https://doi.org/10.1002/ldr.70152)
Citations: 0

Abstract

Soil erosion is a significant environmental concern that threatens agricultural activities, reduces soil fertility, and eventually impacts productivity. Assessing soil erosion is essential for effective planning and conservation initiatives in a basin or watershed. This study evaluates water‐induced soil erosion susceptibility using machine learning models, with a focus on the comparative performance of Random Forest (RF), k‐Nearest Neighbors (kNN), and Extreme Gradient Boosting (XGBoost). Unlike conventional approaches, this study emphasizes the effectiveness of ML‐based predictive modeling, rather than re‐identifying well‐established erosion‐controlling factors. A comprehensive dataset comprising topographic, climatic, and land use parameters was used to train and validate the models (80% training, 20% testing). The models were assessed based on multiple performance metrics, including sensitivity, specificity, Kappa coefficient, and area under the curve (AUC). Among the tested models, XGBoost demonstrated the highest predictive performance with an AUC of 0.91, sensitivity of 0.91, specificity of 0.89, and a Kappa index of 0.80. RF and kNN also performed well, with AUC values of 0.87 and 0.89, and Kappa values of 0.80 and 0.73, respectively. Field validation showed that XGBoost correctly predicted 78.7% of high‐risk erosion sites. The final susceptibility map classified 21.3% of the area as high‐risk, mainly concentrated in steep, sparsely vegetated uplands. These findings confirm the effectiveness of machine learning—particularly XGBoost—for accurate erosion risk mapping in data‐scarce, topographically diverse regions. The findings contribute to sustainable land management strategies, offering a scalable and adaptable approach for erosion risk assessment in diverse environmental settings.
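The four metrics named in the abstract (sensitivity, specificity, Kappa coefficient, and AUC) can be computed from a model's predictions on the held-out test split. The following is a minimal sketch in plain Python, not the authors' implementation: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made up for illustration and are not data from the study.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fn, fp, tn) for binary labels: 1 = eroded, 0 = not eroded."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fn, fp, tn


def sensitivity(tp: int, fn: int) -> float:
    """Share of truly eroded sites that the model flags as eroded."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Share of non-eroded sites that the model flags as non-eroded."""
    return tn / (tn + fp)


def cohen_kappa(tp: int, fn: int, fp: int, tn: int) -> float:
    """Agreement between prediction and truth, corrected for chance agreement."""
    n = tp + fn + fp + tn
    observed = (tp + tn) / n
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    return (observed - expected) / (1 - expected)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative
    (ties count as half). Quadratic in sample count; fine for small test sets."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy test set (NOT from the study): 4 eroded and 4 non-eroded sites.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]  # predicted erosion probability
y_pred = [1 if s >= 0.5 else 0 for s in scores]    # assumed 0.5 threshold

tp, fn, fp, tn = confusion_counts(y_true, y_pred)
print("sensitivity:", sensitivity(tp, fn))
print("specificity:", specificity(tn, fp))
print("kappa:", cohen_kappa(tp, fn, fp, tn))
print("AUC:", auc(y_true, scores))
```

Sensitivity, specificity, and Kappa depend on the chosen threshold, while AUC is computed from the raw scores. That is one reason two models can share a Kappa value, as XGBoost and RF do in the abstract, and still differ in AUC.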
Source journal
Land Degradation & Development (Agricultural and Forestry Sciences: Environmental Sciences)
CiteScore: 7.70
Self-citation rate: 8.50%
Articles published: 379
Review time: 5.5 months
About the journal: Land Degradation & Development is an international journal which seeks to promote rational study of the recognition, monitoring, control and rehabilitation of degradation in terrestrial environments. The journal focuses on:
- what land degradation is;
- what causes land degradation;
- the impacts of land degradation;
- the scale of land degradation;
- the history, current status or future trends of land degradation;
- avoidance, mitigation and control of land degradation;
- remedial actions to rehabilitate or restore degraded land;
- sustainable land management.