Ihtisham Khan, Kashif Khan, Kazimierz Bęcek, Muhammad Fahad Bilal
Land Degradation & Development. Published 2025-08-28. DOI: 10.1002/ldr.70152 (https://doi.org/10.1002/ldr.70152)
Evaluating Water‐Induced Soil Erosion Using Machine Learning: XGBoost as the Most Effective Model
Soil erosion is a significant environmental concern that threatens agricultural activities, reduces soil fertility, and eventually impacts productivity. Assessing soil erosion is essential for effective planning and conservation initiatives in a basin or watershed. This study evaluates water‐induced soil erosion susceptibility using machine learning models, with a focus on the comparative performance of Random Forest (RF), k‐Nearest Neighbors (kNN), and Extreme Gradient Boosting (XGBoost). Unlike conventional approaches, this study emphasizes the effectiveness of ML‐based predictive modeling, rather than re‐identifying well‐established erosion‐controlling factors. A comprehensive dataset comprising topographic, climatic, and land use parameters was used to train and validate the models (80% training, 20% testing). The models were assessed based on multiple performance metrics, including sensitivity, specificity, Kappa coefficient, and area under the curve (AUC). Among the tested models, XGBoost demonstrated the highest predictive performance with an AUC of 0.91, sensitivity of 0.91, specificity of 0.89, and a Kappa index of 0.80. RF and kNN also performed well, with AUC values of 0.87 and 0.89, and Kappa values of 0.80 and 0.73, respectively. Field validation showed that XGBoost correctly predicted 78.7% of high‐risk erosion sites. The final susceptibility map classified 21.3% of the area as high‐risk, mainly concentrated in steep, sparsely vegetated uplands. These findings confirm the effectiveness of machine learning—particularly XGBoost—for accurate erosion risk mapping in data‐scarce, topographically diverse regions. The findings contribute to sustainable land management strategies, offering a scalable and adaptable approach for erosion risk assessment in diverse environmental settings.
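The abstract reports sensitivity, specificity, the Kappa coefficient, and AUC but does not spell out how they are computed. The following is a minimal sketch of the standard definitions for a binary erodible / non-erodible classification, not the authors' implementation. The function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration; none of them come from the study.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    sensitivity: float  # true positive rate
    specificity: float  # true negative rate
    kappa: float        # Cohen's kappa (agreement beyond chance)


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for labels coded as 1 (eroded) and 0 (not eroded)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def binary_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity, and Cohen's kappa from confusion counts."""
    n = tp + tn + fp + fn
    observed = (tp + tn) / n
    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    return BinaryMetrics(
        sensitivity=tp / (tp + fn),
        specificity=tn / (tn + fp),
        kappa=(observed - expected) / (1 - expected),
    )


def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative (ties count as one half). Quadratic in sample size, so it is
    only suitable for small illustrative inputs."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical test-set labels and model scores (not data from the paper).
    y_true = [1, 1, 1, 1, 0, 0, 0, 0]
    scores = [0.9, 0.8, 0.7, 0.4, 0.1, 0.2, 0.3, 0.6]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold
    print(binary_metrics(*confusion_counts(y_true, y_pred)))
    print("AUC:", auc(y_true, scores))
```

Kappa corrects raw accuracy for agreement expected by chance, so two models with similar sensitivity and specificity can still differ in Kappa when the class proportions differ. AUC is computed from the continuous scores and does not depend on the threshold.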
About the journal:
Land Degradation & Development is an international journal which seeks to promote rational study of the recognition, monitoring, control and rehabilitation of degradation in terrestrial environments. The journal focuses on:
- what land degradation is;
- what causes land degradation;
- the impacts of land degradation;
- the scale of land degradation;
- the history, current status or future trends of land degradation;
- avoidance, mitigation and control of land degradation;
- remedial actions to rehabilitate or restore degraded land;
- sustainable land management.