Prediction of gully erosion susceptibility through the lens of the SHapley Additive exPlanations (SHAP) method using a stacking ensemble model
Authors: Jeongho Han, Jorge A. Guzman, Maria L. Chu
Journal of Environmental Management, volume 383, Article 125478. Published 2025-04-25. DOI: 10.1016/j.jenvman.2025.125478
https://www.sciencedirect.com/science/article/pii/S0301479725014549
This study develops a novel explainable stacking ensemble model that combines the stacked generalization ensemble method with SHapley Additive exPlanations (SHAP) to enhance the prediction and interpretation of gully erosion susceptibility. Applied to Jefferson County, Illinois, our approach uses Random Forest (RF), Gradient Boosting Machine (GBM), Logistic Regression (LR), and Deep Neural Networks (DNN) as both base learners and meta-learners in various configurations, resulting in 44 distinct stacking models. The comparative analysis demonstrated the superior predictive performance of the stacked models when evaluated at 200 randomly selected gully-site points based on LiDAR difference observations; all but three exceeded the highest area under the curve (AUC) value of 0.86 achieved by the best-performing base model (GBM). The LR stacking model, which combines RF and GBM as base models with LR as the meta-learner, emerged as the most effective, achieving an AUC of 0.916. The gully erosion susceptibility map produced by the LR stacking model classified 33% of the agricultural land (89,208 ha) as the “very high” class, compared with 27%, 87%, 27%, and 55% predicted by the individual RF, LR, GBM, and DNN models, respectively. Crucially, SHAP analysis elucidated how changes in feature values influence model behavior, considering feature interactions within both the base models and the meta-learner. SHAP identified the annual leaf area index (LAI) as the most influential feature in both the RF and GBM base models. It also highlighted that the GBM base model carries more weight than the RF base model in the final decision-making process of the stacking model. By offering a transparent mechanism to evaluate how different features and models contribute to final decisions, this approach can be extended to broader environmental management and policy-making contexts, facilitating more informed and responsible resource allocation.
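To make concrete the idea of attributing a stacked prediction to its base models, the following minimal sketch computes exact Shapley values for a two-input LR meta-learner on the log-odds scale. It is not the authors' implementation: the weights, bias, baseline, and site probabilities are invented for illustration, and the names `shapley_values` and `meta_log_odds` are hypothetical. The exact subset enumeration is only practical for a handful of inputs; SHAP libraries use faster estimators.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, x, baseline):
    """Exact Shapley values of value_fn at x, relative to a baseline input.

    Inputs outside a coalition are replaced by their baseline value.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude input i
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (value_fn(with_i) - value_fn(without_i))
    return phi


# ASSUMPTION: illustrative meta-learner coefficients, not values from the study.
WEIGHTS = [1.2, 2.5]  # order: RF probability, GBM probability
BIAS = -1.8


def meta_log_odds(base_probs):
    """Log-odds output of a hypothetical LR meta-learner over base-model outputs."""
    return BIAS + sum(w * p for w, p in zip(WEIGHTS, base_probs))


baseline = [0.5, 0.5]  # ASSUMPTION: background (average) base-model outputs
site = [0.7, 0.9]      # ASSUMPTION: RF and GBM probabilities at one site

phi = shapley_values(meta_log_odds, site, baseline)
for name, contribution in zip(["RF", "GBM"], phi):
    print(f"{name} contribution to log-odds: {contribution:.3f}")
```

The contributions sum to the difference between the meta-learner's output at the site and at the baseline, which is the additivity property that lets SHAP values be read as a decomposition of a single prediction. For a linear meta-learner each contribution reduces to the coefficient times the input's deviation from the baseline.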
About the journal:
The Journal of Environmental Management publishes peer-reviewed, original research on all aspects of the management and managed use of the environment, both natural and man-made. Critical review articles are also welcome, and their submission is strongly encouraged.