Tingting Xu , Aohua Tian , Jay Gao , Haoze Yan , Chang Liu
Environmental Modelling & Software, Volume 182, Article 106194. Published 2024-08-31. DOI: 10.1016/j.envsoft.2024.106194. Impact factor: 4.8 (JCR Q1, Computer Science, Interdisciplinary Applications). URL: https://www.sciencedirect.com/science/article/pii/S136481522400255X
Analysis of the spatial heterogeneity of glacier melting in Tibet Autonomous Region and its influential factors using the K-means and XGBoost-SHAP algorithms
This study uses machine learning to analyze glacier melting in the Tibet Autonomous Region (TAR) and its key influencing factors. Existing machine learning research often lacks detailed explanation, producing generalized predictions that overlook the driving factors needed to understand glacier melting dynamics. To overcome these limitations and meet the multi-level analysis requirements of the problem, this study identifies the factors contributing to the heterogeneity of glacier melting and assesses the distinct causes of melting in three spatial clusters of melted glaciers. We used K-means unsupervised clustering to group melted glaciers in Tibet into three categories based on temperature, sunshine hours, evapotranspiration, precipitation, normalized vegetation index, and slope. The XGBoost algorithm was used to explore the nonlinear relationships between glacier melting and these features, and Shapley values were used for model transparency, quantifying each feature's influence on the melting process. Investigating geographical heterogeneity among the clusters enhanced our understanding of the observed changes, and the high fitting accuracy (>0.98) supports the reliability of the results. The results show that Tibetan glaciers melted significantly from 2010 to 2020, and the cluster analysis reveals distinct melting characteristics for each cluster. Melting glaciers in the same cluster are similar not only in their characteristics but also in their spatial and geographical distribution: two of the clusters are concentrated in the eastern part of TAR, and the third is scattered in the west. The XGBoost-SHAP analysis quantifies the contribution of each feature to glacier melting within each cluster, revealing that the features play different roles in different clusters.
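The cluster-then-attribute idea in the abstract can be illustrated with a small, standard-library-only sketch. It is not the authors' pipeline: a hand-written K-means with a deterministic farthest-point initialisation stands in for their clustering step, a made-up function `toy_melt_model` stands in for a trained XGBoost regressor, and Shapley values are computed exactly by enumerating coalitions rather than with the SHAP library. The two features (loosely "temperature" and "precipitation") and all data values are hypothetical.

```python
import itertools
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns (centroids, labels).

    Assumption of this sketch: centroids are seeded deterministically by
    farthest-point selection instead of random sampling.
    """
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(col) / len(members)
                                     for col in zip(*members)))
            else:
                updated.append(centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features outside a coalition are set to their baseline value.
    """
    n = len(x)

    def value(coalition):
        return model([x[i] if i in coalition else baseline[i]
                      for i in range(n)])

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = (math.factorial(size) * math.factorial(n - size - 1)
                      / math.factorial(n))
            for subset in itertools.combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_melt_model(f):
    """Hypothetical nonlinear stand-in for a fitted regressor."""
    temperature, precipitation = f
    return 2.0 * temperature - 0.5 * precipitation + 0.1 * temperature * precipitation


# Hypothetical data: three well-separated groups in a 2-feature space.
offsets = [(-0.5, -0.5), (0.5, -0.5), (0.0, 0.5), (0.2, 0.1)]
centres = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [(cx + dx, cy + dy) for cx, cy in centres for dx, dy in offsets]

centroids, labels = kmeans(points, k=3)
baseline = tuple(sum(col) / len(points) for col in zip(*points))

# Attribute the toy model's output at each cluster centroid to the features.
for c, centroid in enumerate(centroids):
    print(c, centroid, shapley_values(toy_melt_model, centroid, baseline))
```

Because the toy model contains an interaction term, the same feature receives a different attribution in each cluster, which is the kind of per-cluster contrast the abstract describes. With real data, library implementations of K-means, gradient boosting, and SHAP would replace the hand-written pieces.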
About the journal:
Environmental Modelling & Software publishes contributions, in the form of research articles, reviews and short communications, on recent advances in environmental modelling and/or software. The aim is to improve our capacity to represent, understand, predict or manage the behaviour of environmental systems at all practical scales, and to communicate those improvements to a wide scientific and professional audience.