Graph-based and machine learning approaches for soil depth prediction in a reservoir landscape: A case study in Dazhou County, Chongqing, China
Authors: Lanbing Yu, Biswajeet Pradhan, Yang Wang
Journal: Catena, volume 260, Article 109425 (published 2025-09-09)
Journal metrics: impact factor 5.7; JCR Q1 (Geosciences, Multidisciplinary)
DOI: 10.1016/j.catena.2025.109425
URL: https://www.sciencedirect.com/science/article/pii/S0341816225007271
Reservoir-bank areas are characterized by intense soil erosion and deposition, which produce large spatial variations in soil thickness that influence landslide occurrence and threaten residents' safety. This study presents an adaptive modelling framework that predicts soil thickness by capturing the complex spatial relationships inherent in its distribution, with the aim of improving prediction accuracy. A reservoir-bank area of 1.7 km² in Dazhou town, Chongqing, China, was selected as the study area. A total of 288 soil thickness samples, obtained from field observation and drilling, together with 14 environmental factors (such as altitude, slope, relative slope position index (RSPI), and sediment transport index), formed the initial modelling dataset. Two graph models were then built, one from feature similarity and one from geographic similarity, and the extracted graph features were combined with the environmental factors as inputs to three machine learning models, Random Forest (RF), Support Vector Machine, and Gradient Boosting Decision Tree (GBDT), to produce soil thickness maps. Validation using root-mean-square error (RMSE), the coefficient of determination (R²), and error frequency analysis supported two main conclusions: (i) among the three models, GBDT performed best overall, with the highest R² (0.7431 for testing, 0.9569 for training), the lowest RMSE (5.3189 for testing, 2.3001 for training), and the lowest residual skewness (0.11); (ii) incorporating graph-based features significantly improved the accuracy of soil thickness predictions, particularly for the nonlinear models (RF and GBDT), by mitigating the overestimation caused by spatial dependencies among independent variables (such as altitude and RSPI). The study integrates machine learning with graph-based spatial analysis and offers a new path for soil thickness prediction research.
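The abstract does not say how the graphs were built or which graph features were extracted, so the following is only a minimal sketch under stated assumptions: a k-nearest-neighbour graph over sample coordinates (or over environmental-factor vectors) stands in for the geographic (or feature) similarity graph, and the mean distance to a node's neighbours stands in for an extracted graph feature. The function names (`knn_graph`, `mean_neighbour_distance`, `rmse`, `r_squared`, `skewness`) and the toy data are hypothetical. The three validation quantities named in the abstract are computed with their standard definitions.

```python
import math


def knn_graph(points, k):
    """For each point, return its k nearest neighbours as (distance, index) pairs.

    Assumption: a kNN graph is one plausible similarity graph; the paper's
    actual construction is not described in the abstract.
    """
    graph = []
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        graph.append(dists[:k])
    return graph


def mean_neighbour_distance(points, k=2):
    """Hypothetical graph feature: mean distance from each node to its k neighbours."""
    return [sum(d for d, _ in nbrs) / len(nbrs) for nbrs in knn_graph(points, k)]


def rmse(y_true, y_pred):
    """Root-mean-square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def skewness(values):
    """Population skewness: third central moment over the 1.5 power of the second."""
    n = len(values)
    mean_v = sum(values) / n
    m2 = sum((v - mean_v) ** 2 for v in values) / n
    m3 = sum((v - mean_v) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Toy data (made up for illustration, not from the study).
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
geo_feature = mean_neighbour_distance(coords, k=2)

y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 6.0]
residuals = [t - p for t, p in zip(y_true, y_pred)]
print(geo_feature, rmse(y_true, y_pred), r_squared(y_true, y_pred), skewness(residuals))
```

In a fuller pipeline, features like `geo_feature` would be appended to the environmental factors for each sample before fitting a regressor; a residual skewness near zero indicates that over- and underestimation errors are roughly balanced.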
About the journal:
Catena publishes papers describing original field and laboratory investigations and reviews on geoecology and landscape evolution with emphasis on interdisciplinary aspects of soil science, hydrology and geomorphology. It aims to disseminate new knowledge and foster better understanding of the physical environment, of evolutionary sequences that have resulted in past and current landscapes, and of the natural processes that are likely to determine the fate of our terrestrial environment.
Papers within any one of the above topics are welcome provided they are of sufficiently wide interest and relevance.