Graph-based and machine learning approaches for soil depth prediction in a reservoir landscape: A case study in Dazhou County, Chongqing, China

IF 5.7 · Zone 1, Agricultural and Forestry Sciences · Q1 GEOSCIENCES, MULTIDISCIPLINARY
Lanbing Yu, Biswajeet Pradhan, Yang Wang
Catena, Volume 260, Article 109425. Published 2025-09-09. DOI: 10.1016/j.catena.2025.109425
https://www.sciencedirect.com/science/article/pii/S0341816225007271
Citations: 0

Abstract

Reservoir-bank areas undergo intense soil erosion and deposition, producing large spatial variations in soil thickness that influence landslide occurrence and threaten residents' safety. This study presents an adaptive modelling framework that predicts soil thickness by capturing the complex spatial relationships inherent in its distribution, significantly improving prediction accuracy. A reservoir-bank area of 1.7 km² in Dazhou town, Chongqing Province, China, was selected as the study area. A total of 288 soil thickness samples from field observation and drilling, together with 14 environmental factors (such as altitude, slope, relative slope position index (RSPI), and sediment transportation index), formed the initial modelling dataset. Two graph models were then built from feature similarity and geographic similarity, and the extracted graph features were combined with the environmental factors as inputs to three machine learning models, Random Forest (RF), Support Vector Machine (SVM), and Gradient Boosting Decision Tree (GBDT), to produce soil thickness maps. Validation by root-mean-square error (RMSE), coefficient of determination (R²), and error frequency analysis supported two main conclusions: i) Among the three models, GBDT performed best overall, with the highest R² (0.7431 for testing, 0.9569 for training), the lowest RMSE (5.3189 for testing, 2.3001 for training), and the lowest residual skewness (0.11). ii) Incorporating graph-based features significantly improves the accuracy of soil thickness predictions, particularly for the nonlinear models (RF and GBDT), by mitigating the overestimation caused by spatial dependencies among independent variables (such as altitude and RSPI). By integrating machine learning with graph-based spatial analysis, the study offers a new path for advancing soil thickness prediction research.
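
The abstract does not say how the graphs were built or which graph features were extracted, so the following is only a minimal sketch of the general idea under stated assumptions: a k-nearest-neighbour graph over sample coordinates stands in for the geographic-similarity graph, the neighbour mean of one environmental factor stands in for an extracted graph feature, and the three validation statistics (RMSE, R², residual skewness) are written out directly. All function names, coordinates, and values are hypothetical and are not taken from the paper.

```python
import math

def knn_graph(points, k):
    """Adjacency lists linking each point to its k nearest neighbours.

    Assumption: Euclidean distance on (x, y) coordinates as a stand-in for
    geographic similarity; the paper's actual graph construction is not given.
    """
    neighbours = []
    for i, p in enumerate(points):
        ranked = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        neighbours.append([j for _, j in ranked[:k]])
    return neighbours

def neighbour_mean(neighbours, values):
    """Hypothetical graph feature: mean of a factor over each node's neighbours."""
    return [sum(values[j] for j in nbrs) / len(nbrs) for nbrs in neighbours]

def rmse(y_true, y_pred):
    """Root-mean-square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def skewness(values):
    """Population skewness (third central moment over variance ** 1.5)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Made-up sample locations and altitudes, for illustration only.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0), (6.0, 5.0)]
altitude = [200.0, 210.0, 190.0, 300.0, 320.0]

graph = knn_graph(coords, k=2)
altitude_nbr_mean = neighbour_mean(graph, altitude)

# Each row joins an environmental factor with its graph-derived counterpart.
feature_rows = [[a, m] for a, m in zip(altitude, altitude_nbr_mean)]

# Made-up observed and predicted thickness values to exercise the metrics.
observed = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]
residuals = [p - o for o, p in zip(observed, predicted)]
print(rmse(observed, predicted), r2(observed, predicted), skewness(residuals))
```

Rows such as `feature_rows` could then be passed to any regressor (RF, SVM, or GBDT in the study); the model fitting itself is left out here to keep the sketch dependency-free.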


Source journal: Catena (Environmental Science: Geosciences, Multidisciplinary)
CiteScore: 10.50
Self-citation rate: 9.70%
Articles published: 816
Review time: 54 days
About the journal: Catena publishes papers describing original field and laboratory investigations and reviews on geoecology and landscape evolution with emphasis on interdisciplinary aspects of soil science, hydrology and geomorphology. It aims to disseminate new knowledge and foster better understanding of the physical environment, of evolutionary sequences that have resulted in past and current landscapes, and of the natural processes that are likely to determine the fate of our terrestrial environment. Papers within any one of the above topics are welcome provided they are of sufficiently wide interest and relevance.