Integrating ensemble learning and rocky desertification indices improves accuracy and interpretability of soil thickness prediction in karst landscapes
Authors: Fayong Fang, Ruyi Zi, Tingsheng Chen, Qilian Zhu, Zhen Han, Rui Hou, Wanyang Yu, Longshan Zhao
Journal: Catena, volume 265, Article 109876 (JCR Q1, Geosciences, Multidisciplinary; impact factor 5.7)
Published: 2026-04-01 (Epub 2026-02-06)
DOI: 10.1016/j.catena.2026.109876
URL: https://www.sciencedirect.com/science/article/pii/S034181622600086X
Citations: 0
Abstract
Soil thickness is a critical parameter for hydrological partitioning, ecosystem functioning, and biogeochemical cycling, but it is difficult to predict spatially in complex karst landscapes because of high heterogeneity, intricate natural and anthropogenic impacts, and rocky desertification. Here, we integrate interpretable machine learning (ML) with rocky desertification information indices (RIs) to enhance soil thickness prediction in typical karst regions. We evaluated six individual ML models and three stacking ensembles, each with and without RIs. RIs significantly boosted model explanatory power and consistency (average improvement of 7%, range 4%–11%), capturing the heterogeneity of soil thickness associated with karst-specific soil degradation processes. Stacking ensembles reduced RMSE by 1.33–2.95 cm and MAE by 0.99–2.73 cm; the stacking model with linear regression as the meta-model performed best (R² = 0.47, RMSE = 31.50 cm), while Cubist showed the highest accuracy among the base models (CCC = 0.63, R² = 0.45). Shapley additive explanations and permutation feature importance highlighted the dominant drivers (rock exposure, vegetation cover, topography), improving model transparency. Uncertainty assessments (prediction interval width and prediction interval ratio) validated robustness and identified high-uncertainty areas, which were associated with steep topography, severe rocky desertification, model disagreement, and sparse sampling. Our RIs-integrated model improves soil thickness prediction in karst regions, presents a potentially scalable framework for analogous complex landscapes, advances understanding of soil formation processes in karst systems, and thereby delivers targeted decision support for regional soil management practices.
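The abstract reports four accuracy metrics (RMSE, MAE, R², and CCC) without defining them. As a rough illustration only, the sketch below computes them from their standard textbook definitions. It assumes that R² means the coefficient of determination and that CCC means Lin's concordance correlation coefficient; the abstract does not confirm either choice. The function names and the sample values are hypothetical and are not taken from the study.

```python
import math
from statistics import fmean


def rmse(obs, pred):
    """Root mean squared error, in the units of the observations (cm here)."""
    return math.sqrt(fmean((o - p) ** 2 for o, p in zip(obs, pred)))


def mae(obs, pred):
    """Mean absolute error, in the units of the observations."""
    return fmean(abs(o - p) for o, p in zip(obs, pred))


def r2(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = fmean(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def ccc(obs, pred):
    """Lin's concordance correlation coefficient (population moments).

    Unlike plain correlation, it also penalises a shift in mean or a
    difference in spread between observed and predicted values.
    """
    mean_o, mean_p = fmean(obs), fmean(pred)
    var_o = fmean((o - mean_o) ** 2 for o in obs)
    var_p = fmean((p - mean_p) ** 2 for p in pred)
    cov = fmean((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    return 2.0 * cov / (var_o + var_p + (mean_o - mean_p) ** 2)


# Made-up soil thickness values in cm, for illustration only.
observed = [20.0, 40.0, 60.0, 80.0]
predicted = [30.0, 40.0, 50.0, 80.0]

print(rmse(observed, predicted), mae(observed, predicted))
print(r2(observed, predicted), ccc(observed, predicted))
```

Under these definitions RMSE and MAE carry the units of soil thickness, while R² and CCC are unitless, which is why the abstract reports the former in centimetres.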
About the journal:
Catena publishes papers describing original field and laboratory investigations and reviews on geoecology and landscape evolution with emphasis on interdisciplinary aspects of soil science, hydrology and geomorphology. It aims to disseminate new knowledge and foster better understanding of the physical environment, of evolutionary sequences that have resulted in past and current landscapes, and of the natural processes that are likely to determine the fate of our terrestrial environment.
Papers within any one of the above topics are welcome provided they are of sufficiently wide interest and relevance.