Bifeng Hu, Yibo Geng, Hanjie Ni, Zhou Shi, Zheng Wang, Nan Wang, Jipeng Luo, Modian Xie, Qian Zou, Thomas Optiz, Hongyi Li
Title: Mapping and understanding the regional farmland SOC distribution in southern China using a Bayesian spatial model
Journal: Geoderma, volume 460, Article 117446 (JCR Q1, Soil Science; impact factor 6.6)
Publication date: 2025-07-17
DOI: 10.1016/j.geoderma.2025.117446
URL: https://www.sciencedirect.com/science/article/pii/S0016706125002873
Abstract
Information on the spatial distribution of soil organic carbon (SOC) in regional farmland is crucial for improving management and production. Mapping SOC in farmland is challenging because natural and anthropogenic influences cause strong variation in SOC. In addition, widely used predictive models often lack interpretability. To fill these gaps, we use a Bayesian spatial model, the Integrated Nested Laplace Approximation with the Stochastic Partial Differential Equation approach (INLA-SPDE), to produce a fine-scale SOC map of the farmland of Jiangxi Province, southern China, based on an extensive soil survey dataset (n = 16,050). The competitive adaptive reweighted sampling-partial least squares (CARS-PLS) algorithm is used to select the most relevant covariates from the original covariate pool. The performance of INLA-SPDE is then compared with that of Random Forest (RF), Geographically Weighted Regression (GWR), and Ordinary Kriging (OK). Finally, an interpretable machine learning method, SHapley Additive exPlanations (SHAP), is used to quantify the contribution of the environmental covariates to SOC mapping and to map the spatially varying primary covariates for predicting SOC in the study area. We find that INLA-SPDE handles the large dataset and performs much better than OK and GWR, improving R² by 38.89 % and 117.39 %, respectively. It also outperforms RF. Overall, the amount of straw return, mean annual precipitation, and mean annual solar radiation are the most important covariates for mapping SOC. Locally, soil management covariates are the most important for mapping SOC in 50.52 % of the study area, followed by climate factors (22.06 %), soil properties (17.09 %), terrain (6.38 %), lithology (2.21 %), and biota factors (1.72 %). Our study demonstrates the advantages of INLA-SPDE over geostatistical methods and RF for SOC mapping and provides valuable implications for interpreting the results of digital soil mapping.
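The abstract reports two kinds of summary figures: percentage improvements in R², and the share of the study area in which each covariate category is locally most important. The sketch below shows one plausible way to compute such figures. It is not the authors' code. It assumes that the R² gains are relative gains (an increase above 100 % can only be relative, since R² is at most 1) and that a category's local importance is the sum of absolute SHAP values of its covariates. The covariate names, group labels, SHAP values, and R² inputs are all hypothetical.

```python
from collections import Counter


def relative_improvement(r2_new: float, r2_baseline: float) -> float:
    """Relative gain of r2_new over r2_baseline, in percent."""
    if r2_baseline <= 0:
        raise ValueError("baseline R^2 must be positive")
    return (r2_new - r2_baseline) / r2_baseline * 100.0


def dominant_group_shares(shap_rows, covariate_groups):
    """Percentage of locations in which each covariate group dominates.

    shap_rows: one dict per location, mapping covariate name -> SHAP value.
    covariate_groups: dict mapping covariate name -> group label.
    Assumption: a group's local importance is the sum of |SHAP| over
    the covariates that belong to it.
    """
    winners = Counter()
    for row in shap_rows:
        totals = {}
        for name, value in row.items():
            group = covariate_groups[name]
            totals[group] = totals.get(group, 0.0) + abs(value)
        # The group with the largest summed |SHAP| wins this location.
        winners[max(totals, key=totals.get)] += 1
    n = len(shap_rows)
    return {group: 100.0 * count / n for group, count in winners.items()}


# Hypothetical covariates and SHAP values for four map cells.
GROUPS = {
    "straw_return": "soil management",
    "precipitation": "climate",
    "solar_radiation": "climate",
    "elevation": "terrain",
}
SHAP_ROWS = [
    {"straw_return": 0.9, "precipitation": 0.2, "solar_radiation": 0.1, "elevation": 0.05},
    {"straw_return": -0.7, "precipitation": 0.3, "solar_radiation": 0.2, "elevation": 0.1},
    {"straw_return": 0.1, "precipitation": -0.4, "solar_radiation": 0.3, "elevation": 0.2},
    {"straw_return": 0.1, "precipitation": 0.1, "solar_radiation": 0.1, "elevation": -0.6},
]

shares = dominant_group_shares(SHAP_ROWS, GROUPS)
print(shares)
# Illustrative R^2 inputs only; they are not values reported in the abstract.
print(relative_improvement(0.6, 0.4))
```

In practice the per-location SHAP values would come from an explainer applied to the fitted model on the prediction grid, and weighting cells by area rather than counting them would matter if the grid cells are not of equal size.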
About the journal:
Geoderma - the global journal of soil science - welcomes authors, readers and soil research from all parts of the world, encourages worldwide soil studies, and embraces all aspects of soil science and its associated pedagogy. The journal particularly welcomes interdisciplinary work focusing on dynamic soil processes and functions across space and time.