Comparison of Machine Learning and Geostatistical Methods on Mapping Soil Organic Carbon Density in Regional Croplands and Visualizing Its Location-Specific Dominators via Interpretable Model

IF 3.6 | Zone 2, Agricultural and Forestry Sciences | JCR Q2, Environmental Sciences
Bifeng Hu, Yibo Geng, Yi Lin, Hanjie Ni, Modian Xie, Nan Wang, Jie Hu, Qian Zou, Songchao Chen, Yin Zhou, Hongyi Li, Zhou Shi
Journal: Land Degradation & Development. Published: 2025-03-17. DOI: 10.1002/ldr.5573 (https://doi.org/10.1002/ldr.5573)

Abstract

High-precision maps of soil organic carbon density (SOCD) are important for understanding ecosystem carbon cycles and estimating soil organic carbon storage. However, current mapping methods struggle to balance accuracy and interpretability, which makes SOCD mapping challenging. In this study, 6223 soil samples were collected, together with data on 30 environmental covariates, from agricultural land in the Poyang Lake Plain of Jiangxi Province, southern China. Four single models were constructed: ordinary kriging (OK), geographically weighted regression (GWR), random forest (RF), and empirical Bayesian kriging (EBK), along with three hybrid models (RF-OK, RF-EBK, RF-GWR). These models were used to map SOCD in the study region at a resolution of 30 m. Shapley additive explanations (SHAP) were then used to quantify the global contribution of each factor and to identify, location by location, the dominant factors influencing SOCD variation. The results suggested that RF outperformed both the single geostatistical models and the hybrid models (coefficient of determination (R²) = 0.44, root mean squared error (RMSE) = 0.61 kg m⁻², Lin's concordance correlation coefficient (LCCC) = 0.58). Using SHAP, we found that soil properties made the largest global contribution to the prediction of SOCD (81.67%). At the pixel level, total nitrogen was the dominant factor across 50.33% of the farmland, followed by parent material (8.11%), available silicon (8.00%), and mean annual precipitation (5.71%); the remaining variables each accounted for less than 5.50%. In summary, our study offers useful insight into balancing accuracy and interpretability in digital soil mapping, and deepens our understanding of the spatial variation of farmland SOCD.
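The abstract reports three accuracy metrics (R², RMSE, LCCC) and the share of farmland dominated by each covariate, but it does not give the formulas used. The sketch below shows one plausible way to compute them with NumPy. It assumes the standard textbook definitions of the metrics, and it assumes that the "dominant" factor at a pixel is the covariate with the largest absolute SHAP value there. The function names and the toy numbers are hypothetical and are not taken from the paper.

```python
import numpy as np


def r2_score(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    ss_res = np.sum((obs - pred) ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def rmse(obs, pred):
    """Root mean squared error, in the units of the observations."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return float(np.sqrt(np.mean((obs - pred) ** 2)))


def lccc(obs, pred):
    """Lin's concordance correlation coefficient (population moments)."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    cov = np.mean((obs - obs.mean()) * (pred - pred.mean()))
    denom = obs.var() + pred.var() + (obs.mean() - pred.mean()) ** 2
    return float(2.0 * cov / denom)


def dominant_share(shap_values, names):
    """Fraction of rows (pixels) where each covariate has the largest |SHAP|.

    Assumption: 'dominant' means the largest absolute SHAP value in that row.
    shap_values has shape (n_pixels, n_covariates).
    """
    shap_values = np.asarray(shap_values, float)
    winners = np.argmax(np.abs(shap_values), axis=1)
    counts = np.bincount(winners, minlength=len(names))
    return {name: float(c) / len(winners) for name, c in zip(names, counts)}


# Toy data, made up for illustration only.
observed = [1.0, 2.0, 3.0]
predicted = [2.0, 3.0, 4.0]
toy_shap = [[0.5, -0.1], [-0.2, 0.3], [0.9, 0.1], [-0.7, 0.2]]
toy_names = ["total_nitrogen", "mean_annual_precipitation"]

print(r2_score(observed, predicted), rmse(observed, predicted), lccc(observed, predicted))
print(dominant_share(toy_shap, toy_names))
```

LCCC penalizes both scatter and systematic bias, so a prediction that is perfectly correlated with the observations but shifted by a constant still scores below 1, which is why it is often reported alongside R² in digital soil mapping.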
Source journal
Land Degradation & Development (Agricultural and Forestry Sciences, Environmental Sciences)
CiteScore: 7.70
Self-citation rate: 8.50%
Articles published: 379
Review time: 5.5 months
About the journal: Land Degradation & Development is an international journal that seeks to promote rational study of the recognition, monitoring, control and rehabilitation of degradation in terrestrial environments. The journal focuses on: what land degradation is; what causes land degradation; the impacts of land degradation; the scale of land degradation; the history, current status or future trends of land degradation; avoidance, mitigation and control of land degradation; remedial actions to rehabilitate or restore degraded land; sustainable land management.