Spatial prediction of organic carbon in German agricultural topsoil using machine learning algorithms

Ali Sakhaee, Anika Gebauer, Mareike Ließ, A. Don
DOI: 10.5194/soil-8-587-2022 (https://doi.org/10.5194/soil-8-587-2022)
Journal (as listed): Soil Science
Published: 2022-09-22
Citations: 7

Abstract

As the largest terrestrial carbon pool, soil organic carbon (SOC) has the potential to influence and mitigate climate change; thus, SOC monitoring is of high importance in the frameworks of various international treaties. Therefore, high-resolution SOC maps are required. Machine learning (ML) offers new opportunities to develop these maps due to its ability to mine large datasets. The aim of this study was to apply three algorithms commonly used in digital soil mapping – random forest (RF), boosted regression trees (BRT), and support vector machine for regression (SVR) – to the first German agricultural soil inventory in order to model the SOC content of agricultural topsoil (0–30 cm) and to develop a two-model approach that addresses the high variability in SOC in German agricultural soils. Model performance is often limited by the size and quality of the soil dataset available for calibration and validation. Therefore, the impact of enlarging the training dataset was tested by including data from the European Land Use/Cover Area frame Survey for agricultural sites in Germany. Nested cross-validation was implemented for model evaluation and parameter tuning. Grid search and the differential evolution algorithm were also applied to ensure that each algorithm was appropriately tuned. The SOC content of the German agricultural soil inventory was highly variable, ranging from 4 to 480 g kg⁻¹. However, only 4 % of all soils contained more than 87 g kg⁻¹ SOC and were considered organic or degraded organic soils. The results showed that SVR produced the best performance, with a root-mean-square error (RMSE) of 32 g kg⁻¹ when the algorithms were trained on the full dataset. However, the average RMSE of all algorithms decreased by 34 % when mineral and organic soils were modelled separately, with the best result from SVR presenting an RMSE of 21 g kg⁻¹. The model performance was enhanced by up to 1 % for mineral soils and by up to 2 % for organic soils. Despite the ability of machine learning algorithms, in general, and SVR, in particular, to model SOC on a national scale, the study showed that the most important aspect for improving the model performance was to separate the modelling of mineral and organic soils.
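The evaluation scheme named in the abstract, nested cross-validation with grid search and separate models for mineral and organic soils, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the data are synthetic, a one-dimensional k-nearest-neighbour regressor stands in for RF, BRT and SVR, the share of organic sites is set to 10 % so that every fold contains enough of them, and sites are assigned to a class by their observed SOC value using the 87 g kg⁻¹ boundary quoted above. Differential evolution is not shown.

```python
import math
import random

THRESHOLD = 87.0  # g/kg; class boundary quoted in the abstract


def rmse(y_true, y_pred):
    """Root-mean-square error of two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def knn_predict(train, x, k):
    """Mean target of the k training rows nearest to x (1-D covariate)."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def folds(rows, n):
    """Yield n (train, test) splits by striding over pre-shuffled rows."""
    for i in range(n):
        yield [r for j, r in enumerate(rows) if j % n != i], rows[i::n]


def score(train, test, k):
    """RMSE on `test` of a k-NN model that uses `train` as its neighbours."""
    return rmse([y for _, y in test], [knn_predict(train, x, k) for x, _ in test])


def nested_cv(rows, grid, outer=5, inner=3):
    """Mean outer-fold RMSE; k is tuned by grid search on inner folds only."""
    outer_scores = []
    for train, test in folds(rows, outer):
        def inner_error(k):
            parts = [score(tr, va, k) for tr, va in folds(train, inner)]
            return sum(parts) / len(parts)
        best_k = min(grid, key=inner_error)  # the outer test fold is never seen here
        outer_scores.append(score(train, test, best_k))
    return sum(outer_scores) / len(outer_scores)


# Synthetic stand-in data: (covariate, SOC in g/kg). Hypothetical values only.
rng = random.Random(0)
rows = []
for i in range(300):
    x = rng.uniform(0.0, 10.0)
    if i % 10 == 0:  # every tenth site behaves like an organic soil
        y = 200.0 + 20.0 * x + rng.gauss(0.0, 20.0)
    else:            # all other sites behave like mineral soils
        y = 15.0 + 2.0 * x + rng.gauss(0.0, 3.0)
    rows.append((x, y))
rng.shuffle(rows)

mineral = [r for r in rows if r[1] <= THRESHOLD]
organic = [r for r in rows if r[1] > THRESHOLD]
grid = [1, 3, 5, 9]  # candidate values for k

rmse_full = nested_cv(rows, grid)        # one model for all soils
rmse_mineral = nested_cv(mineral, grid)  # two-model approach, mineral part
rmse_organic = nested_cv(organic, grid)  # two-model approach, organic part
print(f"full: {rmse_full:.1f}  mineral: {rmse_mineral:.1f}  organic: {rmse_organic:.1f}")
```

The inner loop selects the hyperparameter from the outer training fold alone, so the outer-fold RMSE is not biased by the tuning. In this toy setting the single model has to average over two very different SOC populations, which is why fitting each class separately gives lower errors; the magnitudes are artefacts of the synthetic data and say nothing about the study's results.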