Estimation of Soil Organic Carbon Content in Gannan Grassland Based on SSA Optimized CatBoost

Q2 Environmental Science
Zi-Ming Ma, Mei-Ling Zhang, Xing-Yu Liu
Journal: 环境科学 (Environmental Science), vol. 46, no. 8, pp. 4961-4970. Published: 2025-08-08. Publication type: Journal Article. DOI: 10.13227/j.hjkx.202408081 (https://doi.org/10.13227/j.hjkx.202408081)
Citations: 0

Abstract

Estimating soil organic carbon (SOC) content in Gannan Tibetan Autonomous Prefecture, characterizing its spatial distribution, and identifying its main influencing factors are important for improving grassland quality, optimizing management, regulating climate, and maintaining ecosystem functions. Taking the grassland of Gannan Tibetan Autonomous Prefecture, Gansu Province, as the study area, a multi-feature dataset was built by integrating soil properties, meteorological factors, elevation, and vegetation indices, and 24 significant features were selected using Pearson correlation analysis. The normalized contribution of each feature was then derived from its SHAP value. The data were split 8:2 into a training set and a test set, and the machine learning models were evaluated with ten-fold cross-validation. With MAE, RMSE, and R² as evaluation metrics, the sparrow search algorithm (SSA) and the whale optimization algorithm (WOA) were used to optimize the model parameters and estimate SOC content. The results showed that the modeled surface SOC stocks of the grassland in Gannan Tibetan Autonomous Prefecture decreased gradually from west to east, being high in the northwest and low in the southeast; the northwest had a relatively low average temperature and a high organic carbon content. Annual average temperature, the enhanced vegetation index (EVI), and elevation from the digital elevation model (DEM) contributed significantly to the SOC content of Gannan grassland and were the main factors affecting the spatial distribution of SOC. Among random forest, decision tree, gradient boosting regression, CatBoost, XGBoost, and LightGBM, the CatBoost model performed best on the test set. The convergence curves of SSA and WOA showed that SSA converged faster and updated the parameters more effectively. The optimized SSA-CatBoost model performed best in predicting SOC content. The spatial distribution of SOC has an important impact on the ecosystem and carbon cycle of the region. The grassland in the northwest of the Gannan region has greater potential for soil fertility and carbon storage, a finding that can help in formulating more effective soil management and ecological protection strategies, slowing climate warming, and further promoting the sustainable development of the global ecosystem.
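The evaluation protocol named in the abstract (Pearson correlation screening, an 8:2 train/test split, ten-fold cross-validation, and the MAE, RMSE, and R² metrics) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: every function name is hypothetical, and the `min_abs_r` threshold stands in for the significance test, whose details the abstract does not give.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def screen_features(features, target, min_abs_r=0.3):
    """Keep features whose |r| with the target reaches min_abs_r.

    Assumption: a fixed |r| cut-off replaces the paper's significance test.
    """
    scores = {name: pearson_r(values, target) for name, values in features.items()}
    return {name: r for name, r in scores.items() if abs(r) >= min_abs_r}


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def train_test_split_indices(n, test_fraction=0.2, seed=0):
    """Shuffle sample indices and split them; the default gives an 8:2 split."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_fraction)
    return idx[n_test:], idx[:n_test]


def k_fold_indices(indices, k=10):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, folds[i]
```

In use, a regressor would be fitted on each `train` fold and scored on the matching validation fold with `mae`, `rmse`, and `r2`, with the held-out 20% reserved for the final comparison between models.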

Source journal: 环境科学 (Environmental Science). Subject category: Environmental Science (all). CiteScore: 4.40. Self-citation rate: 0.00%. Articles published: 15329.