Hyperspectral inversion of soil organic matter based on improved ensemble learning method

Impact factor 4.3 · Zone 2 (Chemistry) · JCR Q1 (Spectroscopy)
Junjie Liu, Yongsheng Hong, Bifeng Hu, Songchao Chen, Jia Deng, Keyang Yin, Jiao Lin, Defang Luo, Jie Peng, Zhou Shi
Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy, vol. 339, Article 126302. Published 2025-04-27.
DOI: 10.1016/j.saa.2025.126302
https://www.sciencedirect.com/science/article/pii/S1386142525006080

Abstract

Soil organic matter (SOM) is a vital component of soil, and its rapid and accurate detection is crucial for ensuring land health and stabilizing atmospheric carbon dioxide levels. Soil hyperspectroscopy has proven to be an efficient and cost-effective method for detecting SOM. In soil spectroscopy, the Ensemble Model (EM) holds substantial promise because of its robustness and strong generalization capability. However, the efficacy of EM depends largely on the number of base learners selected and on how weights are allocated among them. Traditional practice relies mainly on empirical weight assignment or on a single index, R², of the base learners, and offers little clarity on the optimal number of base learners for different ensemble techniques. To address this gap, our study uses Vis-NIR spectroscopy to quantitatively assess SOM in 704 samples from the Tarim River Basin in Xinjiang, China. Our objective is to develop new base learner weight assignment methods and to identify the optimal number of base learners for each ensemble technique, thereby refining the ensemble approach and improving EM performance. We examined the impact of various weight coefficient assignment methods and base learner counts on EM performance within the Weighted Averaging (WA), Blending, and Stacking frameworks. Our findings reveal that a weight coefficient assignment method incorporating R², RMSE, and MAE significantly enhances EM performance. Compared with traditional methods that rely solely on base learner R², it increases EM R² by 0.006–0.024 and reduces RMSE and MAE by 0.014–0.085 g kg⁻¹ and 0.03–0.085 g kg⁻¹, respectively. Although the number of base learners is crucial, its relationship with performance is not linear: adding base learners does not invariably improve performance. Notably, Blending and Stacking perform best with 12 base learners, whereas the accuracy of WA is still rising at 15 base learners. Among the ensemble methods, Stacking is the most accurate, achieving a validation R² of 0.889, an RMSE of 0.957 g kg⁻¹, and an MAE of 0.803 g kg⁻¹. In summary, setting the base learner count to 12 and using a multi-index comprehensive evaluation for weight assignment within the Stacking method emerges as the optimal integration strategy for SOM hyperspectral inversion.
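
The contrast between R²-only weighting and multi-index weighting in a Weighted Averaging ensemble can be sketched as follows. The abstract does not give the weighting formula, so the scheme here is an assumption made only for illustration: each index is turned into a "share" across base learners (R² directly, RMSE and MAE through their reciprocals so that smaller errors earn larger shares), and the three shares are averaged. The function names and the synthetic data are hypothetical and are not taken from the paper.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return (R2, RMSE, MAE) of one base learner on a validation set."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    resid = y_true - y_pred
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    rmse = np.sqrt(np.mean(resid ** 2))
    mae = np.mean(np.abs(resid))
    return r2, rmse, mae


def r2_only_weights(r2):
    """Baseline: weights proportional to each learner's R2 (negatives clipped to 0)."""
    r2 = np.clip(np.asarray(r2, dtype=float), 0.0, None)
    return r2 / r2.sum()


def multi_index_weights(r2, rmse, mae):
    """Assumed multi-index scheme: average of the R2, 1/RMSE and 1/MAE shares.

    Requires at least one positive R2 and strictly positive RMSE and MAE.
    """
    r2 = np.clip(np.asarray(r2, dtype=float), 0.0, None)
    inv_rmse = 1.0 / np.asarray(rmse, dtype=float)
    inv_mae = 1.0 / np.asarray(mae, dtype=float)
    shares = np.vstack([
        r2 / r2.sum(),
        inv_rmse / inv_rmse.sum(),
        inv_mae / inv_mae.sum(),
    ])
    # Each row sums to 1, so the column means also sum to 1.
    return shares.mean(axis=0)


def weighted_average(predictions, weights):
    """Combine predictions of shape (n_learners, n_samples) into one vector."""
    return np.asarray(weights, dtype=float) @ np.asarray(predictions, dtype=float)


# Synthetic stand-in for validation data: three "learners" with growing noise.
rng = np.random.default_rng(0)
y_val = rng.uniform(2.0, 20.0, size=50)  # made-up SOM values in g/kg
preds = np.vstack([y_val + rng.normal(0.0, s, size=50) for s in (0.5, 1.0, 2.0)])

metrics = np.array([regression_metrics(y_val, p) for p in preds])
w_r2 = r2_only_weights(metrics[:, 0])
w_multi = multi_index_weights(metrics[:, 0], metrics[:, 1], metrics[:, 2])
ensemble = weighted_average(preds, w_multi)

print("R2-only weights:    ", np.round(w_r2, 3))
print("multi-index weights:", np.round(w_multi, 3))
print("ensemble (R2, RMSE, MAE):", regression_metrics(y_val, ensemble))
```

Under this assumed scheme, error-based indices separate learners more sharply than R² alone when all learners have similar R² but different error magnitudes, which is one plausible reason a multi-index weighting could help. Whether it matches the gains reported in the abstract depends on the paper's actual formula and data.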

Source journal
CiteScore: 8.40
Self-citation rate: 11.40%
Articles published: 1364
Review time: 40 days
About the journal: Spectrochimica Acta, Part A: Molecular and Biomolecular Spectroscopy (SAA) is an interdisciplinary journal that spans basic to applied aspects of optical spectroscopy in chemistry, medicine, biology, and materials science. The journal publishes original scientific papers that feature high-quality spectroscopic data and analysis. From the broad range of optical spectroscopies, the emphasis is on electronic, vibrational or rotational spectra of molecules, rather than on spectroscopy based on magnetic moments. Criteria for publication in SAA are novelty, uniqueness, and outstanding quality. Routine applications of spectroscopic techniques and computational methods are not appropriate. Topics of particular interest include, but are not limited to: spectroscopy and dynamics in the bioanalytical, biomedical, environmental, and atmospheric sciences; novel experimental techniques or instrumentation for molecular spectroscopy; novel theoretical and computational methods; novel applications in photochemistry and photobiology; and novel interpretational approaches as well as advances in data analysis based on electronic or vibrational spectroscopy.