Variable Selection in Near-Infrared Spectra for Modeling of Hemoglobin Content in Bio-Water Solutions

IF 0.8, Zone 4 (Chemistry), JCR Q4 (Spectroscopy)
Renjie Fang, Xin Han, Xiangxian Li, Jingjing Tong, Minguang Gao, Yang Wang
Journal of Applied Spectroscopy, vol. 91, no. 4, pp. 928-935. Published 2024-09-14. DOI: 10.1007/s10812-024-01801-0
https://link.springer.com/article/10.1007/s10812-024-01801-0

Abstract

Differences in background water content between samples strongly affect the robustness of near-infrared spectroscopy (NIRS) models. To study this, typical biological water-matrix samples were simulated by formulating solutions of hemoglobin (Hb), glucose (Glc), and distilled water. Four spectral variable selection algorithms [Competitive Adaptive Reweighted Sampling (CARS), Random Frog (RF), Genetic Algorithm (GA), and Variable Importance in Projection (VIP)] were used to select Hb feature bands that resist water interference, and each was combined with partial least squares (PLS) to build a robust quantitative model of Hb. The applicability and validity of the models were evaluated on three prediction sets, P1, P2, and P3, with different water backgrounds (the formulation method and composition were kept the same, and only the water content increased sequentially). RF, GA, and VIP screened out characteristic Hb wavelengths with low sensitivity to water changes and corrected the water effect, but they retained a large number of variables, many of them redundant or affected by water interference, so the robustness of the resulting models was less than ideal. CARS performed best: the RMSEP values for the three prediction sets were 0.016, 0.017, and 0.038, fairly close to the RMSECV of the calibration set. NIRS combined with variable selection can therefore reduce the effect of water on model robustness and improve prediction accuracy by selecting effective wavenumber intervals, and CARS may be one of the ideal algorithms for such problems.
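
The workflow summarized in the abstract (score the spectral variables, keep a subset, refit PLS, and compare RMSEP on a prediction set with a different water background) can be illustrated with a small sketch. This is not the authors' code or data. It assumes synthetic spectra with one hypothetical Hb band and one hypothetical water band, a minimal NIPALS implementation of single-response PLS, and the common rule of thumb of keeping variables with a VIP score above 1. All function names, band positions, noise levels, and sample counts are illustrative assumptions.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root-mean-square error; called RMSEP when computed on a prediction set."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model with NIPALS and compute VIP scores."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean           # residuals, deflated per component
    n_vars = X.shape[1]
    W = np.zeros((n_vars, n_components))      # unit-norm weight vectors
    P = np.zeros((n_vars, n_components))      # X loadings
    q = np.zeros(n_components)                # y loadings
    ss = np.zeros(n_components)               # y sum of squares explained
    for a in range(n_components):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)
        t = Xr @ w                            # scores for this component
        tt = t @ t
        P[:, a] = Xr.T @ t / tt
        q[a] = yr @ t / tt
        W[:, a] = w
        ss[a] = q[a] ** 2 * tt
        Xr = Xr - np.outer(t, P[:, a])
        yr = yr - q[a] * t
    coef = W @ np.linalg.solve(P.T @ W, q)    # regression vector in X space
    vip = np.sqrt(n_vars * ((W ** 2) @ ss) / ss.sum())
    return {"coef": coef, "x_mean": x_mean, "y_mean": y_mean, "vip": vip}


def pls1_predict(model, X):
    """Predict the response for new spectra with a fitted model."""
    X = np.asarray(X, dtype=float)
    return (X - model["x_mean"]) @ model["coef"] + model["y_mean"]


# Synthetic spectra (assumption): 100 channels, Gaussian bands at made-up positions.
rng = np.random.default_rng(0)
channels = np.arange(100)
hb_band = np.exp(-0.5 * ((channels - 30) / 4.0) ** 2)     # hypothetical Hb band
water_band = np.exp(-0.5 * ((channels - 70) / 6.0) ** 2)  # hypothetical water band


def simulate(n, water_level):
    """Return n noisy spectra and their Hb contents for a given water level."""
    hb = rng.uniform(0.1, 1.0, n)
    water = water_level + rng.normal(0.0, 0.05, n)
    X = np.outer(hb, hb_band) + np.outer(water, water_band)
    X += rng.normal(0.0, 0.01, X.shape)
    return X, hb


X_cal, y_cal = simulate(40, water_level=1.0)   # calibration set
X_p, y_p = simulate(20, water_level=1.5)       # prediction set with more water

full = pls1_fit(X_cal, y_cal, n_components=2)
keep = full["vip"] > 1.0                       # rule-of-thumb VIP threshold
reduced = pls1_fit(X_cal[:, keep], y_cal, n_components=2)

print("variables kept:", int(keep.sum()), "of", keep.size)
print("RMSEP, full spectrum:", rmse(y_p, pls1_predict(full, X_p)))
print("RMSEP, VIP-selected :", rmse(y_p, pls1_predict(reduced, X_p[:, keep])))
```

The sketch covers only a VIP-style filter. CARS, Random Frog, and GA are iterative or stochastic searches and would each need their own implementation, and the numbers printed here come from simulated data, so they say nothing about the values reported in the abstract.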

Source journal
CiteScore: 1.30
Self-citation rate: 14.30%
Articles published: 145
Review time: 2.5 months
About the journal: Journal of Applied Spectroscopy reports on many key applications of spectroscopy in chemistry, physics, metallurgy, and biology. An increasing number of papers focus on the theory of lasers, as well as the tremendous potential for the practical applications of lasers in numerous fields and industries.