Semi-Supervised Deep Learning-Based Multi-component Spectral Calibration Modeling for UV–vis and Near-Infrared Spectroscopy without Information Loss

Impact factor: 6.7 | Zone 1: Chemistry | JCR Q1: Chemistry, Analytical
Fei Cheng, Chunhua Yang, Hongqiu Zhu, Yonggang Li, Lijuan Lan, and Kai Wang*
Analytical Chemistry, 2023, 95 (36), 13446–13455. Published 2023-08-28.
DOI: 10.1021/acs.analchem.3c01132
https://pubs.acs.org/doi/10.1021/acs.analchem.3c01132

Abstract

Spectral analysis is an important method for characterizing and identifying chemical species. However, quantitative spectral analysis of multiple chemical properties in real-world samples remains challenging because spectral features are strongly correlated, noisy, and heavily overlapping. Here, we present NICEM, a new semi-supervised spectral calibration method based on lossless decoupling of spectral features. To separate and extract key latent features, the method uses the flow-based model non-linear independent component estimation (NICE) to learn the sample distribution. The reversible structure of the deep network transforms the spectral data into independent, Gaussian-distributed latent variables without information loss, which reveals the essential properties of the data and yields a nonlinear decomposition of the features. Moreover, the association between the latent feature variables and the target attributes is evaluated with the maximum mutual information coefficient, which eliminates the adverse effects of irrelevant information in the latent variable space and retains the key information. Because the latent variables are independent in each dimension, NICEM makes it easier to establish an accurate semi-supervised multi-component calibration model, even for highly overlapping and complex spectral data. The applicability of the method is demonstrated on three ultraviolet–visible and near-infrared spectral data sets covering 15 physical and chemical properties of diesel fuels, corn, and multi-metal ion solutions. Results show that NICEM achieves the highest coefficient of determination (R²) and significantly improves extrapolation compared with seven state-of-the-art methods. The proposed method is intuitive because it obviates complex feature engineering and prior knowledge, and it is a promising spectral calibration tool for quantitative analysis in other spectroscopy applications.
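
The "without information loss" property comes from the invertible layers that flow-based models such as NICE are built from. The following is a minimal sketch of one additive coupling layer, not the authors' implementation: the `shift` function is a hypothetical toy stand-in for a learned network, the four-point `spectrum` is made-up data, and the function names are chosen here for illustration only.

```python
import math


def shift(x1):
    """Toy stand-in for the learned network m(.); hypothetical, not from the paper."""
    return [math.tanh(2.0 * v) + 0.5 * v * v for v in x1]


def coupling_forward(x):
    """Additive coupling: keep the first half, shift the second half by m(first half)."""
    if len(x) % 2 != 0:
        raise ValueError("this sketch assumes an even number of features")
    half = len(x) // 2
    x1, x2 = x[:half], x[half:]
    y2 = [b + s for b, s in zip(x2, shift(x1))]
    return x1 + y2


def coupling_inverse(y):
    """Exact inverse: recompute the same shift from the untouched half and subtract it."""
    if len(y) % 2 != 0:
        raise ValueError("this sketch assumes an even number of features")
    half = len(y) // 2
    y1, y2 = y[:half], y[half:]
    x2 = [b - s for b, s in zip(y2, shift(y1))]
    return y1 + x2


# Hypothetical 4-point "spectrum" (absorbance values), for illustration only.
spectrum = [0.12, 0.48, 0.95, 0.33]
latent = coupling_forward(spectrum)
recovered = coupling_inverse(latent)

# Reconstruction error is at the level of floating-point rounding.
max_error = max(abs(a - b) for a, b in zip(spectrum, recovered))
```

Because the shift depends only on the half that passes through unchanged, the layer can be undone exactly no matter how complicated `shift` is, and its Jacobian determinant is 1. A full model would stack several such layers, alternating which half is shifted, and train them so that the latent variables follow a Gaussian distribution.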


Source journal: Analytical Chemistry (Chemistry, Analytical Chemistry)
CiteScore: 12.10
Self-citation rate: 12.20%
Articles published: 1949
Review time: 1.4 months
About the journal: Analytical Chemistry, a peer-reviewed research journal, focuses on disseminating new and original knowledge across all branches of analytical chemistry. Fundamental articles may explore general principles of chemical measurement science and need not directly address existing or potential analytical methodology. They can be entirely theoretical or report experimental results. Contributions may cover various phases of analytical operations, including sampling, bioanalysis, electrochemistry, mass spectrometry, microscale and nanoscale systems, environmental analysis, separations, spectroscopy, chemical reactions and selectivity, instrumentation, imaging, surface analysis, and data processing. Papers discussing known analytical methods should present a significant, original application of the method, a notable improvement, or results on an important analyte.