Essential Number of Principal Components and Nearly Training-Free Model for Spectral Analysis

Yifeng Bie;Shuai You;Xinrui Li;Xuekui Zhang;Tao Lu
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Publication date: 2024-08-02
DOI: 10.1109/TPAMI.2024.3436860
URL: https://ieeexplore.ieee.org/document/10620616/
Citations: 0

Abstract

Learning-enabled spectroscopic analysis, which is promising for automated real-time analysis of chemicals, faces several challenges. First, a typical machine learning model requires a large number of training samples, which physical systems cannot provide. Second, it requires the testing samples to lie within the range of the training samples, which is often not the case in the real world. Further, a spectroscopy device is limited by its memory size, computing power, and battery capacity, so on-site analysis requires highly efficient learning models. In this paper, by analyzing multi-gas mixtures and multi-molecule suspensions, we first show that the data dimension can be reduced by orders of magnitude, because the number of principal components that need to be retained equals the number of independent constituents in the mixture. Based on this principle, we designed highly compact models in which the essential principal components can be extracted directly from the interrelations between the individual chemical properties and the principal components, and only a few training samples are required. Our model can predict constituent concentrations that have not been seen in the training dataset and provide estimates of measurement noise. This approach can be extended into an effectively standardized method for principal component extraction.
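
The claim that the number of principal components to retain equals the number of independent constituents can be illustrated with a small numerical sketch. This is not the authors' model: it assumes noise-free spectra that mix linearly in the constituent concentrations, and the Gaussian line shapes, the function names, and all parameter values are hypothetical choices made only for the illustration.

```python
import numpy as np


def make_mixture_spectra(n_samples=50, n_constituents=3, n_channels=2000, seed=0):
    """Build synthetic mixture spectra (assumption: linear mixing, no noise)."""
    rng = np.random.default_rng(seed)
    axis = np.linspace(0.0, 1.0, n_channels)
    # Hypothetical pure-constituent spectra: one Gaussian line per constituent,
    # placed at evenly spaced, well-separated positions.
    centers = np.linspace(0.2, 0.8, n_constituents)
    pure = np.exp(-0.5 * ((axis[None, :] - centers[:, None]) / 0.02) ** 2)
    # Independent random concentrations for every sample.
    concentrations = rng.uniform(0.0, 1.0, size=(n_samples, n_constituents))
    # Each row is one measured spectrum: a weighted sum of the pure spectra.
    return concentrations @ pure


def count_significant_components(spectra, rel_tol=1e-8):
    """Count principal components whose singular value is not numerically zero."""
    centered = spectra - spectra.mean(axis=0)  # PCA works on mean-centered data
    singular_values = np.linalg.svd(centered, compute_uv=False)
    return int(np.sum(singular_values > rel_tol * singular_values[0]))


spectra = make_mixture_spectra(n_constituents=3)
print("channels per spectrum:", spectra.shape[1])
print("significant principal components:", count_significant_components(spectra))
```

Under these assumptions each spectrum has 2000 channels, yet the centered data matrix has only as many non-negligible singular values as there are constituents. With measurement noise the trailing singular values would no longer vanish, so a real analysis would need a noise-aware threshold rather than the fixed `rel_tol` used here.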