Segment-Weighting Similarity-Based Fragment-Learning Model for Single-Cell Raman Spectral Analysis

IF 6.7 | Zone 1, Chemistry | JCR Q1, CHEMISTRY, ANALYTICAL
LangLang Yi, Qiudi Ye, Chaofan Wang, Qingqing Hu, Shiya Zhang, Xiaokun Shen, Minghui Liang, Guoqian Li, Klyuyev Dmitriy, Yang Guo, Qiang Yu, Bo Hu
DOI: 10.1021/acs.analchem.5c01189 (https://doi.org/10.1021/acs.analchem.5c01189)
Journal: Analytical Chemistry
Publication date: 2025-05-26
Publication type: Journal Article
Open access: no
Citations: 0

Abstract

Raman spectroscopy provides intrinsic biochemical profiles of all cellular biomolecules in a segmented manner, promising nondestructive and label-free phenotyping at the single-cell level. However, current analytical methods rarely exploit the biological characteristics of spectra or fuse them with data characteristics, which limits their application to biological Raman spectroscopy. Herein, a segment-weighting similarity-based fragment-learning (SWS-FL) model, integrating SWS-based feature extraction and fusion learning, is proposed to fuse biological and data characteristics for single-cell spectral analysis. The model segments spectra into fragments and differentiates their biological characteristics when fusing the feature matrices. The SWS-based feature extraction constructs a group of low-dimensional feature vectors at multiple N values, providing a more distinguishable feature space than conventional KNN. The weights of the five fragments (the fingerprint region, protein I region, mixed region, protein II region, and genetic material region) are assigned as 0.282, 0.302, 0.273, 0.276, and 0.239, respectively, which highlights the spectral biological characteristics. The fusion learning process synthesizes characteristics from all spectral fragments using an ANN; its accuracy varies by only 0.5% across N values from 1 to 30, greatly enhancing the robustness of the model. In the five-class classification task of breast cancer cells and their subtypes, the accuracy and kappa coefficient of SWS-FL reach 94.9% and 0.943, respectively, which are 5% and 7% higher than those of ANN. The generalization capability is also validated on a data set of lung cancer cells and their subtypes. This model provides a new path for the fusion of biological and data characteristics in spectral analysis and promises to be a powerful analytical framework in more spectroscopic areas.
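The abstract does not give the formula behind the segment-weighting similarity, so the following Python sketch is only one plausible reading of the idea, not the authors' implementation. It assumes that a spectrum is a list of intensities, that fragments are index ranges (the `bounds` values are hypothetical), that per-fragment similarity is cosine similarity, and that the feature for each class is the mean of the N highest weighted similarities to training spectra of that class. Only the five weights are taken from the abstract; every function name is illustrative.

```python
import math

# Fragment weights quoted in the abstract, in the order: fingerprint,
# protein I, mixed, protein II, genetic material.
FRAGMENT_WEIGHTS = [0.282, 0.302, 0.273, 0.276, 0.239]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def weighted_similarity(s1, s2, bounds, weights):
    """Weighted sum of per-fragment cosine similarities, divided by the total weight.

    ASSUMPTION: cosine similarity and this normalisation are illustrative choices.
    """
    total = sum(weights)
    score = 0.0
    for (lo, hi), w in zip(bounds, weights):
        score += w * cosine(s1[lo:hi], s2[lo:hi])
    return score / total


def sws_features(query, train_spectra, train_labels, classes, bounds, weights, n):
    """One feature per class: mean of the n highest weighted similarities to that class."""
    feats = []
    for c in classes:
        sims = sorted(
            (weighted_similarity(query, s, bounds, weights)
             for s, y in zip(train_spectra, train_labels) if y == c),
            reverse=True,
        )
        top = sims[:n]
        feats.append(sum(top) / len(top) if top else 0.0)
    return feats


def multi_n_features(query, train_spectra, train_labels, classes, bounds, weights,
                     n_values=(1, 5, 10)):
    """Concatenate the per-class features computed at several N values."""
    out = []
    for n in n_values:
        out.extend(sws_features(query, train_spectra, train_labels,
                                classes, bounds, weights, n))
    return out


if __name__ == "__main__":
    # Toy 10-point "spectra" split into five 2-point fragments (hypothetical bounds).
    bounds = [(0, 2), (2, 4), (4, 6), (6, 8), (8, 10)]
    train = [[1, 0] * 5, [0, 1] * 5]
    labels = ["A", "B"]
    print(sws_features([1, 0] * 5, train, labels, ["A", "B"],
                       bounds, FRAGMENT_WEIGHTS, n=1))
```

Under these assumptions, the vector returned by `multi_n_features` has one entry per class per N value. In the model described above, vectors of this kind would be passed to the ANN fusion stage, which is not sketched here.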
Source journal: Analytical Chemistry (Chemistry, Analytical Chemistry)
CiteScore: 12.10
Self-citation rate: 12.20%
Articles published: 1949
Review time: 1.4 months
About the journal: Analytical Chemistry, a peer-reviewed research journal, focuses on disseminating new and original knowledge across all branches of analytical chemistry. Fundamental articles may explore general principles of chemical measurement science and need not directly address existing or potential analytical methodology. They can be entirely theoretical or report experimental results. Contributions may cover various phases of analytical operations, including sampling, bioanalysis, electrochemistry, mass spectrometry, microscale and nanoscale systems, environmental analysis, separations, spectroscopy, chemical reactions and selectivity, instrumentation, imaging, surface analysis, and data processing. Papers discussing known analytical methods should present a significant, original application of the method, a notable improvement, or results on an important analyte.