Segment-Weighting Similarity-Based Fragment-Learning Model for Single-Cell Raman Spectral Analysis
Authors: LangLang Yi, Qiudi Ye, Chaofan Wang, Qingqing Hu, Shiya Zhang, Xiaokun Shen, Minghui Liang, Guoqian Li, Klyuyev Dmitriy, Yang Guo, Qiang Yu, Bo Hu
Journal: Analytical Chemistry (JCR Q1, Chemistry, Analytical; impact factor 6.7)
Published: 2025-05-26
DOI: 10.1021/acs.analchem.5c01189 (https://doi.org/10.1021/acs.analchem.5c01189)
Raman spectroscopy provides intrinsic biochemical profiles of all cellular biomolecules in a segmented manner, promising nondestructive and label-free phenotyping at the single-cell level. However, current analytical methods rarely exploit the biological characteristics of spectra or fuse them with data characteristics, which limits their application to biological Raman spectroscopy. Herein, a segment-weighting similarity-based fragment-learning (SWS-FL) model, integrating SWS-based feature extraction and fusion learning, is proposed to fuse biological and data characteristics for single-cell spectral analysis. The model segments spectra into fragments and differentiates their biological characteristics when fusing feature matrices. The SWS-based feature extraction constructs a group of low-dimensional feature vectors at multiple N values, providing a more distinguishable feature space than conventional KNN. The weights of the five fragments (the fingerprint region, protein I region, mixed region, protein II region, and genetic material region) are assigned as 0.282, 0.302, 0.273, 0.276, and 0.239, respectively, which highlights the spectral biological characteristics. The fusion learning process synthesizes characteristics from all spectral fragments using an ANN, and the accuracy varies by only 0.5% across N values from 1 to 30, greatly enhancing the robustness of the model. In the five-class classification task of breast cancer cells and their subtypes, the accuracy and kappa coefficient of SWS-FL reach 94.9% and 0.943, respectively, which are 5% and 7% higher than those of the ANN. The generalization capability is also validated on a data set of lung cancer cells and their subtypes. This model provides a new path for the fusion of biological and data characteristics in spectral analysis and promises to be a powerful analytical framework in more spectroscopic areas.
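The abstract does not spell out how the segment-weighted similarity features are computed, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that a spectrum is split into five contiguous fragments of equal length, that per-fragment similarity is cosine similarity, that the fragment weights quoted in the abstract scale each fragment's contribution, and that the feature for a given N and class is the mean weighted similarity to the N most similar reference spectra of that class. The function names (`split_fragments`, `weighted_similarity`, `sws_features`) and the synthetic data are hypothetical.

```python
import numpy as np

# Fragment weights quoted in the abstract, in the order: fingerprint,
# protein I, mixed, protein II, genetic material.
WEIGHTS = np.array([0.282, 0.302, 0.273, 0.276, 0.239])


def split_fragments(spectra, n_fragments=5):
    """Split each spectrum (row) into contiguous fragments.

    Equal-width splitting is an assumption; real fragment boundaries
    would be set by wavenumber ranges.
    """
    return np.array_split(spectra, n_fragments, axis=1)


def cosine_similarity(a, b):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_unit @ b_unit.T


def weighted_similarity(query, reference, weights=WEIGHTS):
    """Weighted average of per-fragment cosine similarities (assumed form)."""
    q_frags = split_fragments(query, len(weights))
    r_frags = split_fragments(reference, len(weights))
    sims = [w * cosine_similarity(q, r)
            for w, q, r in zip(weights, q_frags, r_frags)]
    return np.sum(sims, axis=0) / weights.sum()


def sws_features(query, reference, labels, n_values=(1, 3, 5)):
    """One feature per (N, class): mean similarity to the N nearest
    reference spectra of that class."""
    sim = weighted_similarity(query, reference)
    features = []
    for n in n_values:
        for c in np.unique(labels):
            # Sort similarities to class c in descending order per query row.
            class_sim = np.sort(sim[:, labels == c], axis=1)[:, ::-1]
            features.append(class_sim[:, :n].mean(axis=1))
    return np.stack(features, axis=1)


# Synthetic stand-in data: 20 reference spectra, 100 points each, 5 classes.
rng = np.random.default_rng(0)
reference = rng.random((20, 100)) + 0.1
labels = np.repeat(np.arange(5), 4)
query = reference[:3]

features = sws_features(query, reference, labels)
print(features.shape)  # (3, 15): 3 N values x 5 classes per query spectrum
```

In this reading, the resulting low-dimensional feature matrix would be what a downstream classifier such as an ANN consumes; stacking several N values in one feature vector is one way the final prediction could become less sensitive to any single choice of N.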
About the journal:
Analytical Chemistry, a peer-reviewed research journal, focuses on disseminating new and original knowledge across all branches of analytical chemistry. Fundamental articles may explore general principles of chemical measurement science and need not directly address existing or potential analytical methodology. They can be entirely theoretical or report experimental results. Contributions may cover various phases of analytical operations, including sampling, bioanalysis, electrochemistry, mass spectrometry, microscale and nanoscale systems, environmental analysis, separations, spectroscopy, chemical reactions and selectivity, instrumentation, imaging, surface analysis, and data processing. Papers discussing known analytical methods should present a significant, original application of the method, a notable improvement, or results on an important analyte.