Raman spectroscopic deep learning with signal aggregated representations for enhanced cell phenotype and signature identification

Songlin Lu, Yuanfang Huang, Wan Xiang Shen, Yu Lin Cao, Mengna Cai, Yan Chen, Ying Tan, Yu Yang Jiang, Yu Zong Chen

PNAS Nexus, published 2024-07-03. DOI: 10.1093/pnasnexus/pgae268

Citations: 0
Abstract
Feature representation is critical for data learning, particularly for learning spectroscopic data. Machine learning (ML) and deep learning (DL) models learn Raman spectra for rapid, non-destructive, and label-free cell phenotype identification, facilitating diagnostic, therapeutic, forensic, and microbiological applications. However, these models are challenged by high-dimensional, unordered, and low-sample spectroscopic data. Here we introduce novel 2D, image-like dual signal and component aggregated representations, built by restructuring Raman spectra and principal components, which enable spectroscopic DL for enhanced cell phenotype and signature identification. The new ConvNet models, DSCARNets, significantly outperformed the state-of-the-art (SOTA) ML and DL models on six benchmark datasets, mostly with >2% improvement over the SOTA performance of 85%–97% accuracies. DSCARNets also performed well on four additional datasets against SOTA models of extremely high performance (>98%) and on two datasets without a published supervised phenotype classification model. Explainable DSCARNets identified Raman signatures consistent with experimental indications.
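The core idea in the abstract, restructuring a 1D Raman spectrum and its principal-component scores into a 2D, image-like input that a standard ConvNet can consume, can be sketched as follows. This is a minimal illustrative toy, not the paper's actual DSCAR construction: the channel layout, padding scheme, and function names below are assumptions made for demonstration.

```python
import numpy as np

def signal_channel(spectrum: np.ndarray, side: int = 32) -> np.ndarray:
    """Zero-pad a 1D spectrum to side*side points and reshape it
    row-major into a square 2D array -- a toy 'signal' channel.
    (Illustrative only; not the paper's aggregation scheme.)"""
    flat = np.zeros(side * side)
    n = min(spectrum.size, side * side)
    flat[:n] = spectrum[:n]
    return flat.reshape(side, side)

def component_channel(spectra: np.ndarray, side: int = 32) -> np.ndarray:
    """Project centered spectra onto their principal axes (via SVD)
    and tile each sample's score vector into a square 'component'
    channel by repeating the scores across rows."""
    X = spectra - spectra.mean(axis=0)          # center the dataset
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    scores = X @ Vt[:side].T                    # (n_samples, <= side) PC scores
    chan = np.zeros((spectra.shape[0], side, side))
    chan[:, :, :scores.shape[1]] = scores[:, None, :]   # broadcast per row
    return chan

# Stacking the two channels for one sample yields shape (2, side, side),
# a standard multi-channel 2D input for an off-the-shelf ConvNet.
```

With both channels stacked, a conventional 2D ConvNet can exploit local structure in the reshaped spectrum alongside global variance captured by the principal components, which is the general motivation for image-like spectral representations.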