Comprehensive metabolomics combined with machine learning for the identification of SARS-CoV-2 and other viruses directly from upper respiratory samples.

IF 5.4 | Zone 2 (Medicine) | JCR Q1 (Microbiology)
Catherine A Hogan, Anthony T Le, Afraz Khan, LingHui David Su, ChunHong Huang, Malaya K Sahoo, Chieh-Wen Lo, Marwah Karim, Karin Ann Stein, Shirit Einav, Tina M Cowan, Benjamin A Pinsky
Journal of Clinical Microbiology, e0204224. Published 2025-10-09. Publication type: Journal Article. DOI: 10.1128/jcm.02042-24 (https://doi.org/10.1128/jcm.02042-24)
Citations: 0

Abstract

Metabolic profiling of respiratory samples from individuals infected and uninfected with respiratory viral infections may identify biomarker signatures that complement routine clinical diagnostic testing and offer unique insights into pathophysiology. We used liquid chromatography quadrupole time-of-flight mass spectrometry to generate untargeted metabolomic profiles and identified top biomarker signatures differentiating severe acute respiratory syndrome coronavirus type 2 (SARS-CoV-2) positive from negative samples via machine learning. We then adapted these signatures to liquid chromatography-tandem mass spectrometry for targeted profiling and assessed classification performance, including samples positive for other respiratory viruses and negative for viral testing. A total of 1,226 samples were tested, including 521 positive samples for SARS-CoV-2, 97 for influenza A, 96 for respiratory syncytial virus (RSV), 211 for other respiratory viruses, and 301 negative samples. The top-performing model was the Light Gradient Boosting Model, which showed an area under the receiver operating characteristic curve (AUC) of 0.99 (95% confidence interval [CI], 0.99-1.00), sensitivity of 0.96 (95% CI, 0.91-0.99), and specificity of 0.95 (95% CI, 0.90-0.97). A separate machine learning analysis investigating the performance by viral subtype showed high performance for the identification of influenza A virus with an AUC of 0.97 (95% CI, 0.94-0.99) and RSV with an AUC of 0.99 (95% CI, 0.97-1.00). The two features with the highest ranking were identified as 3-oxo-heneicosanoic acid and 2-(4-hydroxyphenyl) ethanol. 
These findings extend our understanding of the metabolic impact of respiratory viral infections and support the potential of metabolomics to complement routine clinical diagnostic methods.

IMPORTANCE

Molecular testing has greatly improved how viruses are diagnosed; however, gaps remain, including limited sensitivity directly from specimens and inability to differentiate active from resolved infection. In this study, we investigated the use of a distinct diagnostic approach, mass spectrometry for detection of metabolites (small molecules) combined with machine learning analysis, for the diagnosis of SARS-CoV-2 and other respiratory viruses. We demonstrated strong performance of this approach directly from upper respiratory swab samples to differentiate SARS-CoV-2-infected versus uninfected individuals. Extension of this approach to influenza and RSV maintained a high level of performance. This research suggests that mass spectrometry-based infectious disease diagnostic testing has clinical potential and that these metabolomic features may reveal novel host-pathogen interactions and therapeutic targets. Applying a similar approach to prospective, multisite cohorts of patients with other infectious diseases carries potential to extend our understanding of the metabolic pathways involved in the host response to infection.
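
The abstract reports classification performance as AUC, sensitivity, and specificity, each with a 95% confidence interval. The sketch below shows one common way such figures can be computed from binary labels and classifier scores. It is an illustration only: the abstract does not state how the intervals were derived, so the Wilson score interval used here is an assumption, and the function names (`wilson_interval`, `sensitivity_specificity`, `auc_rank`) and the toy labels and scores are hypothetical and are not study data.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%).

    Assumption: the study's actual CI method is not stated in the abstract.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 1 = positive, 0 = negative."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc_rank(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as one half. This pairwise form is O(n_pos * n_neg) and is
    meant for clarity, not for large data sets.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the study.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0]
    scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 cutoff is arbitrary

    sens, spec = sensitivity_specificity(y_true, y_pred)
    print("sensitivity:", sens, "specificity:", spec)
    print("AUC:", auc_rank(y_true, scores))
    print("Wilson 95% CI for 96/100:", wilson_interval(96, 100))
```

Sensitivity and specificity depend on the chosen score cutoff, whereas AUC summarizes ranking quality across all cutoffs, which is why the two kinds of figures are reported side by side.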

Source journal
Journal of Clinical Microbiology (Medicine: Microbiology)
CiteScore: 17.10
Self-citation rate: 4.30%
Articles published: 347
Review time: 3 months
About the journal: The Journal of Clinical Microbiology® disseminates the latest research concerning the laboratory diagnosis of human and animal infections, along with the laboratory's role in epidemiology and the management of infectious diseases.