Integrating a host transcriptomic biomarker with a large language model for diagnosis of lower respiratory tract infection

Hoang Van Phan, Natasha Spottiswoode, Emily C. Lydon, Victoria T. Chu, Adolfo Cuesta, Alexander D. Kazberouk, Natalie L. Richmond, Carolyn S. Calfee, Charles R. Langelier
Journal: medRxiv - Infectious Diseases
Publication date: 2024-08-29
DOI: 10.1101/2024.08.28.24312732
URL: https://doi.org/10.1101/2024.08.28.24312732

Abstract

Lower respiratory tract infections (LRTIs) are a leading cause of mortality worldwide. Despite this, diagnosing LRTI remains challenging, particularly in the intensive care unit, where non-infectious respiratory conditions can present with similar features. Here, we tested a new method for LRTI diagnosis that combines the transcriptomic biomarker FABP4 with assessment of text from the electronic medical record (EMR) using the large language model Generative Pre-trained Transformer 4 (GPT-4). We evaluated this methodology in a prospective cohort of critically ill adults with acute respiratory failure, in which we measured pulmonary FABP4 expression and identified patients with LRTI or non-infectious conditions using retrospective adjudication. A diagnostic classifier combining FABP4 and GPT-4 achieved an area under the receiver operating characteristic curve (AUC) of 0.92 ± 0.06 by five-fold cross-validation (CV), outperforming classifiers based on FABP4 expression alone (AUC 0.83) or GPT-4 alone (AUC 0.84). At the threshold maximizing Youden's index within each CV fold, the combined classifier achieved a mean sensitivity of 92% ± 7%, specificity of 90% ± 17%, and accuracy of 91% ± 8%. Taken together, our findings suggest that combining a host transcriptional biomarker with interpretation of EMR data using artificial intelligence is a promising new approach to infectious disease diagnosis.
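
The two evaluation quantities named in the abstract, AUC and the Youden's index threshold, can be illustrated with a short sketch. This is not the authors' code: the function names `roc_auc` and `youden_cutoff` are hypothetical, the labels and scores are invented toy values, and the abstract does not specify how the FABP4 and GPT-4 signals were combined into a score. The sketch only shows how, given any classifier's scores, one could compute the AUC and pick the cutoff that maximizes J = sensitivity + specificity - 1.

```python
from typing import List, Tuple


def roc_auc(labels: List[int], scores: List[float]) -> float:
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def youden_cutoff(labels: List[int], scores: List[float]) -> Tuple[float, float, float]:
    """Return (threshold, sensitivity, specificity) at the cutoff that
    maximizes Youden's J. A case is called positive when score >= threshold.
    If several cutoffs tie, the lowest one is kept."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")
    best_j = float("-inf")
    best = (0.0, 0.0, 0.0)
    for thr in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= thr)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < thr)
        sens = tp / n_pos
        spec = tn / n_neg
        j = sens + spec - 1.0
        if j > best_j:
            best_j = j
            best = (thr, sens, spec)
    return best


# Invented toy data: 1 = LRTI, 0 = non-infectious condition.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

auc = roc_auc(toy_labels, toy_scores)
threshold, sensitivity, specificity = youden_cutoff(toy_labels, toy_scores)
print(auc, threshold, sensitivity, specificity)
```

In a cross-validated setting such as the one described above, these functions would be applied to the held-out scores of each fold, and the per-fold values then averaged.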