Utilising routinely collected clinical data through time series deep learning to improve identification of bacterial bloodstream infections: a retrospective cohort study

Damien K Ming PhD, Vasin Vasikasin PhD, Timothy M Rawson PhD, Prof Pantelis Georgiou PhD, Frances J Davies PhD, Prof Alison H Holmes FMedSci, Bernard Hernandez PhD
Lancet Digital Health, volume 7, issue 3, pages e205-e215. Published 2025-02-25. DOI: 10.1016/j.landig.2025.01.010. URL: https://www.sciencedirect.com/science/article/pii/S258975002500010X

Abstract

Background

Blood cultures are the gold standard for diagnosing bacterial bloodstream infections, but test results are only available 24–48 h after sampling. We aimed to develop and evaluate models using health-care data to predict bloodstream infections in patients admitted to hospital.

Methods

In this retrospective cohort study, we used routinely collected blood biomarkers and demographic data from patients who underwent blood sample collection for testing via culture between March 3, 2014, and Dec 1, 2021, at Imperial College Healthcare NHS Trust (London, UK) as model features. Data up to 14 days before blood sample collection were provided to long short-term memory (LSTM) or static logistic regression models. The primary outcome was prediction of blood culture results, defined as a pathogenic bloodstream infection (ie, isolation of pathogenic bacteria of interest) or no bloodstream infection (ie, no growth or contamination). Data collected up to Feb 28, 2021 (n=15 212) comprised the training set and were evaluated against a temporal hold-out test set comprising patients who were sampled after March 1, 2021 (n=5638).
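The temporal hold-out split and the 14-day look-back described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the record layout (`sampled_on`, `taken_on`, `value`), the function names, and the toy values are all hypothetical, and the cutoff assumes that samples taken on or before Feb 28, 2021 go to training and later samples go to the test set.

```python
from datetime import date, timedelta

# Assumed cutoff: last sampling date that belongs to the training set.
TRAIN_END = date(2021, 2, 28)


def temporal_split(samples, train_end=TRAIN_END):
    """Split blood-culture samples into train/test by sampling date."""
    train = [s for s in samples if s["sampled_on"] <= train_end]
    test = [s for s in samples if s["sampled_on"] > train_end]
    return train, test


def lookback_window(measurements, sampled_on, days=14):
    """Keep measurements from the `days` before sampling, oldest first."""
    start = sampled_on - timedelta(days=days)
    kept = [m for m in measurements if start <= m["taken_on"] <= sampled_on]
    return sorted(kept, key=lambda m: m["taken_on"])


# Hypothetical toy records, for illustration only.
samples = [
    {"patient": "A", "sampled_on": date(2020, 6, 1), "label": 1},
    {"patient": "B", "sampled_on": date(2021, 2, 28), "label": 0},
    {"patient": "C", "sampled_on": date(2021, 3, 2), "label": 0},
]
crp = [
    {"taken_on": date(2020, 5, 10), "value": 12.0},  # 22 days before: dropped
    {"taken_on": date(2020, 5, 30), "value": 85.0},
    {"taken_on": date(2020, 5, 25), "value": 40.0},
]

train, test = temporal_split(samples)
window = lookback_window(crp, samples[0]["sampled_on"])
```

Splitting by date rather than at random keeps every test patient later in time than every training patient, which is closer to how such a model would be used prospectively. The ordered window is the kind of sequence a time series model such as an LSTM would consume, whereas a static model would see only a summary of it.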

Findings

Among 20 850 patients with available data, pathogenic bacteria were observed in the cultured blood samples of 3866 (18·5%) patients. 2920 (62·2%) of 4897 patients who had their blood samples taken more than 48 h after admission to hospital had pathogenic bloodstream infections, and so were defined as having hospital-acquired bloodstream infections. Including data from the 7 days before admission (7-day window approach) and using five-fold cross-validation in the training set gave an area under the receiver operating characteristic curve (AUROC) of 0·75 (IQR 0·68–0·82) and an area under the precision-recall curve (AUPRC) of 0·58 (0·46–0·77) for static models. The corresponding values for the LSTM model were an AUROC of 0·92 (0·91–0·93) and an AUPRC of 0·75 (0·72–0·76). In the hold-out test set, the static models achieved an AUROC of 0·74 (95% CI 0·70–0·78) and an AUPRC of 0·48 (0·43–0·53), and the LSTM model achieved an AUROC of 0·97 (0·96–0·97) and an AUPRC of 0·65 (0·60–0·70). Removal of time series information resulted in lower model performance, particularly for hospital-acquired bloodstream infections. Dynamics of C-reactive protein concentration, eosinophil count, and platelet count were important features for prediction of blood culture results.
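For readers less familiar with the two metrics reported above, the sketch below shows one common way to compute them from predicted scores. It is an illustration under stated assumptions, not the authors' evaluation code: the function names and toy data are hypothetical, AUROC is computed as the pairwise ranking probability, and AUPRC is approximated by step-wise average precision, assuming no tied scores.

```python
def auroc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (no tied scores)."""
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive")
    ranked = sorted(zip(scores, labels), reverse=True)
    hits, ap = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            ap += hits / rank  # precision at each recalled positive
    return ap / total_pos


# Hypothetical toy predictions: 1 = pathogenic bloodstream infection.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.1]

roc = auroc(labels, scores)
prc = average_precision(labels, scores)
```

When reading AUPRC values, it helps to remember that an uninformative classifier is expected to score roughly the outcome prevalence (18·5% in the full cohort here), whereas the corresponding AUROC baseline is 0·5.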

Interpretation

Deep learning models accounting for longitudinal changes could support individualised clinical decision making for patients at risk of bloodstream infections. Appropriate implementation into existing diagnostic pathways could enhance diagnostic stewardship and reduce unnecessary antimicrobial prescribing.

Funding

UK Department of Health and Social Care, the National Institute for Health and Care Research, and the Wellcome Trust.