Development and application of an early prediction model for risk of bloodstream infection based on real-world study.

IF 3.3; Zone 3 (Medicine); JCR Q2 (Medical Informatics)
Xiefei Hu, Shenshen Zhi, Yang Li, Yuming Cheng, Haiping Fan, Haorong Li, Zihao Meng, Jiaxin Xie, Shu Tang, Wei Li
BMC Medical Informatics and Decision Making, vol. 25, no. 1, p. 186. Journal Article. Published 2025-05-14.
DOI: https://doi.org/10.1186/s12911-025-03020-9
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12079808/pdf/
Citations: 0

Abstract

Background: Bloodstream infection (BSI) is a severe systemic infectious disease that can lead to sepsis and multiple organ dysfunction syndrome (MODS), resulting in high mortality and posing a major global public health burden. Early identification of BSI is crucial for effective intervention, reducing mortality, and improving patient outcomes. However, existing diagnostic methods are limited by low specificity, long detection times, and high demands on testing platforms. Advances in artificial intelligence offer a new approach to early disease identification. This study aims to identify the optimal combination of routine laboratory data and clinical monitoring indicators, and to use machine learning algorithms to construct an early, rapid, and universally applicable BSI risk prediction model that assists the early diagnosis of BSI in clinical practice.

Methods: Clinical data were collected from 2582 patients with suspected BSI admitted to Chongqing University Central Hospital from January 1, 2021 to December 31, 2023. The data were split chronologically into a modeling dataset and an external validation dataset, and the modeling dataset was further divided into a training set and an internal validation set. The incidence of BSI, the distribution of pathogens, and the primary microbiological reporting time were analyzed within the training set. Feature selection combined univariate regression with machine learning (ML) algorithms. First, univariate logistic regression was used to screen for predictors of BSI. Then the Boruta algorithm, Lasso regression, and recursive feature elimination with cross-validation (RFE-CV) were employed to determine the optimal combination of predictors. Based on this combination, six machine learning algorithms were used to construct early BSI risk prediction models. The best model was selected according to performance, and the Shapley Additive Explanations (SHAP) method was used to explain it. The external validation set was used to evaluate the predictive performance and generalizability of the selected model, and the findings were ultimately applied in clinical practice.
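The chronological split described in the Methods can be illustrated with a minimal sketch. The abstract does not state the cutoff date, the split ratio, or how the modeling dataset was divided, so the cutoff of January 1, 2023, the 70/30 random split, the function name `chronological_split`, and the `admitted` field are all assumptions chosen for illustration, and the records are synthetic.

```python
import random
from datetime import date


def chronological_split(records, cutoff, train_fraction=0.7, seed=0):
    """Split records into training, internal validation, and external validation sets.

    records: list of dicts with an 'admitted' date (hypothetical field name).
    cutoff: admissions on or after this date form the external validation set.
    train_fraction: share of the earlier (modeling) records used for training;
        the random split and the 0.7 default are assumptions, not taken from the study.
    """
    modeling = [r for r in records if r["admitted"] < cutoff]
    external = [r for r in records if r["admitted"] >= cutoff]

    # Shuffle a copy of the modeling data with a fixed seed for reproducibility.
    shuffled = modeling[:]
    random.Random(seed).shuffle(shuffled)

    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:], external


# Synthetic records: 10 admissions in each of 2021, 2022, and 2023.
records = [
    {"id": i, "admitted": date(2021 + i % 3, 1 + i % 12, 1)}
    for i in range(30)
]

train, internal, external = chronological_split(records, cutoff=date(2023, 1, 1))
print(len(train), len(internal), len(external))
```

Holding out the most recent admissions means the external set is never seen during feature selection or model fitting, which is what allows it to serve as a check on generalizability over time.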

Results: The incidence of BSI among inpatients at Chongqing University Central Hospital was 12.91%. Feature selection yielded a set of five variables: white blood cell count, standard bicarbonate, base excess of extracellular fluid, interleukin-6, and body temperature. Early BSI risk prediction models were constructed with six machine learning algorithms; the XGBoost model performed best, achieving an AUC of 0.782 in the internal validation set and 0.776 in the external validation set. The model has been made publicly available as an online web tool for clinical use.
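For readers unfamiliar with the AUC metric reported above, the following sketch shows one standard way to compute it: as the fraction of (BSI, non-BSI) patient pairs in which the BSI patient receives the higher predicted risk, with ties counted as half. The function name `roc_auc` and the toy labels and scores are illustrative assumptions and are not data from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    labels: 1 for a positive case (e.g. BSI), 0 for a negative case.
    scores: predicted risk for each case; higher means more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half a correct ordering
    return wins / (len(pos) * len(neg))


# Toy example (made-up values): 3 positive and 3 negative cases.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4]
auc = roc_auc(labels, scores)
print(round(auc, 3))  # 7 of 9 pairs are ordered correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so values near 0.78 indicate that the model orders roughly four out of five such pairs correctly.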

Conclusions: This study identified a set of five features by analyzing routine laboratory data and clinical monitoring indicators among hospitalized patients. Based on this set, a machine learning-based early risk prediction model for BSI was constructed. The model is capable of early and rapid differentiation between BSI and non-BSI patients. Because it requires only a small number of predictors, it is well suited to clinical settings, particularly at the primary care level. Offered as an online application to improve real-world applicability and convenience for clinical use, the model could greatly improve the efficiency of BSI diagnosis and reduce patient mortality.

Source journal: BMC Medical Informatics and Decision Making
CiteScore: 7.20
Self-citation rate: 5.70%
Articles published: 297
Review time: 1 month
About the journal: BMC Medical Informatics and Decision Making is an open access journal publishing original peer-reviewed research articles in relation to the design, development, implementation, use, and evaluation of health information technologies and decision-making for human health.