Development of a machine learning-based prediction model for serious bacterial infections in febrile young infants.

IF 2.3, Zone 4 (Medicine), JCR Q2 PEDIATRICS

BMJ Paediatrics Open, Pub Date: 2025-07-30, DOI: 10.1136/bmjpo-2025-003548

Jun Sung Park, Reenar Yoo, Soo-Young Lim, Dahyun Kim, Min Kyo Chun, Jeeho Han, Jeong-Yong Lee, Seung Jun Choi, Seak Hee Oh, Jong Seung Lee, Jina Lee


Citations: 0

Abstract

Background: To develop and validate machine learning (ML)-based models to predict serious bacterial infections (SBIs) in febrile infants aged ≤90 days.

Methods: This retrospective study analysed data from febrile infants (≥38.0℃) aged ≤90 days. The development dataset comprised data from patients who visited the Seoul Asan Medical Center between 2015 and 2021, whereas the validation dataset included data from those who visited the centre from January 2022 to August 2023. Logistic regression (LR) and eXtreme Gradient Boosting (XGB) were used to develop the models for predicting SBIs, which were then compared with traditional rule-based models.
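The development/validation division described in the Methods is a temporal split: earlier visits train the model, later visits test it. A minimal sketch of that idea follows. The record layout, the `visit_date` key, and the exact day boundaries are assumptions for illustration; the abstract gives only years and months, and this is not the authors' code.

```python
from datetime import date

# Windows follow the Methods; the exact first/last days are assumptions.
DEV_START, DEV_END = date(2015, 1, 1), date(2021, 12, 31)
VAL_START, VAL_END = date(2022, 1, 1), date(2023, 8, 31)


def temporal_split(records):
    """Split visit records into development and validation sets by visit date.

    Each record is a dict with a hypothetical "visit_date" key (datetime.date).
    Records falling outside both windows are dropped.
    """
    development, validation = [], []
    for record in records:
        visit = record["visit_date"]
        if DEV_START <= visit <= DEV_END:
            development.append(record)
        elif VAL_START <= visit <= VAL_END:
            validation.append(record)
    return development, validation


# Made-up example records, not study data.
example = [
    {"id": 1, "visit_date": date(2016, 3, 14), "sbi": False},
    {"id": 2, "visit_date": date(2021, 12, 31), "sbi": True},
    {"id": 3, "visit_date": date(2022, 1, 1), "sbi": False},
    {"id": 4, "visit_date": date(2023, 9, 1), "sbi": True},
]
dev, val = temporal_split(example)
print([r["id"] for r in dev], [r["id"] for r in val])
```

Splitting by calendar time rather than at random keeps every validation visit later than every development visit, so the validation result reflects performance on patients seen after the model was built.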

Results: The study included data from 2860 patients: 2288 (80%) in the development dataset and 572 (20%) in the validation dataset. SBIs were confirmed in 482 patients (21.0%) in the development dataset and 131 (22.9%) in the validation dataset. The XGB and LR models showed excellent performance with areas under the curve of 0.990 and 0.981 in development, and 0.989 and 0.985 in validation datasets. In validation, both models demonstrated superior specificity (82.3-87.0% vs 46.2-72.2%) and positive predictive value (61.5-68.5% vs 34.4-49.8%) compared with traditional rule-based models, while maintaining perfect sensitivity and negative predictive value (both 100% vs 81.7-100% and 92.0-100%, respectively) without any false negatives. Urinalysis, C-reactive protein and procalcitonin were identified as top-tier features in the XGB model.
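The four measures reported in the Results all derive from the counts of a 2×2 confusion matrix. The sketch below shows the standard definitions. The function name and the example counts are hypothetical and are not the study's data; it assumes every denominator is non-zero.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return standard diagnostic accuracy measures from confusion-matrix counts.

    tp/fn: infants with SBI flagged / missed by the model.
    tn/fp: infants without SBI cleared / flagged by the model.
    Assumes every denominator is non-zero.
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true SBIs detected
        "specificity": tn / (tn + fp),  # share of non-SBIs correctly cleared
        "ppv": tp / (tp + fp),          # probability of SBI given a positive call
        "npv": tn / (tn + fn),          # probability of no SBI given a negative call
    }


# Hypothetical counts chosen for easy arithmetic, not taken from the paper.
metrics = diagnostic_metrics(tp=50, fp=25, tn=125, fn=0)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With zero false negatives, sensitivity and negative predictive value both equal 1 by definition, which is why the two are reported together as 100%. Specificity and positive predictive value then depend only on the number of false positives.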

Conclusions: The ML-based prediction model demonstrated robust performance, with superior specificity and perfect sensitivity, which may enhance the accuracy of SBI detection and reduce the costs associated with false positives.

Source journal

BMJ Paediatrics Open (Medicine: Pediatrics, Perinatology and Child Health)

CiteScore: 4.10

Self-citation rate: 3.80%

Articles published: 124