Development and validation of an interpretable machine learning model for retrospective identification of suspected infection for sepsis surveillance: a multicentre cohort study.

IF 10; Region 1 (Medicine); JCR Q1: MEDICINE, GENERAL & INTERNAL
EClinicalMedicine Pub Date : 2025-08-08 eCollection Date: 2025-09-01 DOI:10.1016/j.eclinm.2025.103401
Renée A M Tuinte, Luuk P J Smolenaers, Bram T Knoop, Konstantin Föhse, Tamar J van der Aart, Hjalmar R Bouma, Mihai G Netea, Katrijn Van Deun, Jaap Ten Oever, Jacobien J Hoogerwerf
EClinicalMedicine, volume 87, article 103401. Publication type: Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12355416/pdf/

Abstract

Background: How to identify suspected infection for sepsis surveillance purposes remains a well-recognised challenge. This study aimed to operationalise suspected infection for sepsis surveillance by developing an interpretable machine learning (ML) model for retrospective identification of patients with sepsis.

Methods: This multicentre cohort and machine learning study was conducted in two Dutch tertiary care hospitals. Adult patients with a quick Sequential Organ Failure Assessment (qSOFA) score ≥2 were included. Exclusion criteria were admission to the intensive care unit, transfer to or from another hospital, and patient refusal of data reuse. Cohort one consisted of patients admitted to the Emergency Department (ED) of hospital A between 01/01/2019 and 12/31/2019, to investigate community-onset sepsis. An external validation cohort of ED patients was obtained from hospital B between 01/01/2021 and 06/03/2022. Cohort two included hospitalised patients from hospital A between 01/01/2021 and 06/01/2022, to investigate hospital-onset sepsis. Objective data were extracted from electronic health records. Seven ML methods (gradient boosting, random forest, logistic regression, decision trees, support vector machines, K nearest neighbours and stochastic gradient descent) were trained to identify sepsis, with manual chart review as the reference standard. The F1 score (harmonic mean of precision and recall), sensitivity and specificity were used as evaluation metrics. The best-performing ML method was compared with other commonly used suspected infection proxies, including the Sepsis-3 definition, an adapted Adult Sepsis Event (ASE) definition and International Classification of Diseases (ICD) codes.
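
The evaluation metrics named in the Methods all follow from a 2×2 confusion matrix built against the chart-review reference standard. The following is a minimal Python sketch of those formulas. The function name `classification_metrics` and the counts passed to it are hypothetical and are not taken from the study.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Return sensitivity, specificity, precision and F1 from confusion-matrix counts."""
    if tp + fn == 0 or tn + fp == 0 or tp + fp == 0:
        raise ValueError("each metric needs a non-zero denominator")
    sensitivity = tp / (tp + fn)  # recall: share of true sepsis cases that were flagged
    specificity = tn / (tn + fp)  # share of non-sepsis cases correctly left unflagged
    precision = tp / (tp + fp)    # share of flagged cases that were true sepsis
    # F1 is the harmonic mean of precision and recall (sensitivity).
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts for illustration only; these are NOT data from the study.
metrics = classification_metrics(tp=80, fp=20, tn=150, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

F1 does not use the true-negative count, which is one reason to report it together with specificity, as the abstract does.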

Findings: In the ED cohort, 655 patients were included (male: 355 (54.2%), female: 300 (45.8%)), of whom 240 (36.6%) had sepsis. For community-onset sepsis, gradient boosting performed best, with an F1 score of 85.9%, a sensitivity of 91.1% (95% CI 83.4-95.4%) and a specificity of 89.0% (95% CI 83.4-92.8%). Most model features reflected either the inflammatory response (CRP, body temperature) or actions taken when an infection is suspected (antibiotic administration, microbial culture). In the external validation cohort, 185 patients were included (male: 94 (50.8%), female: 91 (49.2%)), of whom 54 (29.2%) had sepsis. External validation yielded an F1 score of 85.7%, a sensitivity of 87.5% (95% CI 75.3-94.1%) and a specificity of 92.5% (95% CI 85.9-96.2%). The gradient boosting model outperformed other commonly used proxies for suspected infection in terms of sensitivity, achieving 91.1% (95% CI 83.4-95.4%), compared with 78.9% (95% CI 69.4-86.0%) for Sepsis-3, 85.6% (95% CI 76.8-91.4%) for the adapted ASE and 33.3% (95% CI 24.5-43.6%) for ICD codes. In the hospitalised cohort, 493 patients were included (male: 265 (53.8%), female: 228 (46.2%)), of whom 129 (26.2%) had sepsis. For hospital-onset sepsis, logistic regression had the highest F1 score (52.2%), with a sensitivity of 58.1% (95% CI 40.6-75.5%) and a specificity of 82.9% (95% CI 76.0-89.8%).
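
The confidence intervals reported with each sensitivity and specificity are intervals for a binomial proportion. The abstract states neither the interval method nor the underlying counts, so the sketch below makes two assumptions: it uses the Wilson score interval as one common choice, and it uses 82 of 90 as a hypothetical count chosen only because it gives 91.1%. The function name `wilson_interval` is illustrative.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95%)."""
    if total <= 0 or not 0 <= successes <= total:
        raise ValueError("need 0 <= successes <= total and total > 0")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half_width, centre + half_width


# Assumed counts (not stated in the abstract): 82 detected out of 90 sepsis cases.
low, high = wilson_interval(82, 90)
print(f"sensitivity {82 / 90:.1%} (95% CI {low:.1%}-{high:.1%})")
```

Unlike the simple normal-approximation interval, the Wilson interval is not symmetric around the point estimate and stays within 0-100%, which matters for proportions close to the boundary.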

Interpretation: A gradient boosting algorithm can accurately classify ED patients meeting ≥2 qSOFA criteria as having suspected infection or not, and it outperforms common suspected infection definitions for sepsis surveillance. Including the inflammatory response in the suspected infection surveillance definition may enhance the accuracy and objectivity of sepsis surveillance. Future research is needed to validate the algorithm using other organ dysfunction criteria and in international settings.

Funding: None.


Source journal: EClinicalMedicine (Medicine: all)
CiteScore: 18.90
Self-citation rate: 1.30%
Articles published: 506
Review time: 22 days
Journal description: eClinicalMedicine is a gold open-access clinical journal designed to support frontline health professionals in addressing the complex and rapid health transitions affecting societies globally. The journal aims to assist practitioners in overcoming healthcare challenges across diverse communities, spanning diagnosis, treatment, prevention, and health promotion. Integrating disciplines from various specialties and life stages, it seeks to enhance health systems as fundamental institutions within societies. With a forward-thinking approach, eClinicalMedicine aims to redefine the future of healthcare.