False hope of a single generalisable AI sepsis prediction model: bias and proposed mitigation strategies for improving performance based on a retrospective multisite cohort study.

Impact factor 5.6; Zone 1 (Medicine); JCR Q1 (Health Care Sciences & Services)
Rudolf Schnetler, Anton van der Vegt, Vikrant R Kalke, Paul Lane, Ian Scott
Journal: BMJ Quality & Safety. Published: 2025-03-26. Publication type: Journal Article. DOI: 10.1136/bmjqs-2024-018328 (https://doi.org/10.1136/bmjqs-2024-018328)

Abstract

Objective: To identify bias in using a single machine learning (ML) sepsis prediction model across multiple hospitals and care locations; evaluate the impact of six different bias mitigation strategies and propose a generic modelling approach for developing best-performing models.

Methods: We developed a baseline ML model to predict sepsis using retrospective data on patients in emergency departments (EDs) and wards across nine hospitals. We set model sensitivity at 70% and determined two measures: the number needed to evaluate (NNE, with 95% CI), that is, the number of alerts that must be evaluated for each case of true sepsis; and the number of hours between the first alert and timestamped outcomes meeting Sepsis-3 reference criteria (HTS3). Six bias mitigation models were compared with the baseline model for impact on NNE and HTS3.
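The two evaluation measures described in the Methods can be illustrated with a minimal Python sketch. It assumes NNE is computed as alerts divided by true-positive alerts (the reciprocal of positive predictive value, as NNE is commonly defined) and that the alert threshold is chosen as the highest score cut-off reaching the target sensitivity. The function names, the input layout and the thresholding rule are illustrative assumptions, not the authors' implementation.

```python
import math
from statistics import median


def threshold_for_sensitivity(scores, labels, target=0.70):
    """Return the highest score cut-off whose sensitivity is at least `target`.

    `scores` are model risk scores; `labels` are 1 for true sepsis, 0 otherwise.
    """
    positive_scores = sorted(
        (s for s, y in zip(scores, labels) if y == 1), reverse=True
    )
    if not positive_scores:
        raise ValueError("no positive cases to calibrate against")
    # Alerting at the k-th highest positive score captures at least k positives.
    # The small epsilon guards against floating-point error in target * n.
    k = max(1, math.ceil(target * len(positive_scores) - 1e-9))
    return positive_scores[k - 1]


def number_needed_to_evaluate(scores, labels, threshold):
    """Alerts raised per true sepsis case detected (assumed NNE = 1 / PPV)."""
    alert_labels = [y for s, y in zip(scores, labels) if s >= threshold]
    true_positives = sum(alert_labels)
    if true_positives == 0:
        return math.inf  # every alert was a false alarm
    return len(alert_labels) / true_positives


def median_hours_to_sepsis3(first_alert_times, sepsis3_times):
    """Median hours from first alert to the Sepsis-3 timestamp (HTS3).

    Both arguments are sequences of datetime objects, paired per patient.
    """
    hours = [
        (onset - alert).total_seconds() / 3600
        for alert, onset in zip(first_alert_times, sepsis3_times)
    ]
    return median(hours)
```

Fixing sensitivity first and then reading off NNE lets models be compared on alert burden at the same detection rate, which is the comparison the abstract reports.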

Results: Across 969 292 admissions, mean NNE for the baseline model was significantly lower for EDs (6.1 patients, 95% CI 6 to 6.2) than for wards (7.5 patients, 95% CI 7.4 to 7.5). Across all sites, median HTS3 was 20 hours (20-21) for wards vs 5 hours (5-5) for EDs. Bias mitigation models significantly impacted NNE but not HTS3. Compared with the baseline model, the best-performing models for NNE with reduced interhospital variance were those trained separately on data from ED patients or from ward patients across all sites. These models generated the lowest NNE results for all care locations in seven of nine hospitals.
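The comparison of NNE across hospitals and care locations can be sketched as a stratified calculation. The record layout `(hospital, care_location, score, has_sepsis)`, the function names and the use of population variance as the spread measure are all assumptions made for illustration; the abstract does not state how interhospital variance was computed.

```python
from collections import defaultdict
from statistics import pvariance


def nne_by_group(records, threshold):
    """Compute NNE for each (hospital, care_location) pair.

    `records` is an iterable of (hospital, care_location, score, has_sepsis).
    Groups with alerts but no true positives are omitted (NNE undefined).
    """
    alerts = defaultdict(int)
    true_positives = defaultdict(int)
    for hospital, location, score, has_sepsis in records:
        if score >= threshold:
            key = (hospital, location)
            alerts[key] += 1
            true_positives[key] += int(has_sepsis)
    return {
        key: alerts[key] / true_positives[key]
        for key in alerts
        if true_positives[key] > 0
    }


def interhospital_variance(nne_table, location):
    """Population variance of NNE across hospitals for one care location.

    Raises statistics.StatisticsError if no hospital has a defined NNE there.
    """
    values = [v for (_, loc), v in nne_table.items() if loc == location]
    return pvariance(values)
```

Running the same calculation for a pooled model and for location-specific models (ED-only, ward-only) would show whether the separate models lower both the per-site NNE and its spread across hospitals.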

Conclusions: Implementing a single sepsis prediction model across all sites and care locations within multihospital systems may be unacceptable given large variances in NNE across multiple sites. Bias mitigation methods can identify models demonstrating improved performance across most sites in reducing alert burden but with no impact on the length of the prediction window.

Source journal: BMJ Quality & Safety (Health Care Sciences & Services)
CiteScore: 9.80
Self-citation rate: 7.40%
Articles published: 104
Review time: 4-8 weeks
About the journal: BMJ Quality & Safety (previously Quality & Safety in Health Care) is an international peer-reviewed publication providing research, opinions, debates and reviews for academics, clinicians and healthcare managers focused on the quality and safety of health care and the science of improvement. The journal receives approximately 1000 manuscripts a year and has an acceptance rate for original research of 12%. Time from submission to first decision averages 22 days and accepted articles are typically published online within 20 days. Its current impact factor is 3.281.