Performance Degradation between Development and Deployment of a Predictive Model for Central Line-Associated Bloodstream Infections in Hospitalized Children.

Impact factor 2.2 | Zone 2, Medicine | JCR Q4, Medical Informatics

Applied Clinical Informatics Pub Date : 2025-08-01 Epub Date: 2025-05-12 DOI:10.1055/a-2605-1847

Jonathan M Beus, Mark Mai, Nikolay P Braykov, Swaminathan Kandaswamy, Edwin Ray, David B Cundiff, Paulette Djachechi, Sarah Thompson, Azade Tabaie, Ryan Birmingham, Rishi Kamaleswaran, Evan Orenstein

Pages: 1192-1199. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12473521/pdf/


Abstract

Background: Central line-associated bloodstream infections (CLABSIs) are associated with substantial pediatric morbidity and mortality. The capacity to predict which children with central lines are at greatest risk of CLABSI could inform surveillance and prevention efforts. Our team previously published in silico predictive models for CLABSI.

Objective: To prospectively implement a pediatric CLABSI predictive model and achieve adequate performance in offline validation for implementation in clinical practice.

Methods: The most performant predictive models were deep learning models requiring substantial pre-processing of many features into 8-hour windows, including the current day and up to 56 days prior for the current admission. To replicate this pre-processing, we created a novel infrastructure to (1) organize current-day data for all the relevant features and (2) create a staged historical data store for those same features, with application programming interfaces to connect the two. We compared the predictive performance of the resulting model scores for CLABSI in the next 48 hours against two labels: one based on manual review of positive blood cultures in children with central lines, and another based on a positive blood culture and receipt of at least 4 days of new IV antibiotics.

Results: The area under the receiver operating characteristic curve (AUROC) fell from 0.97 on retrospective data to <0.60 after prospective implementation, despite multiple iterations of troubleshooting. Primary root causes included train/serve skew, feature leakage, and overfitting. Hypothesized secondary drivers were complex model specification, poor data governance, inadequate testing, challenging feature translation between real-time and historical data models, limited monitoring and logging infrastructure for troubleshooting, and suboptimal handoff between the model development and deployment teams.

Conclusion: Bridging the gap from predictive model development to clinical deployment requires early and close coordination between data governance, data science, clinical informatics, and implementation engineers. Balancing predictive performance with implementation feasibility can accelerate the adoption of predictive clinical decision support systems.
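The pre-processing described in the abstract (features binned into 8-hour windows over up to 56 days of the current admission) and the train/serve skew named as a root cause can be illustrated with a small sketch. This is not the authors' implementation: the function names, the tuple layout of observations, the use of a per-window mean, and the reading of "56 days" as a lookback measured from the prediction time are all assumptions made for illustration. The idea shown is that one shared windowing function can be run on both the historical store and the real-time feed, and the two outputs compared feature by feature.

```python
from collections import defaultdict
from datetime import datetime, timedelta

WINDOW_HOURS = 8     # window width stated in the abstract
LOOKBACK_DAYS = 56   # maximum history stated in the abstract
N_WINDOWS = LOOKBACK_DAYS * 24 // WINDOW_HOURS  # 168 windows


def window_index(obs_time, prediction_time):
    """Number of whole 8-hour windows between obs_time and prediction_time.

    Index 0 is the most recent window. Returns None for observations in the
    future (using them would leak information) or older than the lookback.
    """
    age = prediction_time - obs_time
    if age < timedelta(0):
        return None
    idx = age // timedelta(hours=WINDOW_HOURS)  # timedelta // timedelta -> int
    return idx if idx < N_WINDOWS else None


def build_windows(observations, prediction_time, admission_time):
    """Aggregate (timestamp, feature_name, value) tuples into per-window means.

    Hypothetical layout: returns {feature_name: [mean or None] * N_WINDOWS}.
    Observations before the current admission are ignored.
    """
    sums = defaultdict(lambda: [0.0] * N_WINDOWS)
    counts = defaultdict(lambda: [0] * N_WINDOWS)
    for ts, name, value in observations:
        if ts < admission_time:
            continue
        idx = window_index(ts, prediction_time)
        if idx is None:
            continue
        sums[name][idx] += value
        counts[name][idx] += 1
    return {
        name: [
            sums[name][i] / counts[name][i] if counts[name][i] else None
            for i in range(N_WINDOWS)
        ]
        for name in sums
    }


def skewed_features(reference, candidate, tol=1e-9):
    """Names of features whose windowed values differ between two pipelines."""
    mismatched = []
    for name in sorted(set(reference) | set(candidate)):
        ref, cand = reference.get(name), candidate.get(name)
        if ref is None or cand is None:
            mismatched.append(name)  # feature missing from one pipeline
            continue
        for x, y in zip(ref, cand):
            one_missing = (x is None) != (y is None)
            if one_missing or (x is not None and abs(x - y) > tol):
                mismatched.append(name)
                break
    return mismatched


if __name__ == "__main__":
    now = datetime(2025, 1, 10, 12, 0)
    admitted = datetime(2025, 1, 1)
    historical = [
        (datetime(2025, 1, 10, 11, 0), "temp", 38.0),
        (datetime(2025, 1, 10, 1, 0), "temp", 36.5),
    ]
    # Simulated real-time feed that is missing one observation.
    realtime = historical[:1]
    ref = build_windows(historical, now, admitted)
    live = build_windows(realtime, now, admitted)
    print(skewed_features(ref, live))
```

In a sketch like this, a non-empty result from `skewed_features` for the same patient and prediction time would flag a feature whose real-time and historical representations have drifted apart, which is the kind of discrepancy worth catching before comparing AUROC across environments.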


Special Issue on CDS Failures


Source journal

Applied Clinical Informatics (Medical Informatics)

CiteScore: 4.60

Self-citation rate: 24.10%

Articles published: 132

About the journal: ACI is the third Schattauer journal dealing with biomedical and health informatics. It perfectly complements our other journals, Methods of Information in Medicine and the Yearbook of Medical Informatics. With the Yearbook of Medical Informatics being the "Milestone" or state-of-the-art journal and Methods of Information in Medicine being the "Science and Research" journal of IMIA, ACI intends to be the "Practical" journal of IMIA.