Incorporating Machine Learning Driven Factors in the Design of Electronic-triggers to Detect Diagnostic Errors in the Emergency Department.

IF 1.7, Zone 3 (Medicine), JCR Q3 (Health Care Sciences & Services)

Journal of Patient Safety, Pub Date: 2025-09-24, DOI: 10.1097/PTS.0000000000001409

Moein Enayati, Mahsa Khalili, Shrinath Patel, Todd R Huschka, Daniel Cabrera, Sarah J Parker, Kalyan S Pasupathy, Prashant Mahajan, Fernanda Bellolio


Citations: 0

Abstract

Objectives: Electronic health record (EHR)-based triggers (eTriggers) have been used to study diagnostic errors in the emergency department (ED), often with suboptimal performance. Our objective was to investigate the incremental value of multi-factor machine learning (ML) approaches for improving eTrigger performance.

Methods: Patients presenting to an academic ED were categorized into trigger-positive and trigger-negative using standard trigger (T) definitions: (T1) ED return visits resulting in admission within 10 days; (T2) care escalation from the inpatient unit to the ICU within 24 hours; and (T3) deaths within 24 hours of admission. We trained and evaluated 6 supervised ML models.
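
The three trigger definitions above can be read as simple time-window rules. The following is a minimal sketch of that reading, not the authors' implementation: the `Encounter` fields, the `within` helper, and the choice of reference timestamps (ED arrival for T1, inpatient admission for T2 and T3) are all assumptions, since the abstract does not describe the EHR schema or state when each window starts.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Set


@dataclass
class Encounter:
    # Hypothetical fields; the abstract does not specify the underlying EHR schema.
    ed_arrival: datetime
    return_admission: Optional[datetime] = None     # admission following an ED return visit
    inpatient_admission: Optional[datetime] = None  # admission to an inpatient unit
    icu_transfer: Optional[datetime] = None         # escalation from inpatient unit to ICU
    death: Optional[datetime] = None


def within(start: Optional[datetime], end: Optional[datetime], window: timedelta) -> bool:
    """True when both timestamps exist and `end` falls in [start, start + window]."""
    if start is None or end is None:
        return False
    return timedelta(0) <= end - start <= window


def triggers(enc: Encounter) -> Set[str]:
    """Return the set of trigger labels (T1, T2, T3) that fire for one encounter."""
    hits = set()
    # T1: ED return visit resulting in admission within 10 days (assumed: of ED arrival).
    if within(enc.ed_arrival, enc.return_admission, timedelta(days=10)):
        hits.add("T1")
    # T2: escalation to the ICU within 24 hours (assumed: of inpatient admission).
    if within(enc.inpatient_admission, enc.icu_transfer, timedelta(hours=24)):
        hits.add("T2")
    # T3: death within 24 hours of admission.
    if within(enc.inpatient_admission, enc.death, timedelta(hours=24)):
        hits.add("T3")
    return hits
```

Under this reading, an encounter is trigger-positive when `triggers(enc)` is non-empty and trigger-negative otherwise.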

Results: A total of 124,053 consecutive encounters (5791 T-positive and 118,262 T-negative) were included. Among the T-positive encounters, 4159 (72%) were associated with T1, 1415 (24%) with T2, and 217 (4%) with T3. The T-based positive predictive values (PPV) were 5.2% for T1, 8.2% for T2, and 6.5% for T3. ML models trained on a balanced training dataset and evaluated on an imbalanced test set had low classification performance (accuracy: 0.72-0.95; PPV: 0.00-0.16; F1-score: 0.00-0.23). Performance was higher on balanced test sets (accuracy: 0.80-0.97; PPV: 0.82-1.00; F1-score: 0.79-0.97). Comparing models trained on clinically annotated data with models trained on T-based labels identified other important factors.
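
A general property of PPV is that it depends on the prevalence of the positive class in the evaluation set, which is one reason the same classifier can score very differently on balanced and imbalanced test sets. The sketch below illustrates that property with Bayes' rule and also rechecks the trigger counts reported above. The sensitivity, specificity, and prevalence values are illustrative assumptions, not figures from the study, and the abstract does not state whether prevalence accounts for the reported gap.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    total_pos = true_pos + false_pos
    return true_pos / total_pos if total_pos > 0 else 0.0


# Counts reported in the abstract.
t_positive = {"T1": 4159, "T2": 1415, "T3": 217}
t_negative = 118_262

total_positive = sum(t_positive.values())
total_encounters = total_positive + t_negative
# Share of each trigger among T-positive encounters, rounded to whole percent.
shares = {name: round(100 * n / total_positive) for name, n in t_positive.items()}

# Illustrative operating point (assumed, not from the study).
SENS, SPEC = 0.90, 0.90
ppv_balanced = ppv(SENS, SPEC, prevalence=0.50)    # balanced test set
ppv_imbalanced = ppv(SENS, SPEC, prevalence=0.05)  # rare positive class

if __name__ == "__main__":
    print("T-positive total:", total_positive)
    print("All encounters:", total_encounters)
    print("Trigger shares (%):", shares)
    print("PPV at 50% prevalence:", round(ppv_balanced, 3))
    print("PPV at 5% prevalence:", round(ppv_imbalanced, 3))
```

With the operating point held fixed, lowering the prevalence alone reduces PPV, so PPV measured on a balanced test set should not be read as the PPV expected in routine ED data.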

Conclusions: Using machine learning to refine eTriggers slightly improves the identification of diagnostic errors, as evidenced by an increase in PPV. We identified new potential factors contributing to ED diagnostic errors. These findings open new avenues for constructing or modifying eTriggers to identify diagnostic errors more accurately.


Source journal

Journal of Patient Safety (Health Care Sciences & Services)

CiteScore: 4.60
Self-citation rate: 13.60%
Articles published: 302

About the journal: Journal of Patient Safety (ISSN 1549-8417; online ISSN 1549-8425) is dedicated to presenting research advances and field applications in every area of patient safety. While Journal of Patient Safety has a research emphasis, it also publishes articles describing near-miss opportunities, system modifications that are barriers to error, and the impact of regulatory changes on healthcare delivery. This mix of research and real-world findings makes Journal of Patient Safety a valuable resource across the breadth of health professions and from bench to bedside.