Aaditeya Jhaveri, Farbod Abolhassani, Benjamin Fine
Title: Prospective External Validation of an AI-Based Emergency Department Pneumonia Disposition Prediction Tool
Journal: Canadian Association of Radiologists Journal
Published: 2025-02-26
DOI: 10.1177/08465371251320938 (https://doi.org/10.1177/08465371251320938)
Abstract
Purpose: This shadow deployment evaluated an externally developed AI tool to predict disposition using chest X-rays (CXR) in patients with community-acquired pneumonia (CAP) in the Emergency Department (ED). Retrospective and prospective external validations were conducted to assess differences between the two evaluations and across subgroups to inform deployment decisions.

Methods: The CNN was retrospectively validated (n = 17 689) from November 1, 2020, to June 30, 2021, and prospectively validated on "suspected-CAP" patients (n = 3062) from January 1 to January 31, 2023. Calibration and standard metrics, including AUC, accuracy, sensitivity, specificity, PPV, and NPV, were calculated. Subgroup analyses were conducted for age, sex, modality, and CXR projection (PA vs AP).

Results: The model's AUC was 67% in both validations. The prospective evaluation showed a non-significant increase in sensitivity (65% vs 59%) and PPV (64% vs 63%), while specificity (68% vs 73%) and NPV (69% vs 70%) slightly decreased. NPV was very high for younger patients in the prospective evaluation (95%); PPV was moderately high for older patients (81%). Sensitivity dropped significantly in females under 31 years (50%), and specificity was reduced in females over 86 years (38%).

Conclusion: This study showed moderate, consistent performance in both retrospective and prospective validations. While this consistency is encouraging, further direct comparisons are needed to determine whether both validation approaches are necessary in different clinical settings. Subgroup analysis suggests the tool may be helpful to accelerate discharge in younger patients (high NPV) and possibly for admission in older patients (high PPV).
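For readers less familiar with the standard metrics listed in the Methods, the following is a minimal Python sketch of how they are conventionally defined from a 2x2 confusion matrix, plus a rank-based AUC. It is not the authors' evaluation code. The names `ConfusionCounts`, `disposition_metrics`, and `rank_auc` are hypothetical, "positive" is assumed here to mean a predicted or observed admission, and the counts and scores are made-up illustrative values, not data from the study.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 confusion matrix; 'positive' is assumed to mean admission."""
    tp: int  # predicted admit, actually admitted
    fp: int  # predicted admit, actually discharged
    fn: int  # predicted discharge, actually admitted
    tn: int  # predicted discharge, actually discharged


def disposition_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Return the standard threshold-based metrics as fractions in [0, 1]."""
    total = c.tp + c.fp + c.fn + c.tn
    return {
        "accuracy": (c.tp + c.tn) / total,
        "sensitivity": c.tp / (c.tp + c.fn),  # share of admissions flagged
        "specificity": c.tn / (c.tn + c.fp),  # share of discharges cleared
        "ppv": c.tp / (c.tp + c.fp),          # P(admitted | predicted admit)
        "npv": c.tn / (c.tn + c.fn),          # P(discharged | predicted discharge)
    }


def rank_auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case scores higher
    than a random negative case, counting ties as one half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Illustrative counts only (NOT taken from the study).
example = ConfusionCounts(tp=40, fp=10, fn=20, tn=30)
metrics = disposition_metrics(example)

# Illustrative model scores only (NOT taken from the study).
example_auc = rank_auc([0.9, 0.7, 0.4], [0.8, 0.3, 0.4])
```

PPV and NPV, unlike sensitivity and specificity, depend on how common admission is in the group being evaluated, which is one reason subgroup values can differ markedly from the overall figures.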
About the journal:
The Canadian Association of Radiologists Journal is a peer-reviewed, Medline-indexed publication that presents a broad scientific review of radiology in Canada. The Journal covers such topics as abdominal imaging, cardiovascular radiology, computed tomography, continuing professional development, education and training, gastrointestinal radiology, health policy and practice, magnetic resonance imaging, musculoskeletal radiology, neuroradiology, nuclear medicine, pediatric radiology, radiology history, radiology practice guidelines and advisories, thoracic and cardiac imaging, trauma and emergency room imaging, ultrasonography, and vascular and interventional radiology. Article types considered for publication include original research articles, critically appraised topics, review articles, guest editorials, pictorial essays, technical notes, and letters to the Editor.