Threshold optimization in AI chest radiography analysis: integrating real-world data and clinical subgroups.

Impact factor: 3.6. JCR quartile: Q1 (Radiology, Nuclear Medicine & Medical Imaging).

European Radiology Experimental, 9(1):95. Publication date: 2025-09-22. DOI: 10.1186/s41747-025-00632-8

Jan Rudolph, Christian Huemmer, Alexander Preuhs, Giulia Buizza, Julien Dinkel, Vanessa Koliogiannis, Nicola Fink, Sophia Samira Goller, Vincent Schwarze, Maurice Heimer, Boj Friedrich Hoppe, Thomas Liebig, Jens Ricke, Bastian Oliver Sabel, Johannes Rueckel



Abstract

Background: Manufacturer-defined AI thresholds for chest x-ray (CXR) often lack customization options. Threshold optimization strategies utilizing users' clinical real-world data along with pathology-enriched validation data may better address subgroup-specific and user-specific needs.

Materials and methods: A pathology-enriched dataset (study cohort, 563 CXRs) with pleural effusions, consolidations, pneumothoraces, nodules, and unremarkable findings was analysed by an AI system and six reference radiologists. The same AI model was applied to a routine dataset (clinical cohort, 15,786 consecutive routine CXRs). Iterative receiver operating characteristic analysis linked achievable sensitivities (study cohort) to the resulting AI alert rates in clinical routine inpatient or outpatient subgroups. "Optimized" thresholds (OTs) were defined as the point at which a 1% sensitivity increase leads to more than a 1% rise in AI alert rates. Threshold comparisons (OTs versus the AI vendor's default thresholds (AIDTs) versus Youden's thresholds) were based on 400 clinical cohort cases with expert radiologists' reference.
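The stopping rule behind the optimized thresholds can be illustrated with a minimal Python sketch. It is not the authors' pipeline; it is one possible reading of the rule, under stated assumptions: an alert is raised when the AI score is at or above the cut-off, sensitivity is estimated from the pathology-positive study cases only, the "1% rise" is read as one percentage point, and the sweep starts at 1% sensitivity. All function and parameter names (`threshold_for_sensitivity`, `alert_rate`, `optimized_threshold`, `start_pct`, `max_rise`) are hypothetical.

```python
from typing import Sequence, Tuple


def threshold_for_sensitivity(pct: int, positive_scores: Sequence[float]) -> float:
    """Highest cut-off that flags at least pct % of the positive study cases.

    Assumption: a case is flagged when its AI score is >= the cut-off.
    """
    ranked = sorted(positive_scores, reverse=True)
    needed = max(1, -(-pct * len(ranked) // 100))  # integer ceiling of pct % of n
    return ranked[needed - 1]


def alert_rate(threshold: float, routine_scores: Sequence[float]) -> float:
    """Fraction of routine CXRs whose AI score would trigger an alert."""
    return sum(s >= threshold for s in routine_scores) / len(routine_scores)


def optimized_threshold(
    positive_scores: Sequence[float],
    routine_scores: Sequence[float],
    start_pct: int = 1,
    max_rise: float = 0.01,
) -> Tuple[float, int]:
    """Raise the target sensitivity in 1-point steps and stop before the first
    step whose alert-rate rise exceeds max_rise (assumed reading of the rule).

    Returns the selected cut-off and the sensitivity (in %) it targets.
    """
    reached = start_pct
    current = threshold_for_sensitivity(reached, positive_scores)
    for pct in range(start_pct + 1, 101):
        candidate = threshold_for_sensitivity(pct, positive_scores)
        rise = alert_rate(candidate, routine_scores) - alert_rate(current, routine_scores)
        if rise > max_rise:
            break  # the next sensitivity point would cost too many extra alerts
        current, reached = candidate, pct
    return current, reached
```

Under these assumptions, subgroup-specific thresholds would follow from calling `optimized_threshold` once with the inpatient routine scores and once with the outpatient routine scores, while keeping the same study-cohort positives.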

Results: AIDTs, OTs, and Youden's thresholds varied across scenarios, and OTs differed depending on whether they were tailored to inpatient or outpatient CXRs. Lowering the threshold below the AIDT was most reasonable for pleural effusion, where sensitivity increased from 46.8% (AIDT) to 87.2% (OT) for outpatients and from 76.3% (AIDT) to 93.5% (OT) for inpatients; similar trends appeared for consolidations. Conversely, for inpatient nodule detection, raising the threshold improved accuracy from 69.5% (AIDT) to 82.5% (OT) without compromising sensitivity. Graphical analysis supports threshold selection by illustrating estimated sensitivities and clinical routine AI alert rates.
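For comparison, Youden's threshold is the cut-off that maximizes Youden's index J = sensitivity + specificity - 1. The short Python sketch below shows that standard definition on labelled scores. The function name `youden_threshold`, the "score >= cut-off" alert convention, and the tie-breaking choice (the lowest maximizing cut-off wins) are assumptions made for illustration and are not taken from the study.

```python
from typing import Sequence, Tuple


def youden_threshold(scores: Sequence[float], labels: Sequence[int]) -> Tuple[float, float]:
    """Return (cut-off, J) maximizing J = sensitivity + specificity - 1.

    labels: 1 = finding present, 0 = finding absent.
    Assumption: a case is flagged when its score is >= the cut-off.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):  # every observed score is a candidate
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:  # strict comparison keeps the lowest cut-off on ties
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j
```

Unlike the alert-rate-based rule, this criterion weighs sensitivity and specificity equally and takes no account of how many alerts the threshold produces in a given routine subgroup.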

Conclusion: An innovative, subgroup-specific AI threshold optimization is proposed that is implemented automatically and is transferable to other AI algorithms and varying clinical subgroup settings.

Relevance statement: Customizing thresholds to specific medical experts' needs and patient subgroup characteristics is promising and may enhance diagnostic accuracy and the clinical acceptance of diagnostic AI algorithms.

Key points: Customizing AI thresholds individually addresses specific user/patient subgroup needs. The presented approach utilizes pathology-enriched and real-world subgroup data for optimization. Potential is shown by comparing individualized thresholds with vendor defaults. Distinct thresholds for in- and outpatient CXR AI analysis may improve perception. The automated pipeline methodology is transferable to other AI models or subgroups.


