Reference standard methodology in the clinical evaluation of AI chest X-ray algorithms for lung cancer detection: A systematic review

IF 3.3; Zone 3 (Medicine); JCR Q1: Radiology, Nuclear Medicine & Medical Imaging
Sean F. Duncan, Andrew C. Kidd, Jesus Perdomo Lampignano, Paul Cannon, Mark Hall, David B. Stobo, John D. Maclay, Kevin G. Blyth, David J. Lowe
European Journal of Radiology, vol. 192, Article 112409. Published 2025-09-06. DOI: 10.1016/j.ejrad.2025.112409
https://www.sciencedirect.com/science/article/pii/S0720048X25004954

Abstract

Background

Lung cancer remains the leading cause of cancer death worldwide, with early diagnosis linked to improved survival. Artificial intelligence (AI) holds promise for augmenting radiologists’ workflows in chest X-ray (CXR) interpretation, particularly for detecting thoracic malignancies. However, clinical implementation of this technology relies on robust and standardised reference standard methodology at the patient level.

Purpose

This systematic review aims to describe reference standard methodology in the clinical evaluation of CXR algorithms for lung cancer detection.

Materials and Methods

Searches targeted studies on AI CXR analysis across MEDLINE, Embase, CENTRAL, and trial registries. Two reviewers independently screened titles and abstracts, with disagreements resolved by a third reviewer. Studies lacking external validation in real-world cohorts were excluded. Bias was assessed using a modified QUADAS-2 tool, and data synthesis followed SWiM guidelines.

Results

A total of 1,679 papers were screened, of which 46 were included for full-paper review. Twenty-four different AI solutions were evaluated across a broad range of research questions. We identified significant heterogeneity in reference standard methodology, including variations in target abnormalities, reference standard modality, expert panel composition, and arbitration techniques. Of the reference standard parameters, 25% were inadequately reported. Of the included studies, 66% demonstrated a high risk of bias in at least one domain.

Discussion

To our knowledge, this is the first systematic description of patient-level reference standard methodology in CXR AI analysis of thoracic malignancy. To facilitate translational progress in this field, researchers undertaking evaluations of diagnostic algorithms at the patient level should ensure that reference standards are aligned with clinical workflows and adhere to reporting guidelines. Limitations include a lack of prospective studies.
Source journal
CiteScore: 6.70
Self-citation rate: 3.00%
Articles published: 398
Review time: 42 days
About the journal: European Journal of Radiology is an international journal that aims to communicate state-of-the-art information on imaging developments to its readers, in the form of high-quality original research articles and timely reviews of current developments in the field. Its audience includes clinicians at all levels of training, including radiology trainees, newly qualified imaging specialists, and experienced radiologists. Its aim is to inform efficient, appropriate, and evidence-based imaging practice to the benefit of patients worldwide.