Sean F. Duncan , Andrew C. Kidd , Jesus Perdomo Lampignano , Paul Cannon , Mark Hall , David B. Stobo , John D. Maclay , Kevin G. Blyth , David J. Lowe
European Journal of Radiology, volume 192, Article 112409. Published 2025-09-06. DOI: 10.1016/j.ejrad.2025.112409. https://www.sciencedirect.com/science/article/pii/S0720048X25004954
Reference standard methodology in the clinical evaluation of AI chest X-ray algorithms for lung cancer detection: A systematic review
Background
Lung cancer remains the leading cause of cancer death worldwide, with early diagnosis linked to improved survival. Artificial intelligence (AI) holds promise for augmenting radiologists’ workflows in chest X-ray (CXR) interpretation, particularly for detecting thoracic malignancies. However, clinical implementation of this technology relies on robust and standardised reference standard methodology at the patient level.
Purpose
This systematic review aims to describe reference standard methodology in the clinical evaluation of CXR algorithms for lung cancer detection.
Materials and Methods
Searches targeted studies on AI CXR analysis across MEDLINE, Embase, CENTRAL, and trial registries. Two reviewers independently screened titles and abstracts, with disagreements resolved by a third reviewer. Studies lacking external validation in real-world cohorts were excluded. Bias was assessed using a modified QUADAS-2 tool, and data synthesis followed the Synthesis Without Meta-analysis (SWiM) guidelines.
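The screening rule described in the methods can be illustrated with a minimal Python sketch. The function name `screen_record`, the boolean votes, and the sample records are hypothetical and are not taken from the review. The sketch assumes only that agreement between the two reviewers is final and that the third reviewer's decision settles any disagreement.

```python
from typing import Optional


def screen_record(reviewer_a: bool, reviewer_b: bool,
                  arbiter: Optional[bool] = None) -> bool:
    """Return the include (True) / exclude (False) decision for one record.

    Hypothetical helper: two independent votes, with a third reviewer
    consulted only when the first two disagree.
    """
    if reviewer_a == reviewer_b:
        # Both reviewers agree, so no arbitration is needed.
        return reviewer_a
    if arbiter is None:
        raise ValueError("reviewers disagree; a third-reviewer decision is required")
    # The third reviewer's decision resolves the disagreement.
    return arbiter


# Illustrative votes only (reviewer A, reviewer B, third reviewer if consulted).
votes = [
    (True, True, None),
    (True, False, False),
    (False, True, True),
    (False, False, None),
]
decisions = [screen_record(a, b, c) for a, b, c in votes]
print(decisions)
```

Raising an error when the reviewers disagree and no third decision is supplied keeps unresolved records from being silently included or excluded.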
Results
A total of 1,679 papers were screened, with 46 papers included for full-paper review. Twenty-four different AI solutions were evaluated across a broad range of research questions. We identified significant heterogeneity in reference standard methodology, including variations in target abnormalities, reference standard modality, expert panel composition, and arbitration techniques. Of the reference standard parameters, 25% were inadequately reported. Of the included studies, 66% demonstrated a high risk of bias in at least one domain.
Discussion
To our knowledge, this is the first systematic description of patient-level reference standard methodology in CXR AI analysis of thoracic malignancy. To facilitate translational progress in this field, researchers undertaking evaluations of diagnostic algorithms at the patient level should ensure that reference standards are aligned with clinical workflows and adhere to reporting guidelines. Limitations include a lack of prospective studies.
About the journal:
European Journal of Radiology is an international journal that aims to communicate to its readers state-of-the-art information on imaging developments, in the form of high-quality original research articles and timely reviews of current developments in the field.
Its audience includes clinicians at all levels of training, including radiology trainees, newly qualified imaging specialists, and experienced radiologists. Its aim is to inform efficient, appropriate, and evidence-based imaging practice to the benefit of patients worldwide.