Development and Validation of a Sham-AI Model for Intracranial Aneurysm Detection at CT Angiography
Authors: Zhao Shi, Bin Hu, Mengjie Lu, Manting Zhang, Haiting Yang, Bo He, Jiyao Ma, Chunfeng Hu, Li Lu, Sheng Li, Shiyu Ren, Yonggao Zhang, Jun Li, Mayidili Nijiati, Jiake Dong, Hao Wang, Zhen Zhou, Fandong Zhang, Chengwei Pan, Yizhou Yu, Zijian Chen, Chang Sheng Zhou, Yongyue Wei, Junlin Zhou, Long Jiang Zhang
Journal: Radiology: Artificial Intelligence, article e240140 (impact factor 8.1; JCR Q1, Computer Science, Artificial Intelligence)
Publication date: 2025-05-01
Publication type: Journal Article
DOI: https://doi.org/10.1148/ryai.240140
Abstract
Purpose: To evaluate a sham-artificial intelligence (AI) model acting as a placebo control for a standard-AI model for diagnosis of intracranial aneurysm.
Materials and Methods: This retrospective crossover, blinded, multireader, multicase study was conducted from November 2022 to March 2023. A sham-AI model with near-zero sensitivity and similar specificity to a standard-AI model was developed using 16 422 CT angiography examinations. Digital subtraction angiography-verified CT angiographic examinations from four hospitals were collected, half of which were processed by standard AI and the others by sham AI to generate sequence A; sequence B was generated in the reverse order. Twenty-eight radiologists from seven hospitals were randomly assigned to either sequence and then assigned to the other sequence after a washout period. The diagnostic performances of radiologists alone, radiologists with standard-AI assistance, and radiologists with sham-AI assistance were compared using sensitivity and specificity, and radiologists' susceptibility to sham-AI suggestions was assessed.
Results: The testing dataset included 300 patients (median age, 61.0 years [IQR, 52.0-67.0]; 199 male), 50 of whom had aneurysms. Standard AI and sham AI performed as expected (sensitivity, 96.0% vs 0.0%; specificity, 82.0% vs 76.0%). The differences in sensitivity and specificity between standard AI-assisted and sham AI-assisted readings were 20.7% (95% CI: 15.8, 25.5 [superiority]) and 0.0% (95% CI: -2.0, 2.0 [noninferiority]), respectively. The difference between sham AI-assisted readings and radiologists alone was -2.6% (95% CI: -3.8, -1.4 [noninferiority]) for both sensitivity and specificity. After sham-AI suggestions, 5.3% (44 of 823) of true-positive and 1.2% (seven of 577) of false-negative results of radiologists alone were changed.
Conclusion: Radiologists' diagnostic performance was not compromised when aided by the proposed sham-AI model compared with their unassisted performance.
Keywords: CT Angiography, Vascular, Intracranial Aneurysm, Sham AI
Supplemental material is available for this article.
Published under a CC BY 4.0 license.
See also commentary by Mayfield and Romero in this issue.
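The arithmetic behind the reported results can be checked with a short sketch. It recomputes the two "changed after sham-AI suggestion" percentages from the counts given in the abstract and shows one common way to read a 95% CI for a difference as superiority or noninferiority. The function names and the 5-percentage-point margin are assumptions made for illustration; the abstract does not state the study's noninferiority margin or its exact decision rule.

```python
def percent(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage rounded to one decimal."""
    return round(100.0 * numerator / denominator, 1)


def interpret_ci(lower: float, upper: float, margin: float) -> str:
    """Classify a 95% CI for a difference (test minus reference), in percentage points.

    Conventional reading, assumed here rather than taken from the study:
      - lower bound above 0        -> superior
      - lower bound above -margin  -> noninferior
      - otherwise                  -> inconclusive
    """
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if lower > 0:
        return "superior"
    if lower > -margin:
        return "noninferior"
    return "inconclusive"


# Counts reported in the abstract.
changed_true_positive = percent(44, 823)   # 5.3
changed_false_negative = percent(7, 577)   # 1.2

# HYPOTHETICAL margin (percentage points); not given in the abstract.
ASSUMED_MARGIN = 5.0

# Confidence intervals reported in the abstract.
standard_vs_sham_sensitivity = interpret_ci(15.8, 25.5, ASSUMED_MARGIN)
standard_vs_sham_specificity = interpret_ci(-2.0, 2.0, ASSUMED_MARGIN)
sham_vs_unassisted = interpret_ci(-3.8, -1.4, ASSUMED_MARGIN)

print(changed_true_positive, changed_false_negative)
print(standard_vs_sham_sensitivity, standard_vs_sham_specificity, sham_vs_unassisted)
```

Under the assumed margin, the three classifications agree with the labels in the abstract; a stricter margin (for example, 3 percentage points) would leave the sham-AI versus unassisted comparison inconclusive, so the conclusion depends on the margin the authors prespecified.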
About the journal:
Radiology: Artificial Intelligence is a bi-monthly publication that focuses on the emerging applications of machine learning and artificial intelligence in the field of imaging across various disciplines. This journal is available online and accepts multiple manuscript types, including Original Research, Technical Developments, Data Resources, Review articles, Editorials, Letters to the Editor and Replies, Special Reports, and AI in Brief.