AI-integrated Screening to Replace Double Reading of Mammograms: A Population-wide Accuracy and Feasibility Study
Mohammad T Elhakim, Sarah W Stougaard, Ole Graumann, Mads Nielsen, Oke Gerke, Lisbet B Larsen, Benjamin S B Rasmussen
Radiology: Artificial Intelligence, e230529. Journal Article. Published 2024-11-01.
DOI: https://doi.org/10.1148/ryai.230529
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11605135/pdf/
Impact factor: 8.1 (JCR Q1, Computer Science, Artificial Intelligence)
Citation count: 0
Abstract
Mammography screening supported by deep learning-based artificial intelligence (AI) solutions can potentially reduce workload without compromising breast cancer detection accuracy, but the site of deployment in the workflow might be crucial. This retrospective study compared three simulated AI-integrated screening scenarios with standard double reading with arbitration in a sample of 249 402 mammograms from a representative screening population. A commercial AI system replaced the first reader (scenario 1: integrated AIfirst), the second reader (scenario 2: integrated AIsecond), or both readers for triaging of low- and high-risk cases (scenario 3: integrated AItriage). AI threshold values were chosen based partly on previous validation and setting the screen-read volume reduction at approximately 50% across scenarios. Detection accuracy measures were calculated. Compared with standard double reading, integrated AIfirst showed no evidence of a difference in accuracy metrics except for a higher arbitration rate (+0.99%, P < .001). Integrated AIsecond had lower sensitivity (-1.58%, P < .001), negative predictive value (NPV) (-0.01%, P < .001), and recall rate (-0.06%, P = .04) but a higher positive predictive value (PPV) (+0.03%, P < .001) and arbitration rate (+1.22%, P < .001). Integrated AItriage achieved higher sensitivity (+1.33%, P < .001), PPV (+0.36%, P = .03), and NPV (+0.01%, P < .001) but lower arbitration rate (-0.88%, P < .001). Replacing one or both readers with AI seems feasible; however, the site of application in the workflow can have clinically relevant effects on accuracy and workload.
Keywords: Mammography, Breast, Neoplasms-Primary, Screening, Epidemiology, Diagnosis, Convolutional Neural Network (CNN)
Supplemental material is available for this article.
Published under a CC BY 4.0 license.
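The detection accuracy measures named in the abstract (sensitivity, PPV, NPV, recall rate) follow from a standard 2x2 table of screening outcomes. The Python sketch below illustrates how such measures, and their differences against a reference workflow, could be computed. It is not the authors' analysis code: the class and function names are hypothetical, the counts in the usage note are invented for illustration and are not study data, and it assumes the differences reported in the abstract are absolute (percentage-point) differences, which the abstract does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningCounts:
    """2x2 outcome counts for one screening workflow (hypothetical container)."""
    tp: int  # recalled, cancer confirmed
    fp: int  # recalled, no cancer
    fn: int  # not recalled, cancer confirmed later
    tn: int  # not recalled, no cancer


def accuracy_measures(c: ScreeningCounts) -> dict[str, float]:
    """Return standard detection accuracy measures as proportions (0 to 1)."""
    total = c.tp + c.fp + c.fn + c.tn
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        "recall_rate": (c.tp + c.fp) / total,
    }


def percentage_point_difference(
    scenario: dict[str, float], reference: dict[str, float]
) -> dict[str, float]:
    """Absolute difference (scenario minus reference), in percentage points."""
    return {name: 100.0 * (scenario[name] - reference[name]) for name in reference}
```

As a usage illustration with made-up counts, `accuracy_measures(ScreeningCounts(tp=80, fp=220, fn=20, tn=9680))` gives a sensitivity of 0.8 and a recall rate of 0.03. Comparing it against a scenario with `tp=82, fp=218, fn=18, tn=9682` through `percentage_point_difference` yields a sensitivity difference of about +2 percentage points and no change in recall rate.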
About the journal:
Radiology: Artificial Intelligence is a bi-monthly publication that focuses on the emerging applications of machine learning and artificial intelligence in the field of imaging across various disciplines. This journal is available online and accepts multiple manuscript types, including Original Research, Technical Developments, Data Resources, Review articles, Editorials, Letters to the Editor and Replies, Special Reports, and AI in Brief.