Feasibility of Using an AI System for Breast Ultrasonography Interpretation According to Clinical Expertise: Results of a Pilot Study
Jeeyoun Kim, Kyungwha Han, Keum Won Kim, Won Hwa Kim, Jaeil Kim, Jung Hyun Yoon
Journal of the Korean Society of Radiology 87(2):314-327. Published 2026-03-01 (Epub 2026-03-17). DOI: https://doi.org/10.3348/jksr.2024.0144
Abstract
Purpose: To evaluate the benefits of using a commercially available AI system for breast ultrasonography (US) among readers with varying levels of expertise.
Materials and methods: A total of 285 breast lesions from 141 women who underwent breast US between February 2012 and April 2015 were retrospectively analyzed using a deep-learning-based AI system for lesion detection and diagnosis. Five readers, comprising experienced (two breast radiologists and one breast surgeon) and inexperienced (one gynecologist and one radiology resident) groups, reviewed the grayscale US images in two sessions: without AI assistance (session 1) and with AI assistance after a two-week washout period (session 2). Diagnostic performance was compared between sessions.
Results: The mean area under the curve (AUC) across all readers improved significantly with AI assistance, from 0.885 to 0.927 (p < 0.001). The inexperienced group showed significant improvements in mean sensitivity (from 56.9% to 87.5%, p < 0.001), negative predictive value (NPV) (from 77.9% to 90.1%, p < 0.001), and accuracy (from 76.1% to 84.4%, p = 0.005). No significant improvements were observed for the experienced readers (all p-values > 0.05).
Conclusion: The AI system for breast US significantly enhanced the diagnostic performance of inexperienced readers, augmenting sensitivity, NPV, and accuracy, while experienced readers demonstrated minimal improvement, likely due to their already high baseline performance.
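
For readers less familiar with the metrics reported above, the following is a minimal sketch of how sensitivity, NPV, and accuracy are computed from per-lesion counts against a reference standard. The `ConfusionCounts` container, the function names, and all counts are hypothetical and chosen only for illustration; they are not data from the study and do not reproduce its statistical analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Per-lesion reading outcomes versus the reference standard (hypothetical container)."""
    tp: int  # malignant lesions read as malignant
    fp: int  # benign lesions read as malignant
    tn: int  # benign lesions read as benign
    fn: int  # malignant lesions read as benign


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of malignant lesions that the reader correctly flagged."""
    return c.tp / (c.tp + c.fn)


def npv(c: ConfusionCounts) -> float:
    """Fraction of benign readings that were truly benign."""
    return c.tn / (c.tn + c.fn)


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all lesions that were read correctly."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


# Illustrative counts only; NOT taken from the study.
session_without_ai = ConfusionCounts(tp=30, fp=10, tn=50, fn=10)
session_with_ai = ConfusionCounts(tp=36, fp=10, tn=50, fn=4)

for label, counts in [("without AI", session_without_ai), ("with AI", session_with_ai)]:
    print(
        f"{label}: sensitivity={sensitivity(counts):.3f}, "
        f"NPV={npv(counts):.3f}, accuracy={accuracy(counts):.3f}"
    )
```

In this toy example, turning six false negatives into true positives raises sensitivity and NPV together, because both share the false-negative count in their denominators. This is the same pairing of metrics that improved for the inexperienced group.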