Value of Using a Generative AI Model in Chest Radiography Reporting: A Reader Study.

Impact Factor 12.1 · CAS Tier 1 (Medicine) · JCR Q1: Radiology, Nuclear Medicine & Medical Imaging
Radiology · Vol. 314, No. 3, e241646 · Published 2025-03-01 · DOI: 10.1148/radiol.241646
Eun Kyoung Hong, Byungseok Roh, Beomhee Park, Jae-Bock Jo, Woong Bae, Jai Soung Park, Dong-Wook Sung
{"title":"Value of Using a Generative AI Model in Chest Radiography Reporting: A Reader Study.","authors":"Eun Kyoung Hong, Byungseok Roh, Beomhee Park, Jae-Bock Jo, Woong Bae, Jai Soung Park, Dong-Wook Sung","doi":"10.1148/radiol.241646","DOIUrl":null,"url":null,"abstract":"<p><p>Background Multimodal generative artificial intelligence (AI) technologies can produce preliminary radiology reports, and validation with reader studies is crucial for understanding the clinical value of these technologies. Purpose To assess the clinical value of the use of a domain-specific multimodal generative AI tool for chest radiograph interpretation by means of a reader study. Materials and Methods A retrospective, sequential, multireader, multicase reader study was conducted using 758 chest radiographs from a publicly available dataset from 2009 to 2017. Five radiologists interpreted the chest radiographs in two sessions: without AI-generated reports and with AI-generated reports as preliminary reports. Reading times, reporting agreement (RADPEER), and quality scores (five-point scale) were evaluated by two experienced thoracic radiologists and compared between the first and second sessions from October to December 2023. Reading times, report agreement, and quality scores were analyzed using a generalized linear mixed model. Additionally, a subset of 258 chest radiographs was used to assess the factual correctness of the reports, and sensitivities and specificities were compared between the reports from the first and second sessions with use of the McNemar test. Results The introduction of AI-generated reports significantly reduced average reading times from 34.2 seconds ± 20.4 to 19.8 seconds ± 12.5 (<i>P</i> < .001). Report agreement scores shifted from a median of 5.0 (IQR, 4.0-5.0) without AI reports to 5.0 (IQR, 4.5-5.0) with AI reports (<i>P</i> < .001). Report quality scores changed from 4.5 (IQR, 4.0-5.0) without AI reports to 4.5 (IQR, 4.5-5.0) with AI reports (<i>P</i> < .001). From the subset analysis of factual correctness, the sensitivity for detecting various abnormalities increased significantly, including widened mediastinal silhouettes (84.3% to 90.8%; <i>P</i> < .001) and pleural lesions (77.7% to 87.4%; <i>P</i> < .001). While the overall diagnostic performance improved, variability among individual radiologists was noted. Conclusion The use of a domain-specific multimodal generative AI model increased the efficiency and quality of radiology report generation. © RSNA, 2025 <i>Supplemental material is available for this article.</i> See also the editorial by Babyn and Adams in this issue.</p>","PeriodicalId":20896,"journal":{"name":"Radiology","volume":"314 3","pages":"e241646"},"PeriodicalIF":12.1000,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Radiology","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1148/radiol.241646","RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING","Score":null,"Total":0}
Citations: 0

Abstract

Background: Multimodal generative artificial intelligence (AI) technologies can produce preliminary radiology reports, and validation with reader studies is crucial for understanding the clinical value of these technologies.

Purpose: To assess the clinical value of the use of a domain-specific multimodal generative AI tool for chest radiograph interpretation by means of a reader study.

Materials and Methods: A retrospective, sequential, multireader, multicase reader study was conducted using 758 chest radiographs from a publicly available dataset from 2009 to 2017. Five radiologists interpreted the chest radiographs in two sessions: without AI-generated reports and with AI-generated reports as preliminary reports. Reading times, reporting agreement (RADPEER), and quality scores (five-point scale) were evaluated by two experienced thoracic radiologists and compared between the first and second sessions from October to December 2023. Reading times, report agreement, and quality scores were analyzed using a generalized linear mixed model. Additionally, a subset of 258 chest radiographs was used to assess the factual correctness of the reports, and sensitivities and specificities were compared between the reports from the first and second sessions with use of the McNemar test.

Results: The introduction of AI-generated reports significantly reduced average reading times from 34.2 seconds ± 20.4 to 19.8 seconds ± 12.5 (P < .001). Report agreement scores shifted from a median of 5.0 (IQR, 4.0-5.0) without AI reports to 5.0 (IQR, 4.5-5.0) with AI reports (P < .001). Report quality scores changed from 4.5 (IQR, 4.0-5.0) without AI reports to 4.5 (IQR, 4.5-5.0) with AI reports (P < .001). From the subset analysis of factual correctness, the sensitivity for detecting various abnormalities increased significantly, including widened mediastinal silhouettes (84.3% to 90.8%; P < .001) and pleural lesions (77.7% to 87.4%; P < .001). While the overall diagnostic performance improved, variability among individual radiologists was noted.

Conclusion: The use of a domain-specific multimodal generative AI model increased the efficiency and quality of radiology report generation.

© RSNA, 2025. Supplemental material is available for this article. See also the editorial by Babyn and Adams in this issue.
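For readers curious how the paired comparison described in Materials and Methods might look in practice (comparing per-session detection sensitivity with the McNemar test), the sketch below is a minimal illustration in Python with statsmodels. It is not the authors' analysis code; the data layout, simulated detection rates, and variable names are assumptions for demonstration only.

```python
# Illustrative sketch of a paired McNemar comparison of sensitivities,
# loosely modeled on the abstract's subset analysis (258 radiographs,
# pleural-lesion sensitivity ~77.7% without AI vs ~87.4% with AI).
# All values below are simulated, not the study data.
import numpy as np
from statsmodels.stats.contingency_tables import mcnemar

rng = np.random.default_rng(0)
n = 258  # subset size; assume every case truly contains the finding

# Binary detection outcomes per case, for the two reading sessions.
without_ai = rng.random(n) < 0.777  # session 1: without AI-generated reports
with_ai = rng.random(n) < 0.874     # session 2: with AI-generated reports

# 2x2 table of paired outcomes:
# rows = without AI (detected / missed), columns = with AI (detected / missed).
table = np.array([
    [np.sum(without_ai & with_ai),  np.sum(without_ai & ~with_ai)],
    [np.sum(~without_ai & with_ai), np.sum(~without_ai & ~with_ai)],
])

result = mcnemar(table, exact=True)  # exact binomial McNemar test
print(f"Sensitivity without AI: {without_ai.mean():.3f}")
print(f"Sensitivity with AI:    {with_ai.mean():.3f}")
print(f"McNemar p-value:        {result.pvalue:.4f}")
```

A generalized linear mixed model (as used for reading times, agreement, and quality scores) would additionally model reader and case as random effects, which the simple paired test above does not capture.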

Source journal: Radiology (Medicine – Nuclear Medicine)
CiteScore: 35.20
Self-citation rate: 3.00%
Publication volume: 596 articles
Review turnaround: 3.6 months

Journal description: Published regularly since 1923 by the Radiological Society of North America (RSNA), Radiology has long been recognized as the authoritative reference for the most current, clinically relevant, and highest-quality research in the field of radiology. Each month the journal publishes approximately 240 pages of peer-reviewed original research, authoritative reviews, well-balanced commentary on significant articles, and expert opinion on new techniques and technologies. Radiology publishes cutting-edge and impactful imaging research articles in radiology and medical imaging in order to help improve human health.