Radiologist Interaction with AI-Generated Preliminary Reports: A Longitudinal Multi-Reader Study.

Eun Kyoung Hong, Chong-Hyun Suh, Monika Nukala, Azadehsadat Esfahani, Andro Licaros, Rachna Madan, Andetta Hunsaker, Mark Hammer
{"title":"Radiologist Interaction with AI-Generated Preliminary Reports: A Longitudinal Multi-Reader Study.","authors":"Eun Kyoung Hong, Chong-Hyun Suh, Monika Nukala, Azadehsadat Esfahani, Andro Licaros, Rachna Madan, Andetta Hunsaker, Mark Hammer","doi":"10.1016/j.jacr.2025.09.015","DOIUrl":null,"url":null,"abstract":"<p><strong>Objectives: </strong>To investigate the integration of multimodal AI-generated reports into radiology workflow over time, focusing on their impact on efficiency, acceptability, and report quality.</p><p><strong>Methods: </strong>A multicase, multireader study involved 756 publicly available chest radiographs interpreted by five radiologists using preliminary reports generated by a radiology-specific multimodal AI model, divided into seven sequential batches of 108 radiographs each. Two thoracic radiologists assessed the final reports using RADPEER criteria for agreement and 5-point Likert scale for quality. Reading times, rate of acceptance without modification, agreement, and quality scores were measured, with statistical analyses evaluating trends across seven sequential batches.</p><p><strong>Results: </strong>Radiologists' reading times for chest radiographs decreased from 25.8 seconds in Batch 1 to 19.3 seconds in Batch 7 (p < .001). Acceptability increased from 54.6% to 60.2% (p < .001), with normal chest radiographs demonstrating high rates (68.9%) compared to abnormal chest radiographs (52.6%; p < .001). Median agreement and quality scores remained stable for normal chest radiographs but varied significantly for abnormal chest radiographs (ps < .05).</p><p><strong>Discussion: </strong>The introduction of AI-generated reports improved efficiency of chest radiograph interpretation, acceptability increased over time. However, agreement and quality scores showed variability, particularly in abnormal cases, emphasizing the need for oversight in the interpretation of complex chest radiographs.</p>","PeriodicalId":73968,"journal":{"name":"Journal of the American College of Radiology : JACR","volume":" ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2025-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of the American College of Radiology : JACR","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1016/j.jacr.2025.09.015","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Objectives: To investigate the integration of multimodal AI-generated reports into the radiology workflow over time, focusing on their impact on efficiency, acceptability, and report quality.

Methods: In a multicase, multireader study, five radiologists interpreted 756 publicly available chest radiographs, divided into seven sequential batches of 108 radiographs each, using preliminary reports generated by a radiology-specific multimodal AI model. Two thoracic radiologists assessed the final reports using RADPEER criteria for agreement and a 5-point Likert scale for quality. Reading times, rate of acceptance without modification, agreement, and quality scores were measured, with statistical analyses evaluating trends across the seven sequential batches.
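The abstract does not name the specific trend tests used. As a minimal illustrative sketch (not the authors' actual analysis), one might summarize per-batch reading times and check for a monotonic trend across the seven batches with a Kendall rank correlation; the `records` structure and the choice of test here are assumptions:

```python
# Hypothetical sketch: trend in reading times across seven sequential batches.
# The records structure and the use of Kendall's tau are assumptions; the
# abstract only states that trends were evaluated statistically.
from collections import defaultdict
from statistics import mean

from scipy.stats import kendalltau  # rank-based monotonic trend test

# (batch_number, reading_time_seconds) for each interpreted radiograph
records = [(1, 26.1), (1, 25.4), (2, 24.8), (3, 23.9),
           (4, 22.7), (5, 21.5), (6, 20.2), (7, 19.3)]

# Mean reading time per batch
per_batch = defaultdict(list)
for batch, seconds in records:
    per_batch[batch].append(seconds)
for batch in sorted(per_batch):
    print(f"Batch {batch}: mean {mean(per_batch[batch]):.1f} s")

# Monotonic trend of reading time against batch order
batches = [b for b, _ in records]
times = [t for _, t in records]
tau, p_value = kendalltau(batches, times)
print(f"Kendall tau = {tau:.2f}, p = {p_value:.3f}")
```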

Results: Radiologists' reading times for chest radiographs decreased from 25.8 seconds in Batch 1 to 19.3 seconds in Batch 7 (p < .001). Acceptability increased from 54.6% to 60.2% (p < .001), with normal chest radiographs showing a higher acceptance rate (68.9%) than abnormal chest radiographs (52.6%; p < .001). Median agreement and quality scores remained stable for normal chest radiographs but varied significantly for abnormal chest radiographs (all p < .05).
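The abstract likewise does not state which test underlies the normal-versus-abnormal acceptability comparison. A hedged sketch of one conventional choice, a chi-square test on the 2x2 acceptance table, is shown below; the counts are invented for illustration and only approximate the reported rates:

```python
# Hypothetical sketch: comparing acceptance-without-modification rates between
# normal and abnormal radiographs with a chi-square test on a 2x2 table.
# The counts are invented; only the ~68.9% vs ~52.6% rates echo the abstract,
# and the test actually used by the authors is not stated.
from scipy.stats import chi2_contingency

#                accepted  modified
table = [[206,  93],    # normal radiographs  (206/299 ~ 68.9% accepted)
         [240, 217]]    # abnormal radiographs (240/457 ~ 52.5% accepted)

chi2, p_value, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.1f}, dof = {dof}, p = {p_value:.4f}")
```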

Discussion: The introduction of AI-generated preliminary reports improved the efficiency of chest radiograph interpretation, and acceptability increased over time. However, agreement and quality scores showed variability, particularly for abnormal cases, underscoring the need for radiologist oversight in the interpretation of complex chest radiographs.
