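Evaluating Large Language Models for imaging modality selection: Potential to reduce unnecessary contrast agent use and radiation exposure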

IF 1.5, Zone 4 (Medicine), JCR Q3: RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING
Eren Çamur, Turay Cesur, Yasin Celal Güneş, Yusuf Öztürk, Yunus Şerefettin, Ersin Doğanözü, Ayşegül Akçebe, Ahmet Kürşad Güneş, İbrahim Ethem Cakcak
Clinical Imaging, volume 125, Article 110573. Published 2025-07-26. DOI: 10.1016/j.clinimag.2025.110573
https://www.sciencedirect.com/science/article/pii/S0899707125001731
Citations: 0

Abstract

Introduction

Large Language Models (LLMs) represent a transformative leap in artificial intelligence with the potential to revolutionize radiologic decision-making. This study evaluates the performance of LLMs from different vendors in selecting appropriate imaging modalities and compares their responses with those of clinicians from different specialties and radiologists with different levels of experience.

Methods

In a cross-sectional experimental design, 120 clinical scenarios derived from the American College of Radiology Appropriateness Criteria (ACR AC) and 120 “Multifaceted practice-oriented clinical scenarios” (covering the breast, cardiac, gastrointestinal, musculoskeletal, neuro, thoracic, genitourinary, and vascular sections) were assessed using three different prompts. The performance of four LLMs from different vendors was evaluated and compared with that of four clinicians (an emergency physician, a cardiologist, an internist, and a general surgeon) and four radiologists with different levels of experience. The LLMs' recommendations regarding contrast agent use and X-ray-containing imaging modalities were also evaluated. Responses were categorized according to the ACR AC. Short- and long-term reproducibility were assessed on the same clinical scenarios.

Results

All LLMs yielded identical modality recommendations across the three distinct prompts (κ = 1). In the ACR clinical scenarios, DeepSeek-R1 identified the appropriate imaging modality in 98.3% of cases, the highest accuracy among the models, although the inter-model differences were not statistically significant (p > 0.006). In the realistic scenarios, DeepSeek-R1 again led, matching board-certified junior radiologist performance and exceeding clinician and resident performance. Short-term reproducibility ranged from κ = 0.773 to 0.886, and long-term reproducibility from κ = 0.507 to 0.787.
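
The κ values above are agreement statistics. The abstract does not say which kappa variant was used, so the following is only a minimal sketch that assumes Cohen's kappa between two runs of the same model on the same scenarios. The function name `cohen_kappa` and the modality labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def cohen_kappa(run_a, run_b):
    """Cohen's kappa for two equal-length lists of categorical labels."""
    if len(run_a) != len(run_b) or not run_a:
        raise ValueError("runs must be non-empty and of equal length")
    n = len(run_a)
    # Observed agreement: fraction of scenarios with identical labels.
    p_o = sum(a == b for a, b in zip(run_a, run_b)) / n
    # Chance agreement, from each run's marginal label frequencies.
    freq_a, freq_b = Counter(run_a), Counter(run_b)
    p_e = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_e == 1.0:
        # Both runs used one identical label throughout.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical recommendations for six scenarios (not study data).
first_run = ["MRI", "CT", "US", "CT", "MRI", "XR"]
second_run = ["MRI", "CT", "US", "MRI", "MRI", "XR"]

kappa = cohen_kappa(first_run, second_run)
print(f"kappa = {kappa:.3f}")
```

Under this definition, κ = 1 means the two runs gave the same recommendation for every scenario, which is how the reported prompt-invariance result would read if Cohen's kappa was indeed the statistic used.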

Discussion

This study underscores the remarkable potential of LLMs for selecting appropriate imaging modalities in clinical scenarios across various radiology sections, and their value as supportive tools in clinical practice in this field.