Artificial intelligence in radiology: diagnostic sensitivity of ChatGPT for detecting hemorrhages in cranial computed tomography scans.

IF 1.7, CAS Zone 4 (Medicine), JCR Q3, Radiology, Nuclear Medicine & Medical Imaging
Olga Bayar-Kapıcı, Erman Altunışık, Feyza Musabeyoğlu, Şeyda Dev, Ömer Kaya
Journal: Diagnostic and Interventional Radiology. Publication date: 2025-07-21. Publication type: Journal Article. DOI: 10.4274/dir.2025.253456 (https://doi.org/10.4274/dir.2025.253456)
Citation count: 0

Abstract

Purpose: Chat Generative Pre-trained Transformer (ChatGPT)-4V, a large language model developed by OpenAI, has been explored for its potential application in radiology. This study assesses ChatGPT-4V's diagnostic performance in identifying various types of intracranial hemorrhages in non-contrast cranial computed tomography (CT) images.

Methods: Intracranial hemorrhages were presented to ChatGPT using the clearest 2D imaging slices. The first question, "Q1: Which imaging technique is used in this image?", was asked to determine the imaging modality. ChatGPT was then prompted with the second question, "Q2: What do you see in this image and what is the final diagnosis?", to assess whether the CT scan was normal or showed pathology. For CT scans containing hemorrhage that ChatGPT did not interpret correctly, a follow-up question ("Q3: There is bleeding in this image. Which type of bleeding do you see?") was used to evaluate whether this guidance influenced its response.

Results: ChatGPT accurately identified the imaging technique (Q1) in all cases but demonstrated difficulty diagnosing epidural hematoma (EDH), subdural hematoma (SDH), and subarachnoid hemorrhage (SAH) when no clues were provided (Q2). When a hemorrhage clue was introduced (Q3), ChatGPT correctly identified EDH in 16.7% of cases, SDH in 60%, and SAH in 15.6%, and achieved 100% diagnostic accuracy for hemorrhagic cerebrovascular disease. Its sensitivity, specificity, and accuracy for Q2 were 23.6%, 92.5%, and 57.4%, respectively. With the clue in Q3, sensitivity rose to 50.9% and accuracy to 71.3%. ChatGPT also demonstrated higher diagnostic accuracy for larger hemorrhages on EDH and SDH images.
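
The sensitivity, specificity, and accuracy figures above follow from the standard confusion-matrix definitions, which the short Python sketch below illustrates. The abstract does not report case counts, so the counts here are hypothetical: they were picked only because they reproduce the reported percentages and should not be read as the study's data. The class name `ConfusionCounts` is likewise an illustrative choice. The sketch also assumes that specificity is unchanged between Q2 and Q3, since Q3 was asked only for hemorrhage scans that had been misread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary reading: hemorrhage present vs. absent."""
    tp: int  # hemorrhage scans read as hemorrhage
    fn: int  # hemorrhage scans read as normal (missed)
    tn: int  # normal scans read as normal
    fp: int  # normal scans read as hemorrhage

    def sensitivity(self) -> float:
        # Share of hemorrhage scans that were detected.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Share of normal scans that were correctly called normal.
        return self.tn / (self.tn + self.fp)

    def accuracy(self) -> float:
        # Share of all scans that were read correctly.
        total = self.tp + self.fn + self.tn + self.fp
        return (self.tp + self.tn) / total


# HYPOTHETICAL counts, not taken from the study: chosen only so that
# the computed rates match the percentages quoted in the abstract.
unguided = ConfusionCounts(tp=26, fn=84, tn=98, fp=8)  # Q2, no clue
guided = ConfusionCounts(tp=56, fn=54, tn=98, fp=8)    # Q3, with clue

for label, counts in (("Q2", unguided), ("Q3", guided)):
    print(
        f"{label}: sensitivity={counts.sensitivity():.1%} "
        f"specificity={counts.specificity():.1%} "
        f"accuracy={counts.accuracy():.1%}"
    )
```

Because only the true-positive and false-negative counts differ between the two objects, the sketch shows how a gain in sensitivity alone raises overall accuracy while specificity stays fixed.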

Conclusion: Although the model performs well in recognizing imaging modalities, its diagnostic accuracy substantially improves when guided by additional contextual information.

Clinical significance: These findings suggest that ChatGPT's diagnostic performance improves with guided prompts, highlighting its potential as a supportive tool in clinical radiology.

Source journal
Diagnostic and Interventional Radiology (Medicine: Radiology, Nuclear Medicine and Imaging)
Self-citation rate: 4.80%
Articles published: 0
Journal description: Diagnostic and Interventional Radiology (Diagn Interv Radiol) is the open access, online-only official publication of the Turkish Society of Radiology. It is published bimonthly, and its publication language is English. The journal is a medium for original articles, reviews, pictorial essays, and technical notes related to all fields of diagnostic and interventional radiology.