Evaluating ChatGPT-4's Performance in Identifying Radiological Anatomy in FRCR Part 1 Examination Questions.

IF 1.0 · Q4 · Radiology, Nuclear Medicine & Medical Imaging
Indian Journal of Radiology and Imaging · Pub Date: 2024-11-04 · eCollection Date: 2025-04-01 · DOI: 10.1055/s-0044-1792040
Pradosh Kumar Sarangi, Suvrankar Datta, Braja Behari Panda, Swaha Panda, Himel Mondal
{"title":"评估ChatGPT-4在FRCR第一部分试题中识别放射解剖学的性能。","authors":"Pradosh Kumar Sarangi, Suvrankar Datta, Braja Behari Panda, Swaha Panda, Himel Mondal","doi":"10.1055/s-0044-1792040","DOIUrl":null,"url":null,"abstract":"<p><p><b>Background</b>  Radiology is critical for diagnosis and patient care, relying heavily on accurate image interpretation. Recent advancements in artificial intelligence (AI) and natural language processing (NLP) have raised interest in the potential of AI models to support radiologists, although robust research on AI performance in this field is still emerging. <b>Objective</b>  This study aimed to assess the efficacy of ChatGPT-4 in answering radiological anatomy questions similar to those in the Fellowship of the Royal College of Radiologists (FRCR) Part 1 Anatomy examination. <b>Materials and Methods</b>  We used 100 mock radiological anatomy questions from a free Web site patterned after the FRCR Part 1 Anatomy examination. ChatGPT-4 was tested under two conditions: with and without context regarding the examination instructions and question format. The main query posed was: \"Identify the structure indicated by the arrow(s).\" Responses were evaluated against correct answers, and two expert radiologists (>5 and 30 years of experience in radiology diagnostics and academics) rated the explanation of the answers. We calculated four scores: correctness, sidedness, modality identification, and approximation. The latter considers partial correctness if the identified structure is present but not the focus of the question. <b>Results</b>  Both testing conditions saw ChatGPT-4 underperform, with correctness scores of 4 and 7.5% for no context and with context, respectively. However, it identified the imaging modality with 100% accuracy. The model scored over 50% on the approximation metric, where it identified present structures not indicated by the arrow. However, it struggled with identifying the correct side of the structure, scoring approximately 42 and 40% in the no context and with context settings, respectively. Only 32% of the responses were similar across the two settings. <b>Conclusion</b>  Despite its ability to correctly recognize the imaging modality, ChatGPT-4 has significant limitations in interpreting normal radiological anatomy. This indicates the necessity for enhanced training in normal anatomy to better interpret abnormal radiological images. Identifying the correct side of structures in radiological images also remains a challenge for ChatGPT-4.</p>","PeriodicalId":51597,"journal":{"name":"Indian Journal of Radiology and Imaging","volume":"35 2","pages":"287-294"},"PeriodicalIF":1.0000,"publicationDate":"2024-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12034419/pdf/","citationCount":"0","resultStr":"{\"title\":\"Evaluating ChatGPT-4's Performance in Identifying Radiological Anatomy in FRCR Part 1 Examination Questions.\",\"authors\":\"Pradosh Kumar Sarangi, Suvrankar Datta, Braja Behari Panda, Swaha Panda, Himel Mondal\",\"doi\":\"10.1055/s-0044-1792040\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p><b>Background</b>  Radiology is critical for diagnosis and patient care, relying heavily on accurate image interpretation. 
Recent advancements in artificial intelligence (AI) and natural language processing (NLP) have raised interest in the potential of AI models to support radiologists, although robust research on AI performance in this field is still emerging. <b>Objective</b>  This study aimed to assess the efficacy of ChatGPT-4 in answering radiological anatomy questions similar to those in the Fellowship of the Royal College of Radiologists (FRCR) Part 1 Anatomy examination. <b>Materials and Methods</b>  We used 100 mock radiological anatomy questions from a free Web site patterned after the FRCR Part 1 Anatomy examination. ChatGPT-4 was tested under two conditions: with and without context regarding the examination instructions and question format. The main query posed was: \\\"Identify the structure indicated by the arrow(s).\\\" Responses were evaluated against correct answers, and two expert radiologists (>5 and 30 years of experience in radiology diagnostics and academics) rated the explanation of the answers. We calculated four scores: correctness, sidedness, modality identification, and approximation. The latter considers partial correctness if the identified structure is present but not the focus of the question. <b>Results</b>  Both testing conditions saw ChatGPT-4 underperform, with correctness scores of 4 and 7.5% for no context and with context, respectively. However, it identified the imaging modality with 100% accuracy. The model scored over 50% on the approximation metric, where it identified present structures not indicated by the arrow. However, it struggled with identifying the correct side of the structure, scoring approximately 42 and 40% in the no context and with context settings, respectively. Only 32% of the responses were similar across the two settings. <b>Conclusion</b>  Despite its ability to correctly recognize the imaging modality, ChatGPT-4 has significant limitations in interpreting normal radiological anatomy. This indicates the necessity for enhanced training in normal anatomy to better interpret abnormal radiological images. Identifying the correct side of structures in radiological images also remains a challenge for ChatGPT-4.</p>\",\"PeriodicalId\":51597,\"journal\":{\"name\":\"Indian Journal of Radiology and Imaging\",\"volume\":\"35 2\",\"pages\":\"287-294\"},\"PeriodicalIF\":1.0000,\"publicationDate\":\"2024-11-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12034419/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Indian Journal of Radiology and Imaging\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1055/s-0044-1792040\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2025/4/1 0:00:00\",\"PubModel\":\"eCollection\",\"JCR\":\"Q4\",\"JCRName\":\"RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Indian Journal of Radiology and Imaging","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1055/s-0044-1792040","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/4/1 0:00:00","PubModel":"eCollection","JCR":"Q4","JCRName":"RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING","Score":null,"Total":0}
Citations: 0

Abstract

Background: Radiology is critical for diagnosis and patient care and relies heavily on accurate image interpretation. Recent advances in artificial intelligence (AI) and natural language processing (NLP) have raised interest in the potential of AI models to support radiologists, although robust research on AI performance in this field is still emerging.

Objective: This study aimed to assess the efficacy of ChatGPT-4 in answering radiological anatomy questions similar to those in the Fellowship of the Royal College of Radiologists (FRCR) Part 1 Anatomy examination.

Materials and Methods: We used 100 mock radiological anatomy questions from a free website patterned after the FRCR Part 1 Anatomy examination. ChatGPT-4 was tested under two conditions: with and without context about the examination instructions and question format. The main query posed was: "Identify the structure indicated by the arrow(s)." Responses were evaluated against the correct answers, and two expert radiologists (>5 and 30 years of experience in radiology diagnostics and academics) rated the explanations of the answers. We calculated four scores: correctness, sidedness, modality identification, and approximation. The approximation score gives partial credit when the identified structure is present in the image but is not the one the question asks about.

Results: ChatGPT-4 underperformed under both testing conditions, with correctness scores of 4% without context and 7.5% with context. However, it identified the imaging modality with 100% accuracy. The model scored over 50% on the approximation metric, where it named structures present in the image but not indicated by the arrow. It struggled to identify the correct side of a structure, scoring approximately 42% without context and 40% with context. Only 32% of the responses were similar across the two settings.

Conclusion: Despite its ability to correctly recognize the imaging modality, ChatGPT-4 has significant limitations in interpreting normal radiological anatomy. This indicates the need for enhanced training on normal anatomy so that abnormal radiological images can be better interpreted. Identifying the correct side of structures in radiological images also remains a challenge for ChatGPT-4.
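The Methods describe a reproducible protocol: present each arrow-annotated mock image with a fixed query to a GPT-4-class vision model, optionally preceded by exam context, then score the reply on four metrics. The study used the ChatGPT-4 interface and expert raters; purely as an illustration, the sketch below reproduces an analogous pipeline with the OpenAI Python SDK. The model name, the Question fields, and the keyword-matching scorer are assumptions for illustration, not details from the paper.

```python
# Minimal, hypothetical sketch of the evaluation protocol described above.
# Assumptions (not from the paper): a vision-capable API model stands in for
# the ChatGPT-4 interface the authors used, and crude keyword matching stands
# in for the study's expert-radiologist ratings.
import base64
from dataclasses import dataclass

from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = "Identify the structure indicated by the arrow(s)."

@dataclass
class Question:
    image_path: str    # mock FRCR-style image with an arrow annotation
    answer: str        # correct structure, e.g. "left main bronchus"
    side: str | None   # "left" / "right", or None if not applicable
    modality: str      # e.g. "CT", "radiograph", "MRI"

def ask_model(q: Question, context: str | None = None) -> str:
    """Send one image plus the standard query; optionally prepend exam context."""
    with open(q.image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    content = [
        {"type": "text", "text": (context + "\n" if context else "") + PROMPT},
        {"type": "image_url",
         "image_url": {"url": f"data:image/png;base64,{b64}"}},
    ]
    resp = client.chat.completions.create(
        model="gpt-4o",  # stand-in for the ChatGPT-4 model used in the study
        messages=[{"role": "user", "content": content}],
    )
    return resp.choices[0].message.content

def score(q: Question, reply: str) -> dict[str, bool]:
    """Toy per-question scoring on three of the four metrics."""
    text = reply.lower()
    return {
        "correctness": q.answer.lower() in text,
        "sidedness": q.side is not None and q.side in text,
        "modality": q.modality.lower() in text,
        # "approximation" credits naming a structure that is present in the
        # image but not the one the arrow indicates; that judgment requires
        # an expert rater, so it is left as a manual step here.
    }
```

Averaging the per-question booleans over the 100 questions, once per condition (with and without context), would yield percentage scores of the kind reported in the Results; in the study itself these judgments were made by the two expert radiologists rather than by string matching.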

Source Journal
Indian Journal of Radiology and Imaging (Radiology, Nuclear Medicine & Medical Imaging)
CiteScore: 1.20
Self-citation rate: 0.00%
Annual articles: 115
Review time: 45 weeks