A comparison of performance of DeepSeek-R1 model-generated responses to musculoskeletal radiology queries against ChatGPT-4 and ChatGPT-4o – A feasibility study

IF 1.5 · CAS Region 4 (Medicine) · JCR Q3 · RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING
Hasaam Uldin, Sonal Saran, Girish Gandikota, Karthikeyan P. Iyengar, Raju Vaishya, Yogesh Parmar, Fahid Rasul, Rajesh Botchu
Journal: Clinical Imaging, Volume 123, Article 110506
DOI: 10.1016/j.clinimag.2025.110506
Published: 2025-05-12 (Journal Article)
Citations: 0
URL: https://www.sciencedirect.com/science/article/pii/S0899707125001068

Abstract

Objective

Artificial Intelligence (AI) has transformed society, and chatbots built on Large Language Models (LLMs) are playing an increasing role in scientific research. This study aims to assess and compare the efficacy of the newer DeepSeek-R1 model against ChatGPT-4 and ChatGPT-4o in answering scientific questions about recent research.

Material and methods

We compared the output generated by ChatGPT-4, ChatGPT-4o, and DeepSeek-R1 in response to ten standardized questions in the setting of musculoskeletal (MSK) radiology. The responses were independently analyzed by one MSK radiologist and one final-year MSK radiology trainee and graded on a 5-point Likert scale (1 = inaccurate, 5 = accurate).
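The grading scheme described above can be sketched in a few lines of code. Note that the per-question Likert scores below are invented for illustration only; the paper reports summary findings, not raw per-question data.

```python
# Hypothetical sketch of the study's grading scheme. All scores are
# made up for illustration; they are NOT the study's actual data.
from statistics import mean

# Two independent raters, ten standardized questions,
# Likert scores from 1 (inaccurate) to 5 (accurate).
scores = {
    "ChatGPT-4":   {"rater1": [5, 4, 5, 4, 5, 5, 4, 5, 4, 5],
                    "rater2": [4, 5, 5, 4, 5, 4, 5, 5, 4, 4]},
    "ChatGPT-4o":  {"rater1": [5, 5, 4, 5, 5, 4, 5, 5, 5, 4],
                    "rater2": [5, 4, 5, 5, 4, 5, 5, 4, 5, 5]},
    "DeepSeek-R1": {"rater1": [2, 1, 4, 2, 5, 1, 2, 4, 1, 3],
                    "rater2": [1, 2, 3, 2, 4, 2, 1, 4, 2, 3]},
}

def model_mean(model: str) -> float:
    """Average the two raters' Likert scores across all ten questions."""
    r = scores[model]
    per_question = [(a + b) / 2 for a, b in zip(r["rater1"], r["rater2"])]
    return mean(per_question)

for name in scores:
    print(f"{name}: mean Likert score = {model_mean(name):.2f}")
```

A fuller analysis would also report inter-rater agreement (e.g. a weighted Cohen's kappa) alongside the per-model means.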

Results

Five of the DeepSeek-R1 answers were significantly inaccurate, and the model provided fictitious references only when prompted. All ChatGPT-4 and ChatGPT-4o answers were well written with good content; the latter also included useful and comprehensive references.

Conclusion

ChatGPT-4o generated structured answers to questions on recent MSK radiology research, with useful references in all of our cases, enabling reliable use. DeepSeek-R1, by contrast, generates articles that may appear authentic to the unsuspecting eye but, in its current version, contain a higher proportion of falsified and inaccurate information. Further iterations may improve accuracy.
Source journal: Clinical Imaging (Medicine – Nuclear Medicine)
CiteScore: 4.60
Self-citation rate: 0.00%
Annual articles: 265
Review time: 35 days
Journal description: The mission of Clinical Imaging is to publish, in a timely manner, the very best radiology research from the United States and around the world, with special attention to the impact of medical imaging on patient care. The journal's publications cover all imaging modalities, radiology issues related to patients, policy and practice improvements, and clinically oriented imaging physics and informatics. The journal is a valuable resource for practicing radiologists, radiologists-in-training, and other clinicians with an interest in imaging. Papers are carefully peer-reviewed and selected by experienced subject editors, who are leading experts spanning the range of imaging sub-specialties, which include:
- Body Imaging
- Breast Imaging
- Cardiothoracic Imaging
- Imaging Physics and Informatics
- Molecular Imaging and Nuclear Medicine
- Musculoskeletal and Emergency Imaging
- Neuroradiology
- Practice, Policy & Education
- Pediatric Imaging
- Vascular and Interventional Radiology