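A comparison of performance of DeepSeek-R1 model-generated responses to musculoskeletal radiology queries against ChatGPT-4 and ChatGPT-4o – A feasibility study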

IF 1.5 | Zone 4, Medicine | JCR Q3: RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING

Clinical Imaging Pub Date : 2025-05-12 DOI:10.1016/j.clinimag.2025.110506

Hasaam Uldin, Sonal Saran, Girish Gandikota, Karthikeyan P. Iyengar, Raju Vaishya, Yogesh Parmar, Fahid Rasul, Rajesh Botchu

Clinical Imaging, volume 123, Article 110506. Full text: https://www.sciencedirect.com/science/article/pii/S0899707125001068

Citations: 0


A comparison of performance of DeepSeek-R1 model-generated responses to musculoskeletal radiology queries against ChatGPT-4 and ChatGPT-4o – A feasibility study

Objective

Artificial Intelligence (AI) has transformed society, and chatbots using Large Language Models (LLMs) are playing an increasing role in scientific research. This study aims to assess and compare the efficacy of the newer DeepSeek-R1 model and the ChatGPT-4 and ChatGPT-4o models in answering scientific questions about recent research.

Material and methods

We compared the output generated by ChatGPT-4, ChatGPT-4o, and DeepSeek-R1 in response to ten standardized questions in the setting of musculoskeletal (MSK) radiology. The responses were independently analyzed by one MSK radiologist and one final-year MSK radiology trainee and graded on a Likert scale from 1 (inaccurate) to 5 (accurate).
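As an illustration of how two-rater Likert grading of this kind can be tabulated, the sketch below computes each rater's mean grade and the share of questions on which both raters gave the same grade. It is only a minimal sketch: the function name `summarize` and the grades passed to it are hypothetical placeholders, not data or methods from the study, and the abstract does not say which summary statistics the authors used.

```python
from statistics import mean


def summarize(rater_a, rater_b):
    """Summarize two raters' 1-5 Likert grades for one model.

    Returns each rater's mean grade and the fraction of questions
    on which the two raters gave exactly the same grade.
    """
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must grade the same questions")
    for grade in (*rater_a, *rater_b):
        if grade not in range(1, 6):
            raise ValueError(f"grade {grade} is outside the 1-5 Likert scale")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return {
        "mean_a": mean(rater_a),
        "mean_b": mean(rater_b),
        "exact_agreement": matches / len(rater_a),
    }


# Hypothetical placeholder grades for ten questions (NOT results from the study).
example = summarize(
    [5, 4, 3, 5, 2, 4, 5, 3, 4, 5],
    [5, 4, 2, 5, 2, 4, 4, 3, 4, 5],
)
print(example)
```

Exact agreement is the simplest possible check. With only two raters and ten questions, a chance-corrected statistic such as a weighted kappa would usually be reported as well.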

Results

Five DeepSeek-R1 answers were significantly inaccurate, and the model provided references only on prompting, and these were fictitious. All ChatGPT-4 and ChatGPT-4o answers were well written with good content, the latter including useful and comprehensive references.

Conclusion

ChatGPT-4o generated structured answers to questions on recent MSK radiology research, with useful references in all our cases, enabling reliable usage. DeepSeek-R1, on the other hand, generates articles that may appear authentic to the unsuspecting eye but that, in the current version, contain a greater amount of falsified and inaccurate information. Further iterations may improve its accuracy.


Source journal

Clinical Imaging (Medicine: Nuclear Medicine)

CiteScore: 4.60

Self-citation rate: 0.00%

Articles published: 265

Review time: 35 days

About the journal: The mission of Clinical Imaging is to publish, in a timely manner, the very best radiology research from the United States and around the world, with special attention to the impact of medical imaging on patient care. The journal's publications cover all imaging modalities, radiology issues related to patients, policy and practice improvements, and clinically oriented imaging physics and informatics. The journal is a valuable resource for practicing radiologists, radiologists-in-training, and other clinicians with an interest in imaging. Papers are carefully peer-reviewed and selected by our experienced subject editors, who are leading experts spanning the range of imaging sub-specialties: Body Imaging; Breast Imaging; Cardiothoracic Imaging; Imaging Physics and Informatics; Molecular Imaging and Nuclear Medicine; Musculoskeletal and Emergency Imaging; Neuroradiology; Practice, Policy & Education; Pediatric Imaging; Vascular and Interventional Radiology.