Xuefeng Li, Ziman Chen, Ting Xie, Jieyi Liang, Fei Chen
{"title":"使用大型语言模型增强超声检查性能:ChatGPT-4和Claude 3的分析研究。","authors":"Xuefeng Li, Ziman Chen, Ting Xie, Jieyi Liang, Fei Chen","doi":"10.11152/mu-4505","DOIUrl":null,"url":null,"abstract":"<p><strong>Aim: </strong>To evaluate the effectiveness of two large language models, ChatGPT-4 and Claude 3, in improving the accuracy of question responses by senior sonologist and junior sonologist.</p><p><strong>Material and methods: </strong>A senior and a junior sonologist were given a practice exam. After answering the questions, they reviewed the responses and explanations provided by ChatGPT-4 and Claude 3. The accuracy and scores before and after incorporating the models' input were analyzed to compare their effectiveness.</p><p><strong>Results: </strong>No statistically significant differences were found between the two models' responses scores for each section (all p>0.05). For junior sonologist, both ChatGPT-4 (p=0.039) and Claude 3 (p=0.039) significantly improved scores in basic knowledge. The responses provided by ChatGPT-4 also significantly improved scores in relevant professional knowledge (p=0.038), though their explanations did not (p=0.077). For all exam sections, both models' responses and explanations significantly improved scores (all p<0.05). For senior sonologist, both ChatGPT-4's responses (p=0.022) and explanations (p=0.034) improved scores in basic knowledge, as did Claude 3's explanations (p=0.003). Across all sections, Claude 3's explanations significantly improved scores (p=0.041).</p><p><strong>Conclusion: </strong>ChatGPT-4 and Claude 3 significantly improved sonologist' examination performance, particularly in basic knowledge.</p>","PeriodicalId":94138,"journal":{"name":"Medical ultrasonography","volume":" ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2025-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Enhancing sonologist examination performance with large language models: an analytical study of ChatGPT-4 and Claude 3.\",\"authors\":\"Xuefeng Li, Ziman Chen, Ting Xie, Jieyi Liang, Fei Chen\",\"doi\":\"10.11152/mu-4505\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Aim: </strong>To evaluate the effectiveness of two large language models, ChatGPT-4 and Claude 3, in improving the accuracy of question responses by senior sonologist and junior sonologist.</p><p><strong>Material and methods: </strong>A senior and a junior sonologist were given a practice exam. After answering the questions, they reviewed the responses and explanations provided by ChatGPT-4 and Claude 3. The accuracy and scores before and after incorporating the models' input were analyzed to compare their effectiveness.</p><p><strong>Results: </strong>No statistically significant differences were found between the two models' responses scores for each section (all p>0.05). For junior sonologist, both ChatGPT-4 (p=0.039) and Claude 3 (p=0.039) significantly improved scores in basic knowledge. The responses provided by ChatGPT-4 also significantly improved scores in relevant professional knowledge (p=0.038), though their explanations did not (p=0.077). For all exam sections, both models' responses and explanations significantly improved scores (all p<0.05). For senior sonologist, both ChatGPT-4's responses (p=0.022) and explanations (p=0.034) improved scores in basic knowledge, as did Claude 3's explanations (p=0.003). 
Across all sections, Claude 3's explanations significantly improved scores (p=0.041).</p><p><strong>Conclusion: </strong>ChatGPT-4 and Claude 3 significantly improved sonologist' examination performance, particularly in basic knowledge.</p>\",\"PeriodicalId\":94138,\"journal\":{\"name\":\"Medical ultrasonography\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2025-04-29\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Medical ultrasonography\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.11152/mu-4505\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Medical ultrasonography","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.11152/mu-4505","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Enhancing sonologist examination performance with large language models: an analytical study of ChatGPT-4 and Claude 3.
Aim: To evaluate the effectiveness of two large language models, ChatGPT-4 and Claude 3, in improving the accuracy of question responses by a senior and a junior sonologist.
Material and methods: A senior and a junior sonologist took a practice examination. After answering the questions, they reviewed the responses and explanations provided by ChatGPT-4 and Claude 3. Response accuracy and scores before and after incorporating the models' input were analyzed to compare the models' effectiveness.
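The abstract does not name the statistical procedure behind the reported p-values. As a purely illustrative sketch, a before/after comparison of paired scores could be run with a Wilcoxon signed-rank test; the Python snippet below uses invented placeholder values, and none of the variable names or data come from the study.

```python
# Hypothetical sketch of a paired before/after score comparison.
# The study's actual test and data are not given in the abstract;
# the scores below are invented placeholders.
from scipy.stats import wilcoxon

# Per-section scores for one examinee, before and after
# reviewing an LLM's responses and explanations (hypothetical).
scores_before = [62, 70, 58, 66, 74]
scores_after = [71, 78, 65, 70, 80]

# Wilcoxon signed-rank test on the paired differences.
stat, p_value = wilcoxon(scores_before, scores_after)
print(f"Wilcoxon signed-rank: statistic={stat}, p={p_value:.3f}")
if p_value < 0.05:
    print("Score change is statistically significant at alpha = 0.05.")
```

A paired test is the natural fit for this design because the same examinee answers the same sections before and after seeing the model output, so each pair of scores shares a common baseline.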
Results: No statistically significant differences were found between the two models' response scores for any section (all p>0.05). For the junior sonologist, both ChatGPT-4 (p=0.039) and Claude 3 (p=0.039) significantly improved scores in basic knowledge. ChatGPT-4's responses also significantly improved scores in relevant professional knowledge (p=0.038), though its explanations did not (p=0.077). Across all exam sections, both models' responses and explanations significantly improved the junior sonologist's scores (all p<0.05). For the senior sonologist, both ChatGPT-4's responses (p=0.022) and explanations (p=0.034) improved scores in basic knowledge, as did Claude 3's explanations (p=0.003). Across all sections, Claude 3's explanations also significantly improved the senior sonologist's scores (p=0.041).
Conclusion: ChatGPT-4 and Claude 3 significantly improved the sonologists' examination performance, particularly in basic knowledge.