{"title":"Assessing the knowledge of ChatGPT and Google Gemini in answering peripheral artery disease-related questions.","authors":"Hakkı Kursat Cetin, Tolga Demir","doi":"10.1177/17085381251315999","DOIUrl":null,"url":null,"abstract":"<p><strong>Introduction: </strong>To assess and compare the knowledge of ChatGPT and Google Gemini in answering public-based and scientific questions about peripheral artery disease (PAD).</p><p><strong>Methods: </strong>Frequently asked questions (FAQs) about PAD were generated by evaluating posts on social media, and the latest edition of the European Society of Cardiology (ESC) guideline was evaluated and recommendations about PAD were translated into questions. All questions were prepared in English and were asked to ChatGPT 4 and Google Gemini (formerly Google Bard) applications. The specialists assigned a Global Quality Score (GQS) for each response.</p><p><strong>Results: </strong>Finally, 72 FAQs and 63 ESC guideline-based questions were identified. In total, 51 (70.8%) answers by ChatGPT for FAQs were categorized as GQS 5. Moreover, 44 (69.8%) ChatGPT answers to ESC guideline-based questions about PAD scored GQS 5. A total of 40 (55.6%) answers by Google Gemini for FAQs related with PAD obtained GQS 5. In addition, 50.8% (32 of 63) Google Gemini answers to ESC guideline-based questions were classified as GQS 5. Comparison of ChatGPT and Google Gemini with regards to GQS score revealed that both for FAQs about PAD, and ESC guideline-based scientific questions about PAD, ChatGPT gave more accurate and satisfactory answers (<i>p</i> = 0.031 and <i>p</i> = 0.026). In contrast, response time was significantly shorter for Google Gemini for both FAQs and scientific questions about PAD (<i>p</i> = 0.008 and <i>p</i> = 0.001).</p><p><strong>Conclusion: </strong>Our findings revealed that both ChatGPT and Google Gemini had limited capacity to answer FAQs and scientific questions related with PDA, but accuracy and satisfactory rate of answers for both FAQs and scientific questions about PAD were significantly higher in favor of ChatGPT.</p>","PeriodicalId":23549,"journal":{"name":"Vascular","volume":" ","pages":"17085381251315999"},"PeriodicalIF":1.0000,"publicationDate":"2025-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Vascular","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1177/17085381251315999","RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"PERIPHERAL VASCULAR DISEASE","Score":null,"Total":0}
Abstract
Introduction: The aim of this study was to assess and compare the knowledge of ChatGPT and Google Gemini in answering public-oriented and scientific questions about peripheral artery disease (PAD).
Methods: Frequently asked questions (FAQs) about PAD were generated by evaluating posts on social media, and recommendations from the latest edition of the European Society of Cardiology (ESC) guideline on PAD were translated into questions. All questions were prepared in English and posed to ChatGPT-4 and Google Gemini (formerly Google Bard). Specialists assigned a Global Quality Score (GQS; 1-5) to each response.
Results: In total, 72 FAQs and 63 ESC guideline-based questions were identified. Of the ChatGPT answers, 51 (70.8%) of those to FAQs and 44 (69.8%) of those to ESC guideline-based questions were categorized as GQS 5. For Google Gemini, 40 (55.6%) answers to FAQs and 32 of 63 (50.8%) answers to ESC guideline-based questions were classified as GQS 5. Comparison of the GQS scores showed that ChatGPT gave more accurate and satisfactory answers both to FAQs about PAD (p = 0.031) and to ESC guideline-based scientific questions about PAD (p = 0.026). In contrast, response time was significantly shorter for Google Gemini for both FAQs (p = 0.008) and scientific questions (p = 0.001).
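The abstract does not name the statistical test behind these p-values; the following is a minimal sketch, assuming a Mann-Whitney U test on per-question GQS ratings (a common choice for ordinal 1-5 scores) and hypothetical score lists, of how such a comparison could be computed:

# Minimal sketch (assumption: the abstract does not name the test; a
# Mann-Whitney U test is assumed as a common choice for comparing
# ordinal 1-5 GQS ratings between two chatbots).
from scipy.stats import mannwhitneyu

# Hypothetical per-question GQS scores; the real study scored 72 FAQ
# questions and 63 ESC guideline-based questions per chatbot.
chatgpt_faq_gqs = [5, 5, 4, 5, 3, 5, 4, 5]
gemini_faq_gqs = [4, 5, 3, 4, 5, 3, 4, 4]

# Two-sided test of whether the two GQS distributions differ.
stat, p_value = mannwhitneyu(chatgpt_faq_gqs, gemini_faq_gqs,
                             alternative="two-sided")
print(f"U = {stat:.1f}, p = {p_value:.3f}")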
Conclusion: Our findings revealed that both ChatGPT and Google Gemini had limited capacity to answer FAQs and scientific questions related to PAD, but the accuracy and satisfaction rates of the answers to both question sets were significantly higher for ChatGPT.
Journal description:
Vascular provides readers with new and unusual, up-to-date articles and case reports focusing on vascular and endovascular topics. It is a highly international forum for the discussion and debate of all aspects of this distinct surgical specialty. It also features opinion pieces, literature reviews, and controversial issues presented from various points of view.