Gemini AI vs. ChatGPT: A comprehensive examination alongside ophthalmology residents in medical knowledge

Daniel Bahir, Omri Zur, Leah Attal, Zaki Nujeidat, Ariela Knaanie, Joseph Pikkel, Michael Mimouni, Gilad Plopsky
DOI: 10.1007/s00417-024-06625-4 · Graefe's Archive for Clinical and Experimental Ophthalmology · Published 2024-09-15 · Journal Article
Citations: 0

Abstract



Introduction

The rapid advancement of artificial intelligence (AI), particularly in large language models like ChatGPT and Google's Gemini AI, marks a transformative era in technological innovation. This study explores the potential of AI in ophthalmology, focusing on the capabilities of ChatGPT and Gemini AI. While these models hold promise for medical education and clinical support, their integration requires comprehensive evaluation. This research aims to bridge a gap in the literature by comparing Gemini AI and ChatGPT, assessing their performance against ophthalmology residents using a dataset derived from ophthalmology board exams.

Methods

A dataset comprising 600 questions across 12 subspecialties was curated from Israeli ophthalmology residency exams, encompassing text and image-based formats. Four AI models – ChatGPT-3.5, ChatGPT-4, Gemini, and Gemini Advanced – underwent testing on this dataset. The study includes a comparative analysis with Israeli ophthalmology residents, employing specific metrics for performance assessment.
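The paper does not publish its evaluation code, but the testing procedure it describes — posing each multiple-choice question to a model and scoring the response against the answer key — can be sketched roughly as follows. All names here (`ExamQuestion`, `ask_model`, `run_benchmark`) are hypothetical illustrations, not the authors' implementation:

```python
import random
from dataclasses import dataclass

@dataclass
class ExamQuestion:
    subspecialty: str   # one of the 12 subspecialties, e.g. "retina"
    stem: str           # question text; image-based items would also carry an image
    options: list[str]  # multiple-choice answers
    answer_index: int   # index of the correct option

def ask_model(model_name: str, question: ExamQuestion) -> int:
    """Stand-in for a real chatbot API call; here it simply guesses at random."""
    return random.randrange(len(question.options))

def run_benchmark(model_name: str, questions: list[ExamQuestion]) -> float:
    """Score one model over the whole question set and return its accuracy."""
    correct = sum(ask_model(model_name, q) == q.answer_index for q in questions)
    return correct / len(questions)
```

In a real harness, `ask_model` would call each provider's API and parse the chosen option out of the free-text reply; the loop itself would be unchanged.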

Results

Gemini Advanced demonstrated the strongest performance, with a 66% accuracy rate. ChatGPT-4 followed at 62% and Gemini at 58%, while ChatGPT-3.5 served as the reference at 46%. Comparative analysis with residents offered insight into the models' performance relative to human-level medical knowledge. Further analysis examined yearly performance trends, topic-specific variations, and the impact of images on chatbot accuracy.
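To illustrate the kind of topic-specific breakdown such an analysis involves (a generic sketch, not the authors' actual metric code), per-subspecialty accuracy can be tallied from scored responses like this:

```python
from collections import defaultdict

def accuracy_by_subspecialty(results: list[tuple[str, bool]]) -> dict[str, float]:
    """results: (subspecialty, answered_correctly) pairs -> accuracy per subspecialty."""
    tally = defaultdict(lambda: [0, 0])  # subspecialty -> [correct, total]
    for topic, is_correct in results:
        tally[topic][0] += int(is_correct)
        tally[topic][1] += 1
    return {topic: correct / total for topic, (correct, total) in tally.items()}

# e.g. accuracy_by_subspecialty([("retina", True), ("retina", False), ("glaucoma", True)])
# → {"retina": 0.5, "glaucoma": 1.0}
```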

Conclusion

The study reveals nuanced AI model capabilities in ophthalmology, emphasizing domain-specific variations. Gemini Advanced's superior performance indicates significant advancement, while ChatGPT-4's improvement is noteworthy. Both Gemini and ChatGPT-3.5 demonstrated commendable performance. The comparative analysis underscores AI's evolving role as a supplementary tool in medical education. This research contributes vital insights into AI effectiveness in ophthalmology, highlighting areas for refinement. As AI models evolve, targeted improvements can enhance adaptability across subspecialties, making them valuable tools for medical professionals and enriching patient care.

Key Messages

What is known

  • AI breakthroughs, like ChatGPT and Google's Gemini AI, are reshaping healthcare. In ophthalmology, AI integration has overhauled clinical workflows, particularly in analyzing images for diseases like diabetic retinopathy and glaucoma.

What is new

  • This study presents a pioneering comparison between Gemini AI and ChatGPT, evaluating their performance against ophthalmology residents using a meticulously curated dataset derived from real-world ophthalmology board exams.

  • Notably, Gemini Advanced demonstrates superior performance, showcasing substantial advancements, while the evolution of ChatGPT-4 also merits attention. Both models exhibit commendable capabilities.

  • These findings offer crucial insights into the efficacy of AI in ophthalmology, shedding light on areas ripe for further enhancement and optimization.
