Daniel Bahir, Omri Zur, Leah Attal, Zaki Nujeidat, Ariela Knaanie, Joseph Pikkel, Michael Mimouni, Gilad Plopsky

Graefe's Archive for Clinical and Experimental Ophthalmology, published 2024-09-15. DOI: 10.1007/s00417-024-06625-4
Gemini AI vs. ChatGPT: A comprehensive examination alongside ophthalmology residents in medical knowledge
Introduction
The rapid advancement of artificial intelligence (AI), particularly in large language models like ChatGPT and Google's Gemini AI, marks a transformative era in technological innovation. This study explores the potential of AI in ophthalmology, focusing on the capabilities of ChatGPT and Gemini AI. While these models hold promise for medical education and clinical support, their integration requires comprehensive evaluation. This research aims to bridge a gap in the literature by comparing Gemini AI and ChatGPT, assessing their performance against ophthalmology residents using a dataset derived from ophthalmology board exams.
Methods
A dataset of 600 questions across 12 subspecialties, in both text-based and image-based formats, was curated from Israeli ophthalmology residency exams. Four AI models – ChatGPT-3.5, ChatGPT-4, Gemini, and Gemini Advanced – were tested on this dataset. The study includes a comparative analysis with Israeli ophthalmology residents, employing specific metrics for performance assessment.
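The scoring protocol implied by the methods can be sketched as follows. This is a minimal, hypothetical illustration: the function name, the toy answer key, and the exact-match scoring rule are assumptions, since the paper reports only overall accuracy on multiple-choice questions.

```python
def score_model(answers, key):
    """Fraction of questions answered correctly (exact-match scoring)."""
    correct = sum(1 for a, k in zip(answers, key) if a == k)
    return correct / len(key)

# Toy data: a 5-question answer key and one model's responses.
key = ["A", "C", "B", "D", "A"]
responses = ["A", "C", "D", "D", "A"]
print(score_model(responses, key))  # 4 of 5 correct -> prints 0.8
```

The same routine would be run once per model over all 600 questions to produce the accuracy rates reported in the results.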
Results
Gemini Advanced achieved the highest accuracy at 66%, followed by ChatGPT-4 at 62% and Gemini at 58%; ChatGPT-3.5, at 46%, served as the reference. Comparative analysis with residents offered insight into the models' performance relative to human-level medical knowledge. Further analysis examined yearly performance trends, topic-specific variation, and the impact of images on chatbot accuracy.
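As a back-of-the-envelope check, the reported accuracy rates can be converted into approximate correct-answer counts out of the 600-question dataset. This assumes each model answered every question, which the abstract does not state explicitly.

```python
# Approximate correct-answer counts implied by the reported accuracies,
# assuming all 600 questions were answered by each model.
total = 600
reported = {
    "Gemini Advanced": 0.66,
    "ChatGPT-4": 0.62,
    "Gemini": 0.58,
    "ChatGPT-3.5": 0.46,
}
for model, acc in reported.items():
    print(f"{model}: ~{round(total * acc)} of {total} correct")
```

For example, Gemini Advanced's 66% corresponds to roughly 396 correct answers, versus roughly 276 for ChatGPT-3.5.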
Conclusion
The study reveals nuanced AI model capabilities in ophthalmology, emphasizing domain-specific variation. Gemini Advanced's superior performance indicates significant advancement, while ChatGPT-4's improvement is noteworthy. Both Gemini and ChatGPT-3.5 demonstrated commendable performance. The comparative analysis underscores AI's evolving role as a supplementary tool in medical education. This research contributes vital insights into AI effectiveness in ophthalmology, highlighting areas for refinement. As AI models evolve, targeted improvements can enhance adaptability across subspecialties, making them valuable tools for medical professionals and enriching patient care.
Key Messages
What is known
AI breakthroughs, like ChatGPT and Google's Gemini AI, are reshaping healthcare. In ophthalmology, AI integration has overhauled clinical workflows, particularly in analyzing images for diseases like diabetic retinopathy and glaucoma.
What is new
This study presents a pioneering comparison between Gemini AI and ChatGPT, evaluating their performance against ophthalmology residents using a meticulously curated dataset derived from real-world ophthalmology board exams.
Notably, Gemini Advanced demonstrates superior performance, showcasing substantial advancement, while ChatGPT-4's improvement also merits attention. Both models exhibit commendable capabilities.
These findings offer crucial insights into the efficacy of AI in ophthalmology, shedding light on areas ripe for further enhancement and optimization.