Comparison of ChatGPT-4, Microsoft Copilot, and Google Gemini for Pediatric Ophthalmology Questions
Tevfik Serhat Bahar, Olgar Öcal, Asli Çetinkaya Yaprak
Journal of Pediatric Ophthalmology & Strabismus, published 2025-05-27, pp. 1-7
DOI: 10.3928/01913913-20250404-03
Abstract
Purpose: To evaluate the accuracy of the Chat Generative Pre-trained Transformer (ChatGPT; OpenAI), Google Gemini (Alphabet Inc.), and Microsoft Copilot (Microsoft Corporation) artificial intelligence (AI) programs, each offered free of charge by its manufacturer, in answering pediatric ophthalmology questions, and to determine whether any is superior to the others.
Methods: ChatGPT, Gemini, and Copilot were each asked 100 multiple-choice questions from the Ophtho-Questions online question bank, which is widely used for preparing for the high-stakes Ophthalmic Knowledge Evaluation Program examination. Their answers were compared to the official answer keys and categorized as correct or incorrect. The readability of the responses was assessed using the Flesch-Kincaid Grade Level, Flesch Reading Ease Score, and the Coleman-Liau Index.
Results: ChatGPT, Gemini, and Copilot answered 61 (61%), 60 (60%), and 74 (74%) questions correctly, respectively. Copilot had a significantly higher rate of correct answers than ChatGPT and Gemini (P = .049 and P = .035, respectively). Across the three readability analyses, Copilot had the best average score, followed by ChatGPT and Gemini; all three chatbots produced responses more difficult to read than the recommended level.
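The abstract does not name the statistical test used to compare correct-answer rates, but the reported p-values (.049 and .035) are consistent with a two-sided two-proportion z-test on 74/100 vs. 61/100 and 74/100 vs. 60/100. A self-contained sketch under that assumption:

```python
from math import sqrt, erfc

def two_prop_z(x1: int, n1: int, x2: int, n2: int) -> tuple[float, float]:
    """Two-sided two-proportion z-test with a pooled standard error.
    Returns (z statistic, two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = erfc(abs(z) / sqrt(2))
    return z, p

# Copilot (74/100) vs. ChatGPT (61/100) and vs. Gemini (60/100):
z1, p1 = two_prop_z(74, 100, 61, 100)  # p close to the reported .049
z2, p2 = two_prop_z(74, 100, 60, 100)  # p close to the reported .035
```

Running this reproduces p-values near the abstract's figures, which supports (but does not confirm) that a test of this family was used.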
Conclusions: Although AI chatbots can serve as useful tools for acquiring information on pediatric ophthalmology, their responses should be interpreted with caution due to potential inaccuracies. [J Pediatr Ophthalmol Strabismus. 20XX;X(X):XXX-XXX.].
Journal Description:
The Journal of Pediatric Ophthalmology & Strabismus is a bimonthly peer-reviewed publication for pediatric ophthalmologists. The Journal has published original articles on the diagnosis, treatment, and prevention of eye disorders in the pediatric age group and the treatment of strabismus in all age groups for over 50 years.