Performance of 4 Artificial Intelligence Chatbots in Answering Endodontic Questions
Authors: Saleem Abdulrab, Hisham Abada, Mohammed Mashyakhy, Nawras Mostafa, Hatem Alhadainy, Esam Halboub
Journal: Journal of Endodontics (impact factor 3.5; JCR Q1, Dentistry, Oral Surgery & Medicine)
Publication date: 2025-01-13
DOI: 10.1016/j.joen.2025.01.002 (https://doi.org/10.1016/j.joen.2025.01.002)
Citation count: 0
Abstract
Introduction: Artificial intelligence models have shown potential as educational tools in healthcare, for example in answering exam questions. This study aimed to assess the performance of 4 prominent chatbots (ChatGPT-4o, MedGebra GPT-4o, Meta Llama 3, and Gemini Advanced) in answering multiple-choice questions (MCQs) in endodontics.
Methods: The study used 100 MCQs, each with 4 answer options, obtained from 2 well-known endodontic textbooks. Each chatbot's performance in choosing the correct answers was assessed twice, 1 week apart.
Results: The stability of performance across the 2 rounds was highest for ChatGPT-4o, followed by Gemini Advanced and Meta Llama 3. MedGebra GPT-4o provided the highest percentage of correct answers in the first round (93%), followed by ChatGPT-4o in the second round (90%). Meta Llama 3 provided the lowest percentages in both the first (73%) and second (75%) rounds. Although MedGebra GPT-4o performed best in the first round, it was less stable in the second round (McNemar P > .05; Kappa = 0.725, P < .001).
Conclusions: ChatGPT-4o and MedGebra GPT-4o correctly answered a high fraction of endodontic MCQs, while Meta Llama 3 and Gemini Advanced showed lower performance. Further training and development are required to improve their accuracy and reliability in endodontics.
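The two stability statistics named in the Results (the McNemar test and Cohen's kappa) are both computed from paired correct/incorrect outcomes of the same questions in the two rounds. The following is a minimal sketch of that calculation, not the authors' analysis: the abstract does not state which variant of the McNemar test was used, so the exact binomial form is an assumption here, and the ten-question answer vectors are invented toy data rather than study results. All function names are hypothetical.

```python
from math import comb


def paired_table(round1, round2):
    """Count paired outcomes for the same questions in two rounds.

    Returns (a, b, c, d):
      a = correct in both rounds
      b = correct in round 1 only
      c = correct in round 2 only
      d = incorrect in both rounds
    """
    pairs = list(zip(round1, round2))
    a = sum(1 for x, y in pairs if x and y)
    b = sum(1 for x, y in pairs if x and not y)
    c = sum(1 for x, y in pairs if not x and y)
    d = sum(1 for x, y in pairs if not x and not y)
    return a, b, c, d


def cohen_kappa(a, b, c, d):
    """Cohen's kappa: agreement between rounds, corrected for chance."""
    n = a + b + c + d
    p_observed = (a + d) / n
    # Chance agreement from the marginal correct/incorrect rates of each round.
    p_expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    if p_expected == 1:
        return 1.0  # both rounds are constant and identical
    return (p_observed - p_expected) / (1 - p_expected)


def mcnemar_exact_p(b, c):
    """Two-sided exact McNemar p-value (assumed variant).

    Only the discordant pairs (b, c) matter; under the null hypothesis
    each discordant pair is equally likely to fall in either cell.
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * tail)


# Toy data (NOT from the study): 1 = correct, 0 = incorrect, ten questions.
toy_round1 = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
toy_round2 = [1, 1, 1, 1, 1, 1, 1, 0, 1, 0]

a, b, c, d = paired_table(toy_round1, toy_round2)
print(f"kappa = {cohen_kappa(a, b, c, d):.3f}")      # kappa = 0.375
print(f"McNemar exact P = {mcnemar_exact_p(b, c):.3f}")  # McNemar exact P = 1.000
```

The two statistics answer different questions. A non-significant McNemar result means only that the overall proportion of correct answers did not shift between rounds, whereas kappa measures whether the same individual questions were answered the same way. In the toy data the accuracy is identical in both rounds (8 of 10), yet kappa is low because two questions changed outcome.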
About the journal:
The Journal of Endodontics, the official journal of the American Association of Endodontists, publishes scientific articles, case reports and comparison studies evaluating materials and methods of pulp conservation and endodontic treatment. Endodontists and general dentists can learn about new concepts in root canal treatment and the latest advances in techniques and instrumentation in the one journal that helps them keep pace with rapid changes in this field.