Maurice Salem, Duygu Karasan, Marta Revilla-León, Abdul B Barmak, Irena Sailer
{"title":"Performance of Artificial Intelligence-Based Chatbots (ChatGPT-3.5 and ChatGPT-4.0) Answering the International Team of Implantology Exam Questions.","authors":"Maurice Salem, Duygu Karasan, Marta Revilla-León, Abdul B Barmak, Irena Sailer","doi":"10.1111/jerd.13496","DOIUrl":null,"url":null,"abstract":"<p><strong>Aim: </strong>This study aims to compare the performance of licensed dentists and two versions of ChatGPT (v.3.5 and v.4.0) in answering the International Team for Implantology (ITI) certification exam questions in implant dentistry.</p><p><strong>Materials and methods: </strong>The study involved 93 licensed dentists and the two chatbot versions answering 48 text-only multiple-choice questions from the ITI implant certification exam. The 48 questions passed through ChatGPT-3.5 and ChatGPT-4 93 times, and then the data were collected on an Excel sheet (Excel version 2024, Microsoft). Pearson correlation matrix was used to analyze the linear relationship among the tested groups. Additionally, inter- and intraoperator reliability was analyzed using Cronbach's alpha coefficient. One-way Welch's ANOVA and Tukey post-hoc tests were used to determine any significant differences among the groups tested on the exam scores obtained.</p><p><strong>Results: </strong>Licensed dentists obtained a higher score on the test compared to ChatGPT-3.5, while ChatGPT-4.0 and licensed dentists performed similarly. ChatGPT 4.0 resulted in significantly higher scores than ChatGPT-3.5. All groups were able to obtain scores high enough to pass the exam.</p><p><strong>Conclusion: </strong>Both ChatGPT-3.5 and ChatGPT-4.0 are powerful tools that can assist and guide dental licensed dentists and patients. ChatGPT-4.0 showed better results than ChatGPT-3.5; however, more studies should be conducted including new chatbots that are more sophisticated, with the ability to interpret videos and images-chatbots that were not available when this study was performed.</p>","PeriodicalId":15988,"journal":{"name":"Journal of Esthetic and Restorative Dentistry","volume":" ","pages":""},"PeriodicalIF":3.2000,"publicationDate":"2025-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Esthetic and Restorative Dentistry","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1111/jerd.13496","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"DENTISTRY, ORAL SURGERY & MEDICINE","Score":null,"Total":0}
Citations: 0
Abstract
Aim: This study aims to compare the performance of licensed dentists and two versions of ChatGPT (v.3.5 and v.4.0) in answering the International Team for Implantology (ITI) certification exam questions in implant dentistry.
Materials and methods: The study involved 93 licensed dentists and the two chatbot versions answering 48 text-only multiple-choice questions from the ITI implant certification exam. The 48 questions were submitted to ChatGPT-3.5 and ChatGPT-4.0 93 times each, and the responses were collected in an Excel sheet (Excel version 2024, Microsoft). A Pearson correlation matrix was used to analyze the linear relationships among the tested groups. Additionally, inter- and intra-operator reliability was analyzed using Cronbach's alpha coefficient. Welch's one-way ANOVA and Tukey post hoc tests were used to identify significant differences in exam scores among the groups.
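As a rough illustration of this analysis pipeline (not the authors' actual code), the sketch below re-creates the four statistical steps in Python on invented score data; the group names, score distributions, and random values are hypothetical, and only the methods named in the abstract (Pearson correlation, Cronbach's alpha, Welch's ANOVA, Tukey HSD) come from the source.

```python
# Hypothetical re-creation of the statistical workflow described above.
# All numbers below are simulated for illustration only.
import numpy as np
import pandas as pd
from scipy import stats
from statsmodels.stats.oneway import anova_oneway

rng = np.random.default_rng(0)

# Simulated exam scores for 93 attempts per group (hypothetical values).
scores = pd.DataFrame({
    "dentists":  rng.normal(36, 4, 93).round(),
    "chatgpt35": rng.normal(32, 4, 93).round(),
    "chatgpt40": rng.normal(37, 4, 93).round(),
})

# 1. Pearson correlation matrix: linear relationships among the groups.
print(scores.corr(method="pearson"))

# 2. Cronbach's alpha: internal consistency across the repeated measurements.
def cronbach_alpha(items: pd.DataFrame) -> float:
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

print("Cronbach's alpha:", cronbach_alpha(scores))

# 3. Welch's one-way ANOVA: compares group means without assuming
#    equal variances across groups.
welch = anova_oneway(
    [scores[c] for c in scores.columns],
    use_var="unequal",
    welch_correction=True,
)
print(f"Welch ANOVA: F = {welch.statistic:.3f}, p = {welch.pvalue:.4f}")

# 4. Tukey HSD post hoc test: identifies which pairs of groups differ.
print(stats.tukey_hsd(scores["dentists"], scores["chatgpt35"], scores["chatgpt40"]))
```

Welch's ANOVA is the natural choice here because human and chatbot score distributions need not share a variance; the Tukey test then localizes any overall difference to specific group pairs.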
Results: Licensed dentists scored higher on the test than ChatGPT-3.5, while ChatGPT-4.0 and licensed dentists performed similarly. ChatGPT-4.0 scored significantly higher than ChatGPT-3.5. All groups achieved scores high enough to pass the exam.
Conclusion: Both ChatGPT-3.5 and ChatGPT-4.0 are powerful tools that can assist and guide licensed dentists and patients. ChatGPT-4.0 showed better results than ChatGPT-3.5; however, further studies should include newer, more sophisticated chatbots capable of interpreting videos and images, which were not available when this study was performed.
Journal Introduction:
The Journal of Esthetic and Restorative Dentistry (JERD) is the longest-standing peer-reviewed journal devoted solely to advancing the knowledge and practice of esthetic dentistry. Its goal is to provide the very latest evidence-based information in the realm of contemporary interdisciplinary esthetic dentistry through high-quality clinical papers, sound research reports, and educational features.
The range of topics covered in the journal includes:
- Interdisciplinary esthetic concepts
- Implants
- Conservative adhesive restorations
- Tooth whitening
- Prosthodontic materials and techniques
- Dental materials
- Orthodontic, periodontal and endodontic esthetics
- Esthetics related research
- Innovations in esthetics