Abdulrahman Almalki, Ramzi O Althubaitiy, Fahad Alkhtani, Evanthia Anadioti, Heba Wageh Abozaed
{"title":"Assessment of ChatGPT's Performance on the ACP 2024 National Prosthodontics Resident Exam (NPRE).","authors":"Abdulrahman Almalki, Ramzi O Althubaitiy, Fahad Alkhtani, Evanthia Anadioti, Heba Wageh Abozaed","doi":"10.1111/eje.70045","DOIUrl":null,"url":null,"abstract":"<p><strong>Purpose: </strong>To evaluate the performance of ChatGPT on the National Prosthodontics Resident Exam (NPRE).</p><p><strong>Methods: </strong>Two separate OpenAI accounts were used for ChatGPT 3.5 and ChatGPT 4.0, each managed by independent examiners. The dataset was sourced from the American College of Prosthodontics (ACP) 2024 National Prosthodontics Resident Exam (NPRE), which includes 150 multiple-choice board-style questions on various prosthodontic topics. Questions were inputted as they appeared in the NPRE, and responses were recorded as correct or incorrect. Accuracy was assessed using a two-tailed t-test, with statistical significance set at p < 0.05. After the study was completed, OpenAI accounts were deleted to ensure data privacy and security.</p><p><strong>Results: </strong>ChatGPT 3.5 correctly answered 84 out of 150 questions, achieving a score of 56.0%; while ChatGPT 4 significantly outperformed it with a score of 73.7%, correctly answering 109 out of 150 questions (p < 0.001). In specific subjects, ChatGPT 4 consistently scored higher, with significant improvements in Basic Science (71.2% vs. 61.3%), Implant Surgery (67.5% vs. 41.2%), Diagnosis and Treatment Planning (66.6% vs. 53.4%) and Fixed Prosthodontics (86.9% vs. 62.5%). The highest scores for both versions were in Dental Materials, with ChatGPT 4 achieving 91.6% compared to ChatGPT 3.5's 73.1%.</p><p><strong>Conclusion: </strong>ChatGPT 4.0 shows promising potential as an educational tool for prosthodontics residents by effectively addressing board-style questions. However, due to a significant presence of misinformation in ChatGPT's current prosthodontics knowledge base, residents should exercise caution and supplement AI-generated content with evidence-based information from credible sources to ensure accuracy and reliability.</p>","PeriodicalId":50488,"journal":{"name":"European Journal of Dental Education","volume":" ","pages":""},"PeriodicalIF":1.9000,"publicationDate":"2025-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"European Journal of Dental Education","FirstCategoryId":"95","ListUrlMain":"https://doi.org/10.1111/eje.70045","RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"DENTISTRY, ORAL SURGERY & MEDICINE","Score":null,"Total":0}
引用次数: 0
Abstract
Purpose: To evaluate the performance of ChatGPT on the National Prosthodontics Resident Exam (NPRE).
Methods: Two separate OpenAI accounts were used for ChatGPT 3.5 and ChatGPT 4.0, each managed by an independent examiner. The dataset was sourced from the American College of Prosthodontists (ACP) 2024 National Prosthodontics Resident Exam (NPRE), which includes 150 multiple-choice board-style questions on various prosthodontic topics. Questions were entered exactly as they appeared in the NPRE, and responses were recorded as correct or incorrect. Accuracy was assessed using a two-tailed t-test, with statistical significance set at p < 0.05. After the study was completed, the OpenAI accounts were deleted to ensure data privacy and security.
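For readers wanting to see how such a comparison could be reproduced, the following is a minimal sketch (an illustrative assumption, not the authors' actual analysis code) that reconstructs binary correct/incorrect vectors from the reported counts and applies a two-tailed independent-samples t-test, as described in the Methods, using Python with NumPy and SciPy.

```python
# Illustrative sketch only: rebuilds per-question correct/incorrect vectors
# from the reported counts and runs a two-tailed t-test mirroring the
# analysis described in the Methods. Not the authors' original code.
import numpy as np
from scipy import stats

TOTAL_QUESTIONS = 150
gpt35_correct = 84    # reported accuracy: 56.0%
gpt4_correct = 109    # reported accuracy: 73.7%

# 1 = correct answer, 0 = incorrect answer
gpt35_responses = np.array([1] * gpt35_correct + [0] * (TOTAL_QUESTIONS - gpt35_correct))
gpt4_responses = np.array([1] * gpt4_correct + [0] * (TOTAL_QUESTIONS - gpt4_correct))

t_stat, p_value = stats.ttest_ind(gpt4_responses, gpt35_responses)
print(f"ChatGPT 3.5 accuracy: {gpt35_responses.mean():.1%}")
print(f"ChatGPT 4 accuracy:   {gpt4_responses.mean():.1%}")
print(f"t = {t_stat:.2f}, two-tailed p = {p_value:.4f}")  # significant at p < 0.05
```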
Results: ChatGPT 3.5 correctly answered 84 of 150 questions (56.0%), while ChatGPT 4 significantly outperformed it, correctly answering 109 of 150 questions (73.7%; p < 0.001). Across individual subject areas, ChatGPT 4 consistently scored higher, with significant improvements in Basic Science (71.2% vs. 61.3%), Implant Surgery (67.5% vs. 41.2%), Diagnosis and Treatment Planning (66.6% vs. 53.4%) and Fixed Prosthodontics (86.9% vs. 62.5%). Both versions scored highest in Dental Materials, where ChatGPT 4 achieved 91.6% compared with ChatGPT 3.5's 73.1%.
Conclusion: ChatGPT 4.0 shows promising potential as an educational tool for prosthodontics residents by effectively addressing board-style questions. However, due to a significant presence of misinformation in ChatGPT's current prosthodontics knowledge base, residents should exercise caution and supplement AI-generated content with evidence-based information from credible sources to ensure accuracy and reliability.
Journal Introduction:
The aim of the European Journal of Dental Education is to publish original topical and review articles of the highest quality in the field of dental education. The Journal seeks to disseminate widely the latest information on curriculum development, teaching methodologies, assessment techniques and quality assurance in the fields of dental undergraduate and postgraduate education and dental auxiliary personnel training. The scope includes the dental educational aspects of the basic medical sciences, the behavioural sciences, the interface with medical education, information technology and distance learning, and educational audit. Papers embodying the results of high-quality educational research of relevance to dentistry are particularly encouraged, as are evidence-based reports of novel and established educational programmes and their outcomes.