Assessment of ChatGPT's Performance on the ACP 2024 National Prosthodontics Resident Exam (NPRE)

IF 1.9 · Education, Zone 4 · JCR Q3, Dentistry, Oral Surgery & Medicine
Abdulrahman Almalki, Ramzi O Althubaitiy, Fahad Alkhtani, Evanthia Anadioti, Heba Wageh Abozaed
Citations: 0

Abstract


Purpose: To evaluate the performance of ChatGPT on the National Prosthodontics Resident Exam (NPRE).

Methods: Two separate OpenAI accounts were used for ChatGPT 3.5 and ChatGPT 4.0, each managed by an independent examiner. The dataset was sourced from the American College of Prosthodontics (ACP) 2024 National Prosthodontics Resident Exam (NPRE), which includes 150 multiple-choice board-style questions on various prosthodontic topics. Questions were entered exactly as they appeared in the NPRE, and responses were recorded as correct or incorrect. Accuracy was assessed using a two-tailed t-test, with statistical significance set at p < 0.05. After the study was completed, the OpenAI accounts were deleted to ensure data privacy and security.

Results: ChatGPT 3.5 correctly answered 84 out of 150 questions, achieving a score of 56.0%, while ChatGPT 4 significantly outperformed it with a score of 73.7%, correctly answering 109 out of 150 questions (p < 0.001). In specific subjects, ChatGPT 4 consistently scored higher, with significant improvements in Basic Science (71.2% vs. 61.3%), Implant Surgery (67.5% vs. 41.2%), Diagnosis and Treatment Planning (66.6% vs. 53.4%) and Fixed Prosthodontics (86.9% vs. 62.5%). The highest scores for both versions were in Dental Materials, with ChatGPT 4 achieving 91.6% compared to ChatGPT 3.5's 73.1%.
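The abstract reports the overall comparison only through its p-value. As a rough sanity check, the 84/150 vs. 109/150 split can be compared with a pooled two-proportion z-test, which is closely related to a t-test on the 0/1 response vectors. The sketch below uses only the Python standard library; it is an illustration of the kind of comparison reported, not the authors' actual analysis code.

```python
import math

def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int):
    """Pooled two-proportion z-test; returns (z, two-tailed p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    # Two-tailed p-value from the standard normal CDF (via erf)
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# ChatGPT 3.5: 84/150 correct; ChatGPT 4: 109/150 correct
z, p = two_proportion_z_test(84, 150, 109, 150)
print(f"z = {z:.2f}, p = {p:.4f}")
```

The exact p-value depends on the test specification; an unpooled variant, or a t-test run directly on the per-question 0/1 vectors as the authors describe, will give a slightly different result.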

Conclusion: ChatGPT 4.0 shows promising potential as an educational tool for prosthodontics residents by effectively addressing board-style questions. However, due to a significant presence of misinformation in ChatGPT's current prosthodontics knowledge base, residents should exercise caution and supplement AI-generated content with evidence-based information from credible sources to ensure accuracy and reliability.

Source journal metrics: CiteScore 4.10 · Self-citation rate 16.70% · Annual publications 127 · Review time 6-12 weeks
Journal description: The aim of the European Journal of Dental Education is to publish original topical and review articles of the highest quality in the field of dental education. The Journal seeks to disseminate widely the latest information on curriculum development, teaching methodologies, assessment techniques and quality assurance in the fields of dental undergraduate and postgraduate education and dental auxiliary personnel training. The scope includes the dental educational aspects of the basic medical sciences, the behavioural sciences, the interface with medical education, information technology and distance learning, and educational audit. Papers embodying the results of high-quality educational research of relevance to dentistry are particularly encouraged, as are evidence-based reports of novel and established educational programmes and their outcomes.