Performance of ChatGPT in answering the oral pathology questions of various types or subjects from Taiwan National Dental Licensing Examinations
Authors: Yu-Hsueh Wu, Kai-Yun Tso, Chun-Pin Chiang
Journal: Journal of Dental Sciences, Volume 20, Issue 3, Pages 1709-1715 (JCR Q1, Dentistry, Oral Surgery & Medicine; impact factor 3.1)
Published: 2025-04-05
DOI: 10.1016/j.jds.2025.03.030
URL: https://www.sciencedirect.com/science/article/pii/S1991790225001035
Abstract
Background/purpose
ChatGPT, a large language model, can provide instant and personalized solutions in a conversational format. Our study aimed to assess the potential of ChatGPT-4, ChatGPT-4o without a prompt (ChatGPT-4o-P-), and ChatGPT-4o with a prompt (ChatGPT-4o-P+) for helping dental students study oral pathology (OP) by evaluating their performance in answering OP multiple-choice questions (MCQs) of various types and subjects.
Materials and methods
A total of 280 OP MCQs were collected from the Taiwan National Dental Licensing Examinations. The ChatGPT-4, ChatGPT-4o-P-, and ChatGPT-4o-P+ chatbots were instructed to answer these OP MCQs of various types and subjects.
Results
ChatGPT-4o-P+ achieved the highest overall accuracy rate (AR) of 90.0%, slightly outperforming ChatGPT-4o-P- (88.6% AR) and significantly exceeding ChatGPT-4 (79.6% AR, P < 0.001). For odd-one-out questions, the AR of ChatGPT-4 (77.2%) was significantly lower than that of ChatGPT-4o-P- (91.3%, P = 0.015) and ChatGPT-4o-P+ (92.4%, P = 0.008). However, there was no significant difference in AR among the three models when answering the image-based and case-based questions. Of the 11 single-disease OP subjects, all three models achieved a 100% AR in three subjects; ChatGPT-4o-P+ outperformed ChatGPT-4 and ChatGPT-4o-P- in another three subjects; ChatGPT-4o-P- was superior to ChatGPT-4 and ChatGPT-4o-P+ in a further three subjects; and ChatGPT-4o-P- and ChatGPT-4o-P+ performed equally, and both better than ChatGPT-4, in the remaining two subjects.
Conclusion
Overall, ChatGPT-4o-P+ performed better than ChatGPT-4o-P- and ChatGPT-4 in answering the OP MCQs.
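The overall accuracy rates in the Results can be turned back into counts of correct answers, which makes the comparison between models easier to check. The sketch below does this for the 280 questions and then compares ChatGPT-4o-P+ with ChatGPT-4. The abstract does not name the statistical test used, so the Pearson chi-square test here (1 degree of freedom, no continuity correction, answer sets treated as independent samples) is an assumption, and the function and variable names are illustrative only.

```python
import math

TOTAL_QUESTIONS = 280  # number of OP MCQs reported in the abstract

# Overall accuracy rates (AR) reported in the abstract, as fractions.
REPORTED_AR = {
    "ChatGPT-4": 0.796,
    "ChatGPT-4o-P-": 0.886,
    "ChatGPT-4o-P+": 0.900,
}


def correct_count(rate, total=TOTAL_QUESTIONS):
    """Recover the whole number of correct answers implied by a rounded rate."""
    return round(rate * total)


def chi_square_2x2(correct_a, correct_b, total=TOTAL_QUESTIONS):
    """Pearson chi-square (1 df, no continuity correction) for two models.

    Returns (statistic, p_value). Treating the two answer sets as independent
    samples is an assumption; the abstract does not name its test.
    """
    wrong_a, wrong_b = total - correct_a, total - correct_b
    n = 2 * total
    numerator = n * (correct_a * wrong_b - wrong_a * correct_b) ** 2
    denominator = total * total * (correct_a + correct_b) * (wrong_a + wrong_b)
    statistic = numerator / denominator
    # Survival function of a chi-square variable with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


counts = {model: correct_count(rate) for model, rate in REPORTED_AR.items()}
stat, p = chi_square_2x2(counts["ChatGPT-4o-P+"], counts["ChatGPT-4"])
print(counts)
print(f"chi-square = {stat:.2f}, p = {p:.4f}")
```

Under these assumptions the rates correspond to 223, 248, and 252 correct answers out of 280, and the p-value for ChatGPT-4o-P+ versus ChatGPT-4 falls below 0.001, consistent with the reported significance. Because the same questions were put to every model, a paired test such as McNemar's could be more appropriate, but it requires per-question results that the abstract does not provide.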
About the journal:
The Journal of Dental Sciences (JDS), published quarterly, is the official and open access publication of the Association for Dental Sciences of the Republic of China (ADS-ROC). The predecessor of the JDS is the Chinese Dental Journal (CDJ), which was already covered by MEDLINE in 1988. As the CDJ continued to prove its importance in the region, the ADS-ROC decided to reach the international community by publishing an English-language journal, which led to the birth of the JDS in 2006. The JDS has been indexed in the SCI Expanded since 2008. It is also indexed in Scopus, EMCare, ScienceDirect, and SIIC Data Bases.
The topics covered by the JDS include all fields of basic and clinical dentistry. Manuscripts focusing on certain endemic diseases, such as dental caries and periodontal diseases in particular regions of any country, as well as oral pre-cancers, oral cancers, and oral submucous fibrosis related to the betel nut chewing habit, are also considered for publication. In addition, the JDS publishes articles on the efficacy of new treatment modalities for oral verrucous hyperplasia or early oral squamous cell carcinoma.