{"title":"Impact of language and question types on ChatGPT-4o's performance in answering oral pathology questions from Taiwan National Dental Licensing Examinations","authors":"Yu-Hsueh Wu , Kai-Yun Tso , Chun-Pin Chiang","doi":"10.1016/j.jds.2025.07.010","DOIUrl":null,"url":null,"abstract":"<div><h3>Background/purpose</h3><div>ChatGPT has been utilized in medical and dental education, but its performance is potentially influenced by factors like language, question types, and content complexity. This study aimed to assess how English translation and question types affect ChatGPT-4o's accuracy in answering English-translated oral pathology (OP) multiple choice questions (MCQs).</div></div><div><h3>Materials and methods</h3><div>A total of 280 OP MCQs were collected from Taiwan National Dental Licensing Examinations and English-translated as a testing set for ChatGPT-4o. The mean overall accuracy rates (ARs) for English-translated and non-translated MCQs were compared by the dependent <em>t</em>-test. The difference in ARs between English-translated and non-translated OP MCQs within each of three question types (image-based, case-based, and odd-one-out questions) was assessed by chi-square test. The binary logistic regression was used to determine which type of question was more likely to be answered incorrectly.</div></div><div><h3>Results</h3><div>ChatGPT-4o showed significantly higher mean overall AR (93.2 ± 5.7 %) for English-translated MCQs than for non-translated MCQs (88.6 ± 6.5 %, <em>P</em> < 0.001). There were no significant differences in the ARs between English-translated and non-translated MCQs within each question type. The binary logistic regression revealed that, within the English-translated condition, image-based questions were significantly more likely to be answered incorrectly (odds ratio = 9.085, <em>P</em> = 0.001).</div></div><div><h3>Conclusion</h3><div>Translation of exam questions into English significantly improved ChatGPT-4o's overall performance. Error pattern analysis confirmed that image-based questions were more likely to result in incorrect answers, reflecting the model's current limitations in visual reasoning. Nevertheless, ChatGPT-4o still demonstrated its strong potential as an educational support tool.</div></div>","PeriodicalId":15583,"journal":{"name":"Journal of Dental Sciences","volume":"20 4","pages":"Pages 2176-2180"},"PeriodicalIF":3.1000,"publicationDate":"2025-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Dental Sciences","FirstCategoryId":"3","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1991790225002491","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"DENTISTRY, ORAL SURGERY & MEDICINE","Score":null,"Total":0}
Impact of language and question types on ChatGPT-4o's performance in answering oral pathology questions from Taiwan National Dental Licensing Examinations
Background/purpose
ChatGPT has been utilized in medical and dental education, but its performance is potentially influenced by factors such as language, question type, and content complexity. This study aimed to assess how English translation and question type affect ChatGPT-4o's accuracy in answering oral pathology (OP) multiple-choice questions (MCQs).
Materials and methods
A total of 280 OP MCQs were collected from the Taiwan National Dental Licensing Examinations and translated into English to form a testing set for ChatGPT-4o. The mean overall accuracy rates (ARs) for English-translated and non-translated MCQs were compared using the dependent (paired) t-test. The difference in ARs between English-translated and non-translated OP MCQs within each of three question types (image-based, case-based, and odd-one-out questions) was assessed by the chi-square test. Binary logistic regression was used to determine which question type was more likely to be answered incorrectly.
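The abstract does not include code; the following is a minimal sketch of the described analysis pipeline in Python using scipy and statsmodels. All accuracy values, contingency counts, and variable names are hypothetical placeholders, not the study's actual data.

import numpy as np
from scipy import stats
import statsmodels.api as sm

# Hypothetical per-run overall accuracy rates (%) across repeated test runs
ar_translated = np.array([94.3, 92.9, 91.4, 95.0, 92.5])
ar_original = np.array([89.3, 87.5, 88.2, 90.0, 88.9])

# Dependent (paired) t-test comparing mean overall accuracy rates
t_stat, p_val = stats.ttest_rel(ar_translated, ar_original)

# Chi-square test on a 2x2 table (correct vs. incorrect) for one
# question type, e.g. image-based questions (counts are invented)
table = np.array([[25, 5],   # English-translated: correct, incorrect
                  [23, 7]])  # non-translated:     correct, incorrect
chi2, p_chi, dof, expected = stats.chi2_contingency(table)

# Binary logistic regression: outcome 1 = incorrect, 0 = correct;
# predictors are dummy-coded question types (random stand-in data)
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(280, 2))  # columns: image-based, case-based
y = rng.integers(0, 2, size=280)
fit = sm.Logit(y, sm.add_constant(X)).fit(disp=0)
odds_ratios = np.exp(fit.params)  # exponentiated coefficients

This mirrors the structure of the analyses described above (paired t-test, chi-square test, and logistic regression with exponentiated coefficients as odds ratios), though not the study's data.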
Results
ChatGPT-4o showed a significantly higher mean overall AR for English-translated MCQs (93.2 ± 5.7%) than for non-translated MCQs (88.6 ± 6.5%, P < 0.001). There were no significant differences in ARs between English-translated and non-translated MCQs within each question type. Binary logistic regression revealed that, within the English-translated condition, image-based questions were significantly more likely to be answered incorrectly (odds ratio = 9.085, P = 0.001).
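As an interpretive note (the abstract does not state the reference category or the underlying counts), the reported odds ratio compares the odds of an incorrect answer for image-based questions with those for the reference question type:

\[
\mathrm{OR} = \frac{P(\text{incorrect} \mid \text{image-based}) \,/\, P(\text{correct} \mid \text{image-based})}{P(\text{incorrect} \mid \text{reference}) \,/\, P(\text{correct} \mid \text{reference})} = 9.085
\]

That is, the odds of ChatGPT-4o answering an image-based question incorrectly were roughly nine times the odds for the reference question type.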
Conclusion
Translation of examination questions into English significantly improved ChatGPT-4o's overall performance. Error-pattern analysis confirmed that image-based questions were more likely to yield incorrect answers, reflecting the model's current limitations in visual reasoning. Nevertheless, ChatGPT-4o demonstrated strong potential as an educational support tool.
Journal introduction:
The Journal of Dental Sciences (JDS), published quarterly, is the official and open-access publication of the Association for Dental Sciences of the Republic of China (ADS-ROC). The predecessor of the JDS is the Chinese Dental Journal (CDJ), which had already been covered by MEDLINE in 1988. As the CDJ continued to prove its importance in the region, the ADS-ROC decided to reach the international community by publishing an English-language journal; hence the birth of the JDS in 2006. The JDS has been indexed in the SCI Expanded since 2008 and is also indexed in Scopus, EMCare, ScienceDirect, and the SIIC Data Bases.
The topics covered by the JDS include all fields of basic and clinical dentistry. Manuscripts focusing on endemic diseases, such as dental caries and periodontal diseases in particular regions of any country, as well as oral pre-cancers, oral cancers, and oral submucous fibrosis related to the betel nut chewing habit, are also considered for publication. In addition, the JDS publishes articles on the efficacy of new treatment modalities for oral verrucous hyperplasia and early oral squamous cell carcinoma.