Performance of ChatGPT-3.5 and ChatGPT-4o in the Japanese National Dental Examination.

IF 1.4 | CAS Tier 4 (Medicine) | JCR Q3: Dentistry, Oral Surgery & Medicine
Osamu Uehara, Tetsuro Morikawa, Fumiya Harada, Nodoka Sugiyama, Yuko Matsuki, Daichi Hiraki, Hinako Sakurai, Takashi Kado, Koki Yoshida, Yukie Murata, Hirofumi Matsuoka, Toshiyuki Nagasawa, Yasushi Furuichi, Yoshihiro Abiko, Hiroko Miura
{"title":"Performance of ChatGPT-3.5 and ChatGPT-4o in the Japanese National Dental Examination.","authors":"Osamu Uehara, Tetsuro Morikawa, Fumiya Harada, Nodoka Sugiyama, Yuko Matsuki, Daichi Hiraki, Hinako Sakurai, Takashi Kado, Koki Yoshida, Yukie Murata, Hirofumi Matsuoka, Toshiyuki Nagasawa, Yasushi Furuichi, Yoshihiro Abiko, Hiroko Miura","doi":"10.1002/jdd.13766","DOIUrl":null,"url":null,"abstract":"<p><strong>Objectives: </strong>In this study, we compared the performance of ChatGPT-3.5 to that of ChatGPT-4o in the context of the Japanese National Dental Examination, which assesses clinical reasoning skills and dental knowledge, to determine their potential usefulness in dental education.</p><p><strong>Methods: </strong>ChatGPT's performance was assessed using 1399 (55% of the exam) of 2520 questions from the Japanese National Dental Examinations (111-117). The 1121 excluded questions (45% of the exam) contained figures or tables that ChatGPT could not recognize. The questions were categorized into 18 different subjects based on dental specialty. Statistical analysis was performed using SPSS software, with McNemar's test applied to assess differences in performance.</p><p><strong>Results: </strong>A significant improvement was noted in the percentage of correct answers from ChatGPT-4o (84.63%) compared with those from ChatGPT-3.5 (45.46%), demonstrating enhanced reliability and subject knowledge. ChatGPT-4o consistently outperformed ChatGPT-3.5 across all dental subjects, with significant improvements in subjects such as oral surgery, pathology, pharmacology, and microbiology. Heatmap analysis revealed that ChatGPT-4o provided more stable and higher correct answer rates, especially for complex subjects.</p><p><strong>Conclusions: </strong>This study found that advanced natural language processing models, such as ChatGPT-4o, potentially have sufficiently advanced clinical reasoning skills and dental knowledge to function as a supplementary tool in dental education and exam preparation.</p>","PeriodicalId":50216,"journal":{"name":"Journal of Dental Education","volume":null,"pages":null},"PeriodicalIF":1.4000,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Dental Education","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1002/jdd.13766","RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"DENTISTRY, ORAL SURGERY & MEDICINE","Score":null,"Total":0}
引用次数: 0

Abstract

Objectives: In this study, we compared the performance of ChatGPT-3.5 to that of ChatGPT-4o in the context of the Japanese National Dental Examination, which assesses clinical reasoning skills and dental knowledge, to determine their potential usefulness in dental education.

Methods: ChatGPT's performance was assessed using 1399 (55%) of the 2520 questions from the 111th-117th Japanese National Dental Examinations. The 1121 excluded questions (45%) contained figures or tables that ChatGPT could not recognize. The questions were categorized into 18 subjects based on dental specialty. Statistical analysis was performed in SPSS, with McNemar's test used to assess differences in performance.
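
The authors ran the paired comparison in SPSS; as a rough, non-authoritative illustration, an equivalent McNemar's test could be sketched in Python with statsmodels. The 2x2 counts below are hypothetical (only the marginal accuracies are chosen to match the reported 45.46% and 84.63%); they are not the study's data.

```python
# Minimal sketch of a McNemar comparison of two models answering the same questions.
# This is NOT the authors' SPSS workflow; the cell counts are hypothetical.
from statsmodels.stats.contingency_tables import mcnemar

# 2x2 table of paired outcomes over the same 1399 questions:
# rows = ChatGPT-3.5 (correct, incorrect), cols = ChatGPT-4o (correct, incorrect)
table = [[600, 36],     # 3.5 correct & 4o correct, 3.5 correct & 4o incorrect
         [584, 179]]    # 3.5 incorrect & 4o correct, both incorrect

# Chi-square form with continuity correction is reasonable for counts this large
result = mcnemar(table, exact=False, correction=True)
print(f"McNemar statistic = {result.statistic:.2f}, p = {result.pvalue:.4g}")
```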

Results: The percentage of correct answers improved significantly with ChatGPT-4o (84.63%) compared with ChatGPT-3.5 (45.46%), demonstrating enhanced reliability and subject knowledge. ChatGPT-4o consistently outperformed ChatGPT-3.5 across all dental subjects, with significant improvements in subjects such as oral surgery, pathology, pharmacology, and microbiology. Heatmap analysis showed that ChatGPT-4o achieved more stable and higher correct answer rates, especially in complex subjects.
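
As a hedged sketch of what such a heatmap analysis might look like, the snippet below computes per-subject accuracy for both models and renders it as a heatmap. The file name, column names, and plotting library are assumptions for illustration, not details taken from the paper.

```python
# Illustrative per-subject accuracy heatmap (hypothetical data layout).
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Assumed layout: one row per question with columns
# "subject", "gpt35_correct", "gpt4o_correct" (0/1 indicators)
df = pd.read_csv("jnde_responses.csv")  # hypothetical file

# Mean of the 0/1 indicators per subject = accuracy; scale to percent
accuracy = (
    df.groupby("subject")[["gpt35_correct", "gpt4o_correct"]]
      .mean()
      .mul(100)
      .rename(columns={"gpt35_correct": "ChatGPT-3.5", "gpt4o_correct": "ChatGPT-4o"})
)

sns.heatmap(accuracy, annot=True, fmt=".1f", cmap="viridis",
            cbar_kws={"label": "Correct answers (%)"})
plt.title("Correct answer rate by subject")
plt.tight_layout()
plt.show()
```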

Conclusions: This study found that advanced natural language processing models, such as ChatGPT-4o, potentially have sufficiently advanced clinical reasoning skills and dental knowledge to function as a supplementary tool in dental education and exam preparation.

Source journal: Journal of Dental Education (Medicine - Dentistry, Oral Surgery & Medicine)
CiteScore: 3.50
Self-citation rate: 21.70%
Annual articles: 274
Review time: 3-8 weeks
Journal description: The Journal of Dental Education (JDE) is a peer-reviewed monthly journal that publishes a wide variety of educational and scientific research in dental, allied dental and advanced dental education. Published continuously by the American Dental Education Association since 1936 and internationally recognized as the premier journal for academic dentistry, the JDE publishes articles on such topics as curriculum reform, education research methods, innovative educational and assessment methodologies, faculty development, community-based dental education, student recruitment and admissions, professional and educational ethics, dental education around the world and systematic reviews of educational interest. The JDE is one of the top scholarly journals publishing the most important work in oral health education today; it celebrated its 80th anniversary in 2016.