Evaluating GPT-4V’s performance in the Japanese National Dental Examination: A challenge explored

IF 3.4 | Zone 3 (Medicine) | JCR Q1 | Dentistry, Oral Surgery & Medicine
Masaki Morishita, Hikaru Fukuda, Kosuke Muraoka, Taiji Nakamura, Masanari Hayashi, Izumi Yoshioka, Kentaro Ono, Shuji Awano
Journal of Dental Sciences, Volume 19, Issue 3, Pages 1595-1600. Journal Article, published 2024-07-01. DOI: 10.1016/j.jds.2023.12.007
Full text: https://www.sciencedirect.com/science/article/pii/S1991790223003999
Citations: 0

Abstract

Evaluating GPT-4V’s performance in the Japanese national dental examination: A challenge explored

Background/purpose

Rapid advancements in AI technology have led to significant interest in its application across various fields, including medicine and dentistry. This study aimed to assess the capabilities of ChatGPT-4V with image recognition in answering image-based questions from the Japanese National Dental Examination (JNDE) to explore its potential as an educational support tool for dental students.

Materials and methods

The dataset used questions from the JNDE, which was conducted in January 2023, with a focus on image-related queries. ChatGPT-4V was utilized, and standardized prompts, question texts, and images were input. Data and statistical analyses were conducted using Qlik Sense® and GraphPad Prism.

Results

The overall correct response rate of ChatGPT-4V for image-based JNDE questions was 35.0%. The correct response rates were 57.1% for compulsory questions, 43.6% for general questions, and 28.6% for clinical practical questions. In specialties like Dental Anesthesiology and Endodontics, ChatGPT-4V achieved correct response rates above 70%, while response rates for Orthodontics and Oral Surgery were lower. A higher number of images in questions was correlated with lower accuracy, suggesting an impact of the number of images on correct and incorrect responses.
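
As a rough illustration of how correct response rates like those above could be tabulated, the following Python sketch groups graded answers by question category and by number of images. The records, field layout, and function name are hypothetical and invented for this example; they are not the study's data, and the authors report using Qlik Sense and GraphPad Prism rather than code of this kind.

```python
from collections import defaultdict

# Hypothetical grading records: (question category, number of images, answered correctly).
# These rows are invented for illustration and are NOT the study's data.
records = [
    ("compulsory", 1, True),
    ("compulsory", 1, False),
    ("general", 1, True),
    ("general", 2, False),
    ("clinical", 2, False),
    ("clinical", 3, False),
    ("clinical", 1, True),
    ("clinical", 4, False),
]


def correct_rate_by(rows, key_index):
    """Return {group: percentage of correct answers}, grouped by the field at key_index."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for row in rows:
        group = row[key_index]
        totals[group] += 1
        if row[2]:
            correct[group] += 1
    return {group: 100.0 * correct[group] / totals[group] for group in totals}


# Correct response rate per question category and per image count.
by_category = correct_rate_by(records, 0)
by_image_count = correct_rate_by(records, 1)

# Overall correct response rate across all records.
overall = 100.0 * sum(1 for row in records if row[2]) / len(records)

print(by_category)
print(by_image_count)
print(f"overall: {overall:.1f}%")
```

Grouping by image count in the same way as by category makes it easy to see whether accuracy drops as questions include more images, which is the kind of relationship the results describe.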

Conclusion

While innovative, ChatGPT-4V’s image recognition feature exhibited limitations, especially in handling image-intensive and complex clinical practical questions, and is not yet fully suitable as an educational support tool for dental students at its current stage. Further technological refinement and re-evaluation with a broader dataset are recommended.

Source journal
Journal of Dental Sciences (Medicine: Dentistry and Oral Surgery)
CiteScore: 5.10
Self-citation rate: 14.30%
Articles published: 348
Review time: 6 days
Journal description: The Journal of Dental Sciences (JDS), published quarterly, is the official, open access publication of the Association for Dental Sciences of the Republic of China (ADS-ROC). The predecessor of the JDS is the Chinese Dental Journal (CDJ), which was already covered by MEDLINE in 1988. As the CDJ continued to prove its importance in the region, the ADS-ROC decided to reach the international community by publishing an English-language journal, which led to the launch of the JDS in 2006. The JDS has been indexed in SCI Expanded since 2008. It is also indexed in Scopus, EMCare, ScienceDirect, and SIIC Data Bases. The topics covered by the JDS include all fields of basic and clinical dentistry. Manuscripts on endemic diseases in particular regions of any country, such as dental caries and periodontal diseases, as well as on oral pre-cancers, oral cancers, and oral submucous fibrosis related to betel nut chewing, are also considered for publication. The JDS also publishes articles on the efficacy of new treatment modalities for oral verrucous hyperplasia or early oral squamous cell carcinoma.