{"title":"Evaluating GPT-4V’s performance in the Japanese national dental examination: A challenge explored","authors":"Masaki Morishita , Hikaru Fukuda , Kosuke Muraoka , Taiji Nakamura , Masanari Hayashi , Izumi Yoshioka , Kentaro Ono , Shuji Awano","doi":"10.1016/j.jds.2023.12.007","DOIUrl":null,"url":null,"abstract":"<div><h3>Background/purpose</h3><p>Rapid advancements in AI technology have led to significant interest in its application across various fields, including medicine and dentistry. This study aimed to assess the capabilities of ChatGPT-4V with image recognition in answering image-based questions from the Japanese National Dental Examination (JNDE) to explore its potential as an educational support tool for dental students.</p></div><div><h3>Materials and methods</h3><p>The dataset used questions from the JNDE, which was conducted in January 2023, with a focus on image-related queries. ChatGPT-4V was utilized, and standardized prompts, question texts, and images were input. Data and statistical analyses were conducted using Qlik Sense® and GraphPad Prism.</p></div><div><h3>Results</h3><p>The overall correct response rate of ChatGPT-4V for image-based JNDE questions was 35.0 %. The correct response rates were 57.1 % for compulsory questions, 43.6 % for general questions, and 28.6 % for clinical practical questions. In specialties like Dental Anesthesiology and Endodontics, ChatGPT-4V achieved correct response rates above 70 %, while response rates for Orthodontics and Oral Surgery were lower. A higher number of images in questions was correlated with lower accuracy, suggesting an impact of the number of images on correct and incorrect responses.</p></div><div><h3>Conclusion</h3><p>While innovative, ChatGPT-4V’s image recognition feature exhibited limitations, especially in handling image-intensive and complex clinical practical questions, and is not yet fully suitable as an educational support tool for dental students at its current stage. Further technological refinement and re-evaluation with a broader dataset are recommended.</p></div>","PeriodicalId":15583,"journal":{"name":"Journal of Dental Sciences","volume":"19 3","pages":"Pages 1595-1600"},"PeriodicalIF":3.4000,"publicationDate":"2024-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S1991790223003999/pdfft?md5=db38e189f0707713f0742868bb8f73ce&pid=1-s2.0-S1991790223003999-main.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Dental Sciences","FirstCategoryId":"3","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1991790223003999","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"DENTISTRY, ORAL SURGERY & MEDICINE","Score":null,"Total":0}
引用次数: 0
Abstract
Background/purpose
Rapid advancements in AI technology have led to significant interest in its application across various fields, including medicine and dentistry. This study aimed to assess the capabilities of ChatGPT-4V with image recognition in answering image-based questions from the Japanese National Dental Examination (JNDE) to explore its potential as an educational support tool for dental students.
Materials and methods
The dataset used questions from the JNDE, which was conducted in January 2023, with a focus on image-related queries. ChatGPT-4V was utilized, and standardized prompts, question texts, and images were input. Data and statistical analyses were conducted using Qlik Sense® and GraphPad Prism.
Results
The overall correct response rate of ChatGPT-4V for image-based JNDE questions was 35.0 %. The correct response rates were 57.1 % for compulsory questions, 43.6 % for general questions, and 28.6 % for clinical practical questions. In specialties like Dental Anesthesiology and Endodontics, ChatGPT-4V achieved correct response rates above 70 %, while response rates for Orthodontics and Oral Surgery were lower. A higher number of images in questions was correlated with lower accuracy, suggesting an impact of the number of images on correct and incorrect responses.
Conclusion
While innovative, ChatGPT-4V’s image recognition feature exhibited limitations, especially in handling image-intensive and complex clinical practical questions, and is not yet fully suitable as an educational support tool for dental students at its current stage. Further technological refinement and re-evaluation with a broader dataset are recommended.
期刊介绍:
he Journal of Dental Sciences (JDS), published quarterly, is the official and open access publication of the Association for Dental Sciences of the Republic of China (ADS-ROC). The precedent journal of the JDS is the Chinese Dental Journal (CDJ) which had already been covered by MEDLINE in 1988. As the CDJ continued to prove its importance in the region, the ADS-ROC decided to move to the international community by publishing an English journal. Hence, the birth of the JDS in 2006. The JDS is indexed in the SCI Expanded since 2008. It is also indexed in Scopus, and EMCare, ScienceDirect, SIIC Data Bases.
The topics covered by the JDS include all fields of basic and clinical dentistry. Some manuscripts focusing on the study of certain endemic diseases such as dental caries and periodontal diseases in particular regions of any country as well as oral pre-cancers, oral cancers, and oral submucous fibrosis related to betel nut chewing habit are also considered for publication. Besides, the JDS also publishes articles about the efficacy of a new treatment modality on oral verrucous hyperplasia or early oral squamous cell carcinoma.