Busra Yilmaz, Emine Nur Kahraman, Michael T Brennan, Amardeep S Grewal, Aynur Aktas
Title: Accuracy of ChatGPT-4 Plus in Providing Information on Oral Cancer Management.
Journal: Oral Diseases (impact factor 2.9; JCR Q1, Dentistry, Oral Surgery & Medicine)
Published: 2025-10-03
DOI: 10.1111/odi.70110 (https://doi.org/10.1111/odi.70110)
Citation count: 0
Abstract
Objective: Artificial intelligence (AI)-driven large language models, such as Chat Generative Pre-Trained Transformer (ChatGPT)-4 Plus, are increasingly used for patient education and clinical decision support in oral oncology, although their accuracy in oral cancer (OC) management remains uncertain. This study evaluates the accuracy of ChatGPT-4 Plus responses to clinically relevant questions regarding OC diagnosis, treatment, recovery, and prevention.
Methods: A cross-sectional study assessed 65 clinically relevant OC-related questions using a paid ChatGPT-4 Plus subscription without modifications. Three oral medicine specialists and one radiation oncologist rated accuracy on a four-point scoring system. Interrater reliability was measured with the intraclass correlation coefficient (ICC), and chi-square tests were used for comparisons.
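As an illustration of the interrater reliability measure named in the Methods, the following is a minimal sketch of one common ICC variant. The abstract does not state which ICC form was used, so the choice of ICC(2,1) (two-way random effects, absolute agreement, single rater) is an assumption, and the function name `icc_2_1` and the ratings matrix are hypothetical and are not the study's data.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows, one per rated item (here: one per question),
    each row holding one score per rater. The ICC form is an assumption;
    the abstract does not specify it.
    """
    n = len(ratings)        # number of rated items
    k = len(ratings[0])     # number of raters
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    # Mean squares for items, raters, and residual error.
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical scores from four raters on five questions (not study data).
example_ratings = [
    [1, 1, 1, 1],
    [1, 1, 2, 1],
    [2, 2, 2, 3],
    [3, 3, 3, 3],
    [1, 2, 1, 1],
]
print(f"ICC(2,1) = {icc_2_1(example_ratings):.2f}")
```

Other ICC forms (for example, consistency rather than absolute agreement, or average-rater versions) use the same mean squares with a different ratio, so the reported range of 0.85 to 0.93 should not be assumed to correspond to this exact formula.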
Results: Among the 65 questions, 63% of responses were rated Score 1, and none were rated Score 4. Score 1 was most frequent in Recovery (72%), followed by Treatment (62%), Prevention (60%), and Diagnosis (55%). Score 2 and Score 3 responses combined were most frequent in Diagnosis (45%). Recovery had a significantly higher proportion of Score 1 responses than Diagnosis (p < 0.05); other comparisons were not significant. The ICC ranged from 0.85 to 0.93.
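A pairwise category comparison of this kind can be sketched as a chi-square test on a 2x2 table. This is a minimal sketch under stated assumptions: the abstract gives percentages but not the number of questions per category, so the counts below are hypothetical and do not reproduce the study's result. The Pearson statistic without continuity correction is also an assumption, and `chi_square_2x2` is an illustrative name.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]].

    Rows are the two groups being compared; columns are the two outcomes
    (for example, "Score 1" versus "any other score").
    Returns the statistic and its p-value at 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")

    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts (not study data):
# group A: 30 top-rated responses and 10 others; group B: 15 and 25.
chi2, p = chi_square_2x2(30, 10, 15, 25)
print(f"chi2 = {chi2:.3f}, p = {p:.4f}")
```

With small expected cell counts, which are plausible when 65 questions are split across four categories, a continuity correction or Fisher's exact test is often preferred; the abstract does not say whether either was applied.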
Conclusions: ChatGPT-4 Plus provided accurate responses to clinically relevant OC-related questions, particularly regarding recovery. However, diagnostic inconsistencies highlight the need for clinician oversight before integrating AI into practice.
About the journal:
Oral Diseases is a multidisciplinary and international journal with a focus on head and neck disorders, edited by leaders in the field, Professor Giovanni Lodi (Editor-in-Chief, Milan, Italy), Professor Stefano Petti (Deputy Editor, Rome, Italy) and Associate Professor Gulshan Sunavala-Dossabhoy (Deputy Editor, Shreveport, LA, USA). The journal is pre-eminent in oral medicine. Oral Diseases specifically strives to link often-isolated areas of dentistry and medicine through broad-based scholarship that includes well-designed and controlled clinical research, analytical epidemiology, and the translation of basic science in pre-clinical studies. The journal typically publishes articles relevant to many related medical specialties including especially dermatology, gastroenterology, hematology, immunology, infectious diseases, neuropsychiatry, oncology and otolaryngology. The essential requirement is that all submitted research is hypothesis-driven, with significant positive and negative results both welcomed. Equal publication emphasis is placed on etiology, pathogenesis, diagnosis, prevention and treatment.