How reliable is the artificial intelligence product large language model ChatGPT in orthodontics?
Kevser Kurt Demirsoy, Suleyman Kutalmış Buyuk, Tayyip Bicer
The Angle Orthodontist, published 2024-11-01. DOI: 10.2319/031224-207.1. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11493421/pdf/
Abstract
Objectives: To evaluate the reliability of information produced by the artificial intelligence-based program ChatGPT in terms of accuracy and relevance, as assessed by orthodontists, dental students, and individuals seeking orthodontic treatment.
Materials and methods: Frequently asked questions in four basic areas of orthodontics were prepared and posed to ChatGPT (Version 4.0), and the answers were evaluated by three groups: senior dental students, individuals seeking orthodontic treatment, and orthodontists. The four areas were clear aligners (CA), lingual orthodontics (LO), esthetic braces (EB), and temporomandibular disorders (TMD). Answers were rated with the Global Quality Scale (GQS) and the Quality Criteria for Consumer Health Information (DISCERN) instrument.
Results: The total mean DISCERN score for answers on CA was 51.7 ± 9.38 for students, 57.2 ± 10.73 for patients, and 47.4 ± 4.78 for orthodontists (P = .001). GQS scores for LO differed among groups: students (3.53 ± 0.78), patients (4.40 ± 0.72), and orthodontists (3.63 ± 0.72) (P < .001). In the intergroup comparison of ChatGPT answers about TMD on the DISCERN scale, the highest value was given by the patient group (57.83 ± 11.47) and the lowest by the orthodontist group (45.90 ± 11.84). For information quality about EB, GQS scores were >3 in all three groups (students: 3.50 ± 0.78; patients: 4.17 ± 0.87; orthodontists: 3.50 ± 0.82).
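The abstract does not name the statistical test behind the reported P values. As a purely illustrative sketch (not the authors' analysis, and using made-up ratings rather than the study's data), the following Python snippet shows how mean ± SD and a nonparametric intergroup comparison of total DISCERN scores from three rater groups might be computed:

# Illustrative only: hypothetical DISCERN ratings, not the study's data.
from statistics import mean, stdev
from scipy.stats import kruskal  # nonparametric test for >2 independent groups

# Hypothetical total DISCERN scores (possible range 15-75) given by each
# rater group to the same set of ChatGPT answers on clear aligners.
students      = [52, 48, 55, 50, 61, 47, 53, 49, 58, 44]
patients      = [60, 55, 66, 58, 49, 62, 57, 65, 54, 46]
orthodontists = [47, 45, 50, 44, 52, 48, 46, 51, 43, 49]

for name, scores in [("students", students),
                     ("patients", patients),
                     ("orthodontists", orthodontists)]:
    print(f"{name}: {mean(scores):.1f} ± {stdev(scores):.2f}")

# Compare the three groups; a small P value suggests the rater groups
# judged the same ChatGPT answers differently.
stat, p = kruskal(students, patients, orthodontists)
print(f"Kruskal-Wallis H = {stat:.2f}, P = {p:.3f}")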
Conclusions: ChatGPT has significant potential for patient information and education in orthodontics, provided it is further developed and necessary updates are made.