How reliable is the artificial intelligence product large language model ChatGPT in orthodontics?

The Angle Orthodontist. Pub Date: 2024-11-01. DOI: 10.2319/031224-207.1

Kevser Kurt Demirsoy, Suleyman Kutalmış Buyuk, Tayyip Bicer

Pages: 602-607. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11493421/pdf/


Abstract

Objectives: To evaluate the reliability of information produced by the artificial intelligence-based program ChatGPT in terms of accuracy and relevance, as assessed by orthodontists, dental students, and individuals seeking orthodontic treatment.

Materials and methods: Frequently asked questions and questions of general curiosity in four basic areas of orthodontics were prepared and submitted to ChatGPT (Version 4.0), and the answers were evaluated by three groups (senior dental students, individuals seeking orthodontic treatment, and orthodontists). The four areas were clear aligners (CA), lingual orthodontics (LO), esthetic braces (EB), and temporomandibular disorders (TMD). The answers were evaluated using the Global Quality Scale (GQS) and the Quality Criteria for Consumer Health Information (DISCERN) scale.

Results: The total mean DISCERN score for answers on CA was 51.7 ± 9.38 for students, 57.2 ± 10.73 for patients, and 47.4 ± 4.78 for orthodontists (P = .001). GQS scores for LO differed among groups: students (3.53 ± 0.78), patients (4.40 ± 0.72), and orthodontists (3.63 ± 0.72) (P < .001). For TMD, intergroup comparison on the DISCERN scale showed the highest score in the patient group (57.83 ± 11.47) and the lowest in the orthodontist group (45.90 ± 11.84). For EB, GQS scores were >3 in all three groups (students: 3.50 ± 0.78; patients: 4.17 ± 0.87; orthodontists: 3.50 ± 0.82).

Conclusions: ChatGPT has significant potential as a tool for patient information and education in orthodontics, provided that it is further developed and the necessary updates are made.
