How reliable is the artificial intelligence product large language model ChatGPT in orthodontics?

Kevser Kurt Demirsoy, Suleyman Kutalmış Buyuk, Tayyip Bicer
Journal: The Angle Orthodontist
Published: 2024-11-01
DOI: 10.2319/031224-207.1 (https://doi.org/10.2319/031224-207.1)
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11493421/pdf/

Abstract

Objectives: To evaluate the reliability of information produced by the artificial intelligence-based program ChatGPT in terms of accuracy and relevance, as assessed by orthodontists, dental students, and individuals seeking orthodontic treatment.

Materials and methods: Frequently asked questions and questions of common interest in four basic areas of orthodontics were prepared and posed to ChatGPT (Version 4.0), and the answers were evaluated by three groups (senior dental students, individuals seeking orthodontic treatment, and orthodontists). The questions covered clear aligners (CA), lingual orthodontics (LO), esthetic braces (EB), and temporomandibular disorders (TMD). The answers were evaluated using the Global Quality Scale (GQS) and the Quality Criteria for Consumer Health Information (DISCERN) scale.

Results: The total mean DISCERN score for answers on CA was 51.7 ± 9.38 for students, 57.2 ± 10.73 for patients, and 47.4 ± 4.78 for orthodontists (P = .001). GQS scores for LO differed among the groups: students (3.53 ± 0.78), patients (4.40 ± 0.72), and orthodontists (3.63 ± 0.72) (P < .001). In the intergroup comparison of DISCERN scores for ChatGPT's answers on TMD, the highest value was given by the patient group (57.83 ± 11.47) and the lowest by the orthodontist group (45.90 ± 11.84). For information quality on EB, GQS scores were >3 in all three groups (students: 3.50 ± 0.78; patients: 4.17 ± 0.87; orthodontists: 3.50 ± 0.82).

Conclusions: ChatGPT has significant potential for patient information and education in the field of orthodontics, provided that it is further developed and the necessary updates are made.
