Xinlianyi Zhou, Yao Chen, Ehab A Abdulghani, Xu Zhang, Wei Zheng, Yu Li
Performance in answering orthodontic patients' frequently asked questions: Conversational artificial intelligence versus orthodontists.
Journal of the World Federation of Orthodontists (JCR Q1, Dentistry, Oral Surgery & Medicine; Impact Factor 2.6)
Published: 2025-03-25 · DOI: 10.1016/j.ejwf.2025.02.001
Abstract
Objectives: Can conversational artificial intelligence (AI) help address orthodontic patients' common concerns? This study aimed to investigate the performance of conversational AI in answering frequently asked questions (FAQs) from orthodontic patients, in comparison with orthodontists.
Materials and methods: Thirty FAQs were selected covering the pre-, during-, and post-treatment stages of orthodontic care. Each question was answered independently by AI (Chat Generative Pretrained Transformer [ChatGPT]-4) and by two orthodontists (Ortho. A and Ortho. B) randomly drawn from a panel. The three responses to each of the 30 FAQs were ranked by four raters, randomly selected from a separate panel of orthodontists, yielding 120 rankings. All participants were Chinese, and all questions and answers were conducted in Chinese.
Results: Among the 120 rankings, ChatGPT was ranked first in 61 instances (50.8%), second in 35 instances (29.2%), and third in 24 instances (20.0%). Furthermore, the mean rank of ChatGPT was 1.69 ± 0.79, significantly better than that of Ortho. A (2.23 ± 0.79, P < 0.001) and Ortho. B (2.08 ± 0.79, P < 0.05). No significant difference was found between the two orthodontist groups. Additionally, the Spearman correlation coefficient between the average ranking of ChatGPT and the inter-rater agreement was 0.69 (P < 0.001), indicating a strong positive correlation between the two variables.
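The Spearman coefficient reported above (rho = 0.69) measures rank correlation between ChatGPT's average ranking per question and the raters' level of agreement on that question. As a hedged illustration (the arrays below are hypothetical placeholders, not the study's data), the statistic can be computed in plain Python by ranking both variables and taking the Pearson correlation of the rank vectors:

```python
# Sketch of a Spearman rank correlation, as used in the study's analysis.
# The inputs are toy values for illustration only, not the actual ratings.

def ranks(values):
    """Assign 1-based ranks, averaging ranks across ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman's rho = Pearson correlation of the two rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical per-question averages and agreement scores:
avg_rank_chatgpt = [1.0, 1.25, 2.0, 1.5, 2.75]
inter_rater_agreement = [0.9, 0.8, 0.4, 0.7, 0.2]
rho = spearman(avg_rank_chatgpt, inter_rater_agreement)
```

In practice this is equivalent to `scipy.stats.spearmanr`, which also returns the P value; the pure-Python version is shown only to make the rank-then-correlate logic explicit.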
Conclusions: Overall, the conversational AI ChatGPT-4 may outperform orthodontists in addressing orthodontic patients' FAQs, even in a non-English language. In addition, ChatGPT tends to perform better on questions whose answers are widely accepted among orthodontic professionals, and worse on questions where professional consensus is weaker.