Performance of the ChatGPT-3.5, ChatGPT-4, and Google Gemini large language models in responding to dental implantology inquiries.

IF 4.3, Zone 2 (Medicine), Q1, DENTISTRY, ORAL SURGERY & MEDICINE

Journal of Prosthetic Dentistry. Pub Date: 2025-01-04. DOI: 10.1016/j.prosdent.2024.12.016

Noha Taymour, Shaimaa M Fouda, Hams H Abdelrahaman, Mohamed G Hassan


Citations: 0

Abstract

Statement of problem: Artificial intelligence (AI) chatbots have been proposed as promising resources for oral health information. However, the quality and readability of existing online health-related information are often inconsistent and challenging.

Purpose: This study aimed to compare the reliability and usefulness of dental implantology-related information provided by the ChatGPT-3.5, ChatGPT-4, and Google Gemini large language models (LLMs).

Material and methods: A total of 75 questions were developed covering various dental implant domains. These questions were then presented to 3 different LLMs: ChatGPT-3.5, ChatGPT-4, and Google Gemini. The responses generated were recorded and independently assessed by 2 specialists who were blinded to the source of the responses. The evaluation focused on the accuracy of the generated answers, using a modified 5-point Likert scale to measure the reliability and usefulness of the information provided. Additionally, the ability of the AI chatbots to offer definitive responses to closed questions, provide reference citations, and advise scheduling consultations with a dental specialist was also analyzed. The Friedman, Mann-Whitney U, and Spearman correlation tests were used for data analysis (α=.05).
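As an illustration of the Spearman correlation named above, the following is a minimal sketch of how a rank correlation between two sets of Likert scores can be computed. The scores are hypothetical and are not data from the study, the function names `average_ranks` and `spearman_rho` are introduced here for illustration only, and the abstract does not say which software the authors used. In practice a statistics package would normally be used instead.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho computed as the Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical 5-point Likert scores for eight responses (not study data).
reliability = [5, 4, 4, 3, 5, 2, 4, 3]
usefulness = [5, 4, 3, 3, 4, 2, 5, 3]
rho = spearman_rho(reliability, usefulness)
print(f"Spearman rho = {rho:.3f}")
```

Average ranks are used because Likert scores contain many ties; computing the Pearson correlation on those ranks handles ties correctly, whereas the shortcut formula based on squared rank differences does not.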

Results: Google Gemini exhibited higher reliability and usefulness scores compared with ChatGPT-3.5 and ChatGPT-4 (P<.001). Google Gemini also demonstrated superior proficiency in identifying closed questions (25 questions, 41%) and recommended specialist consultations for 74 questions (98.7%), significantly outperforming ChatGPT-4 (30 questions, 40.0%) and ChatGPT-3.5 (28 questions, 37.3%) (P<.001). A positive correlation was found between reliability and usefulness scores, with Google Gemini showing the strongest correlation (ρ=.702).
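The reported percentages can be checked against the 75-question total. The sketch below assumes that the three counts 74, 30, and 28 all refer to responses recommending a specialist consultation, which the sentence leaves somewhat ambiguous; the name `percent` is introduced here for illustration.

```python
TOTAL_QUESTIONS = 75  # from the Material and methods paragraph

# Counts as reported in the Results; assumed here to be the number of
# responses that recommended a specialist consultation.
consultation_counts = {
    "Google Gemini": 74,
    "ChatGPT-4": 30,
    "ChatGPT-3.5": 28,
}


def percent(count, total):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


for model, count in consultation_counts.items():
    print(f"{model}: {count}/{TOTAL_QUESTIONS} = {percent(count, TOTAL_QUESTIONS)}%")
```

These three values agree with the reported 98.7%, 40.0%, and 37.3%. The closed-question figure (25 questions, 41%) does not equal 25/75, so it presumably uses a different denominator, which the abstract does not state.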

Conclusions: The 3 AI chatbots showed acceptable levels of reliability and usefulness in addressing dental implant-related queries. Google Gemini distinguished itself by providing responses consistent with specialist consultations.


Source journal

Journal of Prosthetic Dentistry (Medicine: Dentistry and Oral Surgery)

CiteScore: 7.00
Self-citation rate: 13.00%
Articles published: 599
Review time: 69 days

About the journal: The Journal of Prosthetic Dentistry is the leading professional journal devoted exclusively to prosthetic and restorative dentistry. The Journal is the official publication for 24 leading U.S. and international prosthodontic organizations. The monthly publication features timely, original peer-reviewed articles on the newest techniques, dental materials, and research findings. The Journal serves prosthodontists and dentists in advanced practice, and features color photos that illustrate many step-by-step procedures. The Journal of Prosthetic Dentistry is included in Index Medicus and CINAHL.