Parental education in pediatric dysphagia: A comparative analysis of three large language models.

Impact factor 2.4 | CAS Tier 3 (Medicine) | JCR Q3 (Gastroenterology & Hepatology)
Bülent Alyanak, Burak Tayyip Dede, Fatih Bağcıer, Mazlum Serdar Akaltun
{"title":"Parental education in pediatric dysphagia: A comparative analysis of three large language models.","authors":"Bülent Alyanak, Burak Tayyip Dede, Fatih Bağcıer, Mazlum Serdar Akaltun","doi":"10.1002/jpn3.70069","DOIUrl":null,"url":null,"abstract":"<p><strong>Objectives: </strong>This study evaluates the effectiveness of three widely used large language models (LLMs)-ChatGPT-4, Copilot, and Gemini-in providing accurate, reliable, and understandable answers to frequently asked questions about pediatric dysphagia.</p><p><strong>Methods: </strong>Twenty-five questions, selected based on Google Trends data, were presented to ChatGPT-4, Copilot, and Gemini, and the responses were evaluated using a 5-point Likert scale for accuracy, the Ensuring Quality Information for Patients (EQIP) and DISCERN scales for information quality and reliability, and the Flesch-Kincaid Grade Level (FKGL) and Flesch Reading Ease (FRE) scores for readability. The performance of ChatGPT-4, Copilot, and Gemini was assessed by presenting the same set of questions at three different time points: August, September, and October 2024. Statistical analyses included analysis of variance, Kruskal-Wallis tests, and post hoc comparisons, with p values below 0.05 considered significant.</p><p><strong>Results: </strong>ChatGPT-4 achieved the highest mean accuracy score (4.1 ± 0.7) compared to Copilot (3.1 ± 0.7) and Gemini (3.8 ± 0.8), with significant differences observed in quality ratings (p < 0.001 and p < 0.05, respectively). EQIP and DISCERN scores further confirmed the superior performance of ChatGPT-4. In terms of readability, Gemini achieved the highest scores (FRE = 48.7 ± 9.9 and FKGL = 10.1 ± 1.6).</p><p><strong>Conclusions: </strong>While ChatGPT-4 generally provided more accurate and reliable information, Gemini produced more readable content. However, variability in overall information quality indicates that, although LLMs hold potential as tools for pediatric dysphagia education, further improvements are necessary to ensure consistent delivery of reliable and accessible information.</p>","PeriodicalId":16694,"journal":{"name":"Journal of Pediatric Gastroenterology and Nutrition","volume":" ","pages":""},"PeriodicalIF":2.4000,"publicationDate":"2025-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Pediatric Gastroenterology and Nutrition","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1002/jpn3.70069","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"GASTROENTEROLOGY & HEPATOLOGY","Score":null,"Total":0}
引用次数: 0

Abstract

Objectives: This study evaluates the effectiveness of three widely used large language models (LLMs), ChatGPT-4, Copilot, and Gemini, in providing accurate, reliable, and understandable answers to frequently asked questions about pediatric dysphagia.

Methods: Twenty-five questions, selected based on Google Trends data, were presented to ChatGPT-4, Copilot, and Gemini, and the responses were evaluated using a 5-point Likert scale for accuracy, the Ensuring Quality Information for Patients (EQIP) and DISCERN scales for information quality and reliability, and the Flesch-Kincaid Grade Level (FKGL) and Flesch Reading Ease (FRE) scores for readability. The performance of ChatGPT-4, Copilot, and Gemini was assessed by presenting the same set of questions at three different time points: August, September, and October 2024. Statistical analyses included analysis of variance, Kruskal-Wallis tests, and post hoc comparisons, with p values below 0.05 considered significant.
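For context, FRE and FKGL are computed from average sentence and word length using the standard published Flesch formulas (not restated in the abstract); higher FRE indicates easier text, while FKGL approximates a US school grade level:

```latex
% Standard Flesch Reading Ease (FRE) and Flesch-Kincaid Grade Level (FKGL)
\[
\mathrm{FRE} = 206.835 \;-\; 1.015\,\frac{\text{total words}}{\text{total sentences}} \;-\; 84.6\,\frac{\text{total syllables}}{\text{total words}}
\]
\[
\mathrm{FKGL} = 0.39\,\frac{\text{total words}}{\text{total sentences}} \;+\; 11.8\,\frac{\text{total syllables}}{\text{total words}} \;-\; 15.59
\]
```

On these scales, Gemini's mean FRE of 48.7 falls in the conventional "difficult" band (30-50, college-level prose), and its FKGL of 10.1 corresponds to roughly a tenth-grade reading level.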

Results: ChatGPT-4 achieved the highest mean accuracy score (4.1 ± 0.7), compared with Copilot (3.1 ± 0.7) and Gemini (3.8 ± 0.8), with significant differences in quality ratings versus Copilot (p < 0.001) and Gemini (p < 0.05). EQIP and DISCERN scores further confirmed the superior performance of ChatGPT-4. In terms of readability, Gemini achieved the best scores (FRE = 48.7 ± 9.9, FKGL = 10.1 ± 1.6).
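The abstract reports omnibus and pairwise p values but not the raw ratings or the exact post hoc procedure. Below is a minimal sketch of the kind of comparison described in Methods, using invented Likert ratings; the pairwise Mann-Whitney U tests with Bonferroni correction are an assumption for illustration, not the authors' stated method.

```python
# Sketch of the ANOVA / Kruskal-Wallis comparison from Methods,
# on hypothetical 1-5 Likert accuracy ratings (25 questions per model).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Invented ratings centered on the reported means; NOT the study's data.
chatgpt4 = np.rint(rng.normal(4.1, 0.7, 25)).clip(1, 5)
copilot  = np.rint(rng.normal(3.1, 0.7, 25)).clip(1, 5)
gemini   = np.rint(rng.normal(3.8, 0.8, 25)).clip(1, 5)

# Parametric omnibus test (one-way ANOVA).
f_stat, p_anova = stats.f_oneway(chatgpt4, copilot, gemini)
# Non-parametric alternative, better suited to ordinal Likert data.
h_stat, p_kw = stats.kruskal(chatgpt4, copilot, gemini)
print(f"ANOVA p={p_anova:.4f}, Kruskal-Wallis p={p_kw:.4f}")

# Post hoc pairwise comparisons with Bonferroni correction
# (assumed procedure; the abstract does not name one).
pairs = {"ChatGPT-4 vs Copilot": (chatgpt4, copilot),
         "ChatGPT-4 vs Gemini":  (chatgpt4, gemini),
         "Copilot vs Gemini":    (copilot, gemini)}
for name, (a, b) in pairs.items():
    u, p = stats.mannwhitneyu(a, b)
    print(f"{name}: adjusted p={min(p * len(pairs), 1.0):.4f}")
```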

Conclusions: While ChatGPT-4 generally provided more accurate and reliable information, Gemini produced more readable content. However, variability in overall information quality indicates that, although LLMs hold potential as tools for pediatric dysphagia education, further improvements are necessary to ensure consistent delivery of reliable and accessible information.

Source journal: Journal of Pediatric Gastroenterology and Nutrition
CiteScore: 5.30 | Self-citation rate: 13.80% | Articles per year: 467 | Review time: 3-6 weeks
Journal description: The Journal of Pediatric Gastroenterology and Nutrition (JPGN) provides a forum for original papers and reviews dealing with pediatric gastroenterology and nutrition, including normal and abnormal functions of the alimentary tract and its associated organs, such as the salivary glands, pancreas, gallbladder, and liver. Particular emphasis is on development and its relation to infant and childhood nutrition.