Evaluating large language models as patient education tools for inflammatory bowel disease: A comparative study.

Impact factor 4.3 · Zone 3 (Medicine) · JCR Q1 (Gastroenterology & Hepatology)
Yan Zhang, Xiao-Han Wan, Qing-Zhou Kong, Han Liu, Jun Liu, Jing Guo, Xiao-Yun Yang, Xiu-Li Zuo, Yan-Qing Li
World Journal of Gastroenterology, vol. 31, no. 6, article 102090. Published 2025-02-14. Publication type: Journal Article.
DOI: 10.3748/wjg.v31.i6.102090 (https://doi.org/10.3748/wjg.v31.i6.102090)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11752706/pdf/
Citations: 0

Abstract

Background: Inflammatory bowel disease (IBD) is a global health burden that affects millions of individuals worldwide, necessitating extensive patient education. Large language models (LLMs) hold promise for addressing patient information needs. However, the use of LLMs to deliver accurate and comprehensible IBD-related medical information has yet to be thoroughly investigated.

Aim: To assess the utility of three LLMs (ChatGPT-4.0, Claude-3-Opus, and Gemini-1.5-Pro) as a reference point for patients with IBD.

Methods: In this comparative study, two gastroenterology experts generated 15 IBD-related questions that reflected common patient concerns. These questions were used to evaluate the performance of the three LLMs. The answers provided by each model were independently assessed by three IBD-related medical experts using a Likert scale focusing on accuracy, comprehensibility, and correlation. In parallel, three patients were invited to evaluate the comprehensibility of the models' answers. Finally, a readability assessment was performed.
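
As a rough illustration of how expert ratings of this kind could be summarized, the sketch below averages Likert scores per model and dimension. The abstract does not state the scale range or the aggregation method, so the 1-5 scale, the use of a simple mean, and all model names, rater IDs, and scores here are hypothetical placeholders rather than study data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical ratings: (model, question_id, rater_id, dimension, score).
# Scores assume a 1-5 Likert scale; none of these values come from the study.
ratings = [
    ("model_a", 1, "expert_1", "accuracy", 5),
    ("model_a", 1, "expert_2", "accuracy", 4),
    ("model_a", 1, "expert_3", "accuracy", 4),
    ("model_b", 1, "expert_1", "accuracy", 3),
    ("model_b", 1, "expert_2", "accuracy", 4),
    ("model_b", 1, "expert_3", "accuracy", 5),
]


def mean_scores(rows):
    """Average Likert scores per (model, dimension) across raters and questions."""
    grouped = defaultdict(list)
    for model, _question, _rater, dimension, score in rows:
        grouped[(model, dimension)].append(score)
    return {key: mean(scores) for key, scores in grouped.items()}


summary = mean_scores(ratings)
for (model, dimension), value in sorted(summary.items()):
    print(f"{model} {dimension}: {value:.2f}")
```

A plain mean treats the ordinal Likert scores as interval data; a median or a rank-based comparison would be an equally reasonable choice for this kind of rating.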

Results: Overall, each of the LLMs achieved satisfactory levels of accuracy, comprehensibility, and completeness when answering IBD-related questions, although their performance varied. All of the investigated models demonstrated strengths in providing basic disease information, such as the definition of IBD, its common symptoms, and diagnostic methods. Nevertheless, when dealing with more complex medical advice, such as medication side effects, dietary adjustments, and complication risks, the quality of answers was inconsistent between the LLMs. Notably, Claude-3-Opus generated answers with better readability than the other two models.

Conclusion: LLMs have potential as educational tools for patients with IBD; however, there are discrepancies between the models. Further optimization and the development of specialized models are necessary to ensure the accuracy and safety of the information provided.

Source journal: World Journal of Gastroenterology (Medicine: Gastroenterology & Hepatology)
CiteScore: 7.80
Self-citation rate: 4.70%
Articles published: 464
Review time: 2.4 months
About the journal: The primary aims of the WJG are to improve diagnostic, therapeutic and preventive modalities and the skills of clinicians and to guide clinical practice in gastroenterology and hepatology.