Large language models: unlocking new potential in patient education for thyroid eye disease.

IF 2.9 · CAS Tier 3 (Medicine) · JCR Q2 · ENDOCRINOLOGY & METABOLISM
Yuwan Gao, Qi Xu, Ou Zhang, Hongliang Wang, Yunlong Wang, Jiale Wang, Xiaohui Chen
DOI: 10.1007/s12020-025-04339-z (https://doi.org/10.1007/s12020-025-04339-z)
Journal: Endocrine · Published online 2025-07-19
Citations: 0

Abstract

Purpose: This study aimed to evaluate the performance of three large language models (LLMs) in generating patient education materials (PEMs) for thyroid eye disease (TED), with the goal of improving patients' understanding and awareness of the condition.

Methods: We evaluated ChatGPT-4o, Claude 3.5, and Gemini 1.5 in generating PEMs for TED using different prompts. First, we produced TED patient education brochures from prompts A and B; prompt B additionally asked the model to write at a sixth-grade reading level. Next, we generated two sets of responses to frequently asked questions (FAQs) about TED: standard responses and simplified responses, the latter optimized through specific prompts. All generated content was systematically evaluated for quality, understandability, actionability, accuracy, and empathy. Readability was analyzed with the online tool Readable.com, using the Flesch-Kincaid Grade Level (FKGL) and the Simple Measure of Gobbledygook (SMOG).
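The two readability indices named above are simple formulas over sentence, word, and syllable counts. As a minimal sketch of how such grade levels are computed (the study itself used Readable.com; the syllable counter below is a crude vowel-group heuristic, so its numbers will differ slightly from any commercial tool):

```python
import re
from math import sqrt

def count_syllables(word: str) -> int:
    """Crude syllable estimate: count groups of consecutive vowels, minimum 1."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def readability(text: str) -> dict:
    """Return approximate FKGL and SMOG grade levels for an English passage."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    polysyllables = sum(1 for w in words if count_syllables(w) >= 3)

    # Flesch-Kincaid Grade Level
    fkgl = (0.39 * (len(words) / len(sentences))
            + 11.8 * (syllables / len(words)) - 15.59)
    # Simple Measure of Gobbledygook
    smog = 1.0430 * sqrt(polysyllables * 30 / len(sentences)) + 3.1291
    return {"FKGL": round(fkgl, 1), "SMOG": round(smog, 1)}
```

Both indices map onto U.S. school grade levels, which is why the study's "sixth-grade" simplification target corresponds to a score around 6; longer sentences and more polysyllabic words push both scores up.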

Results: Brochures generated from both prompt A and prompt B performed excellently in quality (DISCERN ≥ 4), understandability (PEMAT understandability ≥ 70%), accuracy (score ≥ 4), and empathy (score ≥ 4), with no significant differences between the two. However, neither met the actionability standard (PEMAT actionability < 70%). In terms of readability, prompt B was easier to understand than prompt A, although even the optimized prompt B output did not reach the ideal readability level. Additionally, a comparative analysis of TED FAQs answered by the LLMs versus Google showed that, for both standard and simplified responses, the LLMs outperformed Google, with results similar to those for the brochures.

Conclusion: Overall, LLMs demonstrate significant potential as a tool for generating PEMs for TED. They can produce high-quality, understandable, accurate, and empathetic content, but readability still leaves room for improvement.

Source journal: Endocrine (ENDOCRINOLOGY & METABOLISM)
CiteScore: 6.50
Self-citation rate: 5.40%
Articles per year: 295
Review time: 1.5 months
Journal description: Well-established as a major journal in today's rapidly advancing experimental and clinical research areas, Endocrine publishes original articles devoted to basic (including molecular, cellular, and physiological studies), translational, and clinical research in all the different fields of endocrinology and metabolism. Articles will be accepted based on peer review, priority, and editorial decision. Invited reviews, mini-reviews, and viewpoints on relevant pathophysiological and clinical topics, as well as Editorials on articles appearing in the Journal, are published. Unsolicited Editorials will be evaluated by the editorial team. Outcomes of scientific meetings, as well as guidelines and position statements, may be submitted. The Journal also considers special feature articles in the field of endocrine genetics and epigenetics, as well as articles devoted to novel methods and techniques in endocrinology. Endocrine covers controversial clinical endocrine issues. Meta-analyses on endocrine and metabolic topics are also accepted. Descriptions of single clinical cases and/or small patient studies are not published unless of exceptional interest. However, reports of novel imaging studies and endocrine side effects in single patients may be considered. Research letters and letters to the editor, related or unrelated to recently published articles, can be submitted. Endocrine covers leading topics in endocrinology such as neuroendocrinology, pituitary and hypothalamic peptides, thyroid physiological and clinical aspects, bone and mineral metabolism and osteoporosis, obesity, lipid and energy metabolism and food intake control, insulin, type 1 and type 2 diabetes, hormones of male and female reproduction, adrenal diseases, pediatric and geriatric endocrinology, endocrine hypertension, and endocrine oncology.