Advancing Patient Education in Idiopathic Intracranial Hypertension: The Promise of Large Language Models

Qais A Dihan, Andrew D Brown, Ana T Zaldivar, Muhammad Z Chauhan, Taher K Eleiwa, Amr K Hassan, Omar Solyman, Ryan Gise, Paul H Phillips, Ahmed B Sallam, Abdelrahman M Elhusseiny

Neurology: Clinical Practice. Published online October 8, 2024 (February 2025 issue). DOI: 10.1212/CPJ.0000000000200366. Free full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11464234/pdf/
Abstract
Background and objectives: We evaluated the performance of 3 large language models (LLMs) in generating patient education materials (PEMs) and enhancing the readability of prewritten PEMs on idiopathic intracranial hypertension (IIH).
Methods: This cross-sectional comparative study compared 3 LLMs, ChatGPT-3.5, ChatGPT-4, and Google Bard, on their ability to generate PEMs on IIH using 3 prompts. Prompt A (control prompt): "Can you write a patient-targeted health information handout on idiopathic intracranial hypertension that is easily understandable by the average American?"; Prompt B (modifier statement + control prompt): "Given patient education materials are recommended to be written at a 6th-grade reading level, using the SMOG readability formula, can you write a patient-targeted health information handout on idiopathic intracranial hypertension that is easily understandable by the average American?"; and Prompt C: "Given patient education materials are recommended to be written at a 6th-grade reading level, using the SMOG readability formula, can you rewrite the following text to a 6th-grade reading level: [insert text]." We compared generated and rewritten PEMs, along with the first 20 eligible PEMs on IIH retrieved via Google search, on readability (Simple Measure of Gobbledygook [SMOG] and Flesch-Kincaid Grade Level [FKGL]), quality (DISCERN and Patient Education Materials Assessment Tool [PEMAT]), and accuracy (Likert misinformation scale).
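Both readability metrics named above are standard, published formulas and can be reproduced directly. The sketch below (not part of the study; it uses a naive vowel-group syllable counter, so scores are approximate, and all function names are illustrative) shows how a SMOG grade and a Flesch-Kincaid Grade Level might be computed for a PEM's text.

```python
import math
import re

def split_sentences(text):
    # Crude sentence splitter on ., !, ? -- adequate for a sketch.
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]

def count_syllables(word):
    # Approximate syllables as runs of consecutive vowels (rough heuristic).
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def smog_grade(text):
    sentences = split_sentences(text)
    words = re.findall(r"[A-Za-z']+", text)
    polysyllables = sum(1 for w in words if count_syllables(w) >= 3)
    # SMOG formula (McLaughlin, 1969), normalized to a 30-sentence sample.
    return 1.0430 * math.sqrt(polysyllables * (30 / len(sentences))) + 3.1291

def fkgl(text):
    sentences = split_sentences(text)
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    # Flesch-Kincaid Grade Level formula.
    return 0.39 * (len(words) / len(sentences)) + 11.8 * (syllables / len(words)) - 15.59

pem_text = "Idiopathic intracranial hypertension means high pressure around the brain. It can cause headaches and vision problems."
print(f"SMOG: {smog_grade(pem_text):.1f}, FKGL: {fkgl(pem_text):.1f}")
```

Lower grades indicate easier text; the 6th-grade target in Prompts B and C corresponds to a SMOG or FKGL score of about 6 or less.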
Results: Generated PEMs were of high quality, understandability, and accuracy (median DISCERN score ≥4, PEMAT understandability ≥70%, Likert misinformation scale = 1). Only ChatGPT-4 generated PEMs at the specified 6th-grade reading level (SMOG: 5.5 ± 0.6, FKGL: 5.6 ± 0.7). With Prompt C, only ChatGPT-4 rewrote original published PEMs to below a 6th-grade reading level without a decrease in quality, understandability, or accuracy (SMOG: 5.6 ± 0.6, FKGL: 5.7 ± 0.8, p < 0.001; DISCERN ≥4, Likert misinformation = 1).
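The abstract does not state which statistical test yielded the p < 0.001 comparison of original versus rewritten readability. As one plausible approach only, a paired nonparametric test on per-document SMOG grades could look like the sketch below; the score arrays are invented placeholders, not the study's data.

```python
from scipy.stats import wilcoxon

# Hypothetical per-document SMOG grades (placeholders, not study data).
smog_original = [11.2, 12.5, 10.8, 13.1, 11.9, 12.0, 10.5, 11.7]
smog_rewritten = [5.4, 5.9, 5.1, 6.0, 5.6, 5.8, 5.3, 5.7]

# Paired one-sided test: did rewriting with Prompt C lower the reading grade level?
stat, p_value = wilcoxon(smog_original, smog_rewritten, alternative="greater")
print(f"Wilcoxon statistic = {stat:.1f}, p = {p_value:.4f}")
```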
Discussion: In conclusion, LLMs, particularly ChatGPT-4, can produce high-quality, readable PEMs on IIH. They can also serve as supplementary tools to improve the readability of prewritten PEMs while maintaining quality and accuracy.