Austin R Swisher, Arthur W Wu, Gene C Liu, Matthew K Lee, Taylor R Carle, Dennis M Tang
Enhancing Health Literacy: Evaluating the Readability of Patient Handouts Revised by ChatGPT's Large Language Model
Otolaryngology–Head and Neck Surgery, pages 1751-1757
DOI: 10.1002/ohn.927
Published: 2024-12-01 (Epub 2024-08-06)
Impact factor 2.6, JCR Q1 (Otorhinolaryngology)
Citations: 0
Abstract
Objective: To use an artificial intelligence (AI)-powered large language model (LLM) to improve readability of patient handouts.
Study design: Review of online material modified by AI.
Setting: Academic center.
Methods: Five handout materials obtained from the American Rhinologic Society (ARS) and the American Academy of Facial Plastic and Reconstructive Surgery websites were assessed using validated readability metrics. The handouts were input into OpenAI's ChatGPT-4 with the prompt: "Rewrite the following at a 6th-grade reading level." The understandability and actionability of both native and LLM-revised versions were evaluated using the Patient Education Materials Assessment Tool (PEMAT). Results were compared using Wilcoxon rank-sum tests.
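The "validated readability metrics" in the methods are grade-level formulas computed from sentence, word, and syllable counts. As an illustration, here is a minimal sketch of the Flesch-Kincaid Reading Ease formula (the first metric reported in the results); the syllable counter is a naive vowel-group heuristic, so scores are approximate, and the sample sentences are invented, not taken from the study's handouts. Published tools use dictionary-based syllable counting.

```python
import re

def count_syllables(word: str) -> int:
    # Rough heuristic: each run of consecutive vowels counts as one syllable.
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_reading_ease(text: str) -> float:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    # Flesch Reading Ease: higher scores indicate easier text.
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))

# Invented examples at roughly the two reading levels the study contrasts:
easy = "The nose can get blocked. This makes it hard to breathe."
hard = ("Chronic rhinosinusitis is characterized by persistent "
        "inflammation of the paranasal sinus mucosa.")
print(round(flesch_reading_ease(easy), 1))
print(round(flesch_reading_ease(hard), 1))
```

The simpler sentence scores far higher, mirroring the gap the study found between LLM-revised (70.8) and standard (43.9) handouts.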
Results: The mean readability scores of the standard (ARS, American Academy of Facial Plastic and Reconstructive Surgery) materials corresponded to "difficult," with reading categories ranging between high school and university grade levels. Conversely, the LLM-revised handouts had an average seventh-grade reading level. LLM-revised handouts had better readability in nearly all metrics tested: Flesch-Kincaid Reading Ease (70.8 vs 43.9; P < .05), Gunning Fog Score (10.2 vs 14.42; P < .05), Simple Measure of Gobbledygook (9.9 vs 13.1; P < .05), Coleman-Liau (8.8 vs 12.6; P < .05), and Automated Readability Index (8.2 vs 10.7; P = .06). PEMAT scores were significantly higher in the LLM-revised handouts for understandability (91% vs 74%; P < .05) with similar actionability (42% vs 34%; P = .15) when compared to the standard materials.
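The pairwise comparisons above come from Wilcoxon rank-sum tests. A from-scratch sketch of that test, using the normal approximation without tie correction, is shown below; the five scores per group are hypothetical stand-ins for per-handout Flesch Reading Ease values, not the study's actual data.

```python
from math import sqrt
from statistics import NormalDist

def rank_sum_test(a, b):
    # Pool both samples and rank them jointly (assumes no ties for simplicity).
    combined = sorted((v, src) for src, vals in ((0, a), (1, b)) for v in vals)
    ranks_a = [i + 1 for i, (_, src) in enumerate(combined) if src == 0]
    w = sum(ranks_a)                        # rank sum of sample a
    n1, n2 = len(a), len(b)
    mean = n1 * (n1 + n2 + 1) / 2           # expected rank sum under H0
    sd = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean) / sd
    p = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value
    return z, p

# Hypothetical Flesch Reading Ease scores for five handouts per group:
original = [43.9, 41.2, 48.5, 39.7, 46.1]
revised = [70.8, 68.3, 73.5, 66.9, 74.2]
z, p = rank_sum_test(original, revised)
print(f"z = {z:.2f}, p = {p:.4f}")
```

With every revised score exceeding every original score, the original group holds the five lowest ranks, giving the most extreme rank sum possible for n = 5 per group and a p-value below .05, consistent with the significance pattern the abstract reports.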
Conclusion: With simple prompting, ChatGPT can revise patient-facing handouts to present the same information with improved readability. This study demonstrates the utility of LLMs in rewriting patient handouts, and such models may serve as a tool to help optimize education materials.
Level of evidence: Level VI.
About the journal:
Otolaryngology–Head and Neck Surgery (OTO-HNS) is the official peer-reviewed publication of the American Academy of Otolaryngology–Head and Neck Surgery Foundation. The mission of Otolaryngology–Head and Neck Surgery is to publish contemporary, ethical, clinically relevant information in otolaryngology, head and neck surgery (ear, nose, throat, head, and neck disorders) that can be used by otolaryngologists, clinicians, scientists, and specialists to improve patient care and public health.