Comparing Large Language Models as Health Literacy Tools: Evaluating and Simplifying Texts on Gender-Affirming Surgery

IF 2.4 · Zone 2 (Medicine) · JCR Q1 (Communication)
Victoria N Yi, Angel P Scialdone, Ann Marie Flusche, Kendall Reitz, Holly C Lewis, William M Tian, Elda Fisher, Kristen Rezak, Ash Patel
Journal of Health Communication, pp. 1-19. Published 2025-08-19. Publication type: Journal Article. DOI: 10.1080/10810730.2025.2547321 (https://doi.org/10.1080/10810730.2025.2547321)
Citations: 0

Abstract


Patient-facing materials in gender-affirming surgery are often written at a level higher than the NIH-recommended eighth grade reading level for patient education materials. In efforts to make patient resources more accessible, ChatGPT has successfully optimized linguistic content for patients seeking care in various medical fields. This study aims to evaluate and compare the ability of large language models (LLMs) to analyze readability and simplify online patient-facing resources for gender-affirming procedures. Google Incognito searches were performed on 15 terms relating to gender-affirming surgery. The first 20 text results were analyzed for reading level difficulty by an online readability calculator, Readability Scoring System v2.0 (RSS). Eight easily accessible LLMs were used to assess texts for readability and simplify texts to an eighth grade reading level, which were reevaluated by the RSS. Descriptive statistics, t-tests, and one-way ANOVA tests were used for statistical analyses. Online resources were written with a mean reading grade level of 12.66 ± 2.54. Google Gemini was most successful at simplifying texts (8.39 ± 1.49), followed by Anthropic Claude (9.53 ± 1.85) and ChatGPT 4 (10.19 ± 1.83). LLMs had a greater margin of error when assessing readability of feminizing and facial procedures and when simplifying genital procedures (p < .017). Online texts on gender-affirming procedures are written with a readability more challenging than is recommended for patient-facing resources. Certain LLMs were better at simplifying texts than others. Providers should use caution when using LLMs for patient education in gender-affirming care, as they are prone to variability and bias.
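To make the idea of a "reading grade level" concrete, here is a minimal sketch of one widely used readability formula, the Flesch-Kincaid Grade Level. This is an illustration only: the abstract does not say which formulas Readability Scoring System v2.0 combines, so treating Flesch-Kincaid as representative is an assumption. The function names and the vowel-group syllable heuristic are also hypothetical and much cruder than a production calculator.

```python
import re


def count_syllables(word: str) -> int:
    """Rough estimate: count vowel groups, dropping a silent trailing 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a final 'e' as silent ("make"), but keep "-le" endings ("simple").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


if __name__ == "__main__":
    original = "Postoperative complications necessitate comprehensive multidisciplinary evaluation."
    simplified = "Problems can happen after surgery. A team of doctors will check on you."
    print(round(flesch_kincaid_grade(original), 2))
    print(round(flesch_kincaid_grade(simplified), 2))
```

Longer sentences and longer words both raise the score, which is why a simplified rewrite of the same content should score lower than the original.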

Source journal
CiteScore: 5.60
Self-citation rate: 4.50%
Articles published: 63
About the journal: Journal of Health Communication: International Perspectives is the leading journal covering the full breadth of a field that focuses on the communication of health information globally. Articles feature research on:
• Developments in the field of health communication;
• New media, m-health and interactive health communication;
• Health literacy;
• Social marketing;
• Global health;
• Shared decision making and ethics;
• Interpersonal and mass media communication;
• Advances in health diplomacy, psychology, government, policy and education;
• Government, civil society and multi-stakeholder initiatives;
• Public-private partnerships; and
• Public health campaigns.
Global in scope, the journal seeks to advance a synergistic relationship between research and practical information. With a focus on promoting the health literacy of the individual, caregiver, provider, community, and those in health policy, the journal presents research, progress in areas of technology and public health, ethics, politics and policy, and the application of health communication principles. The journal is selective, publishing the highest quality social scientific research, including qualitative and quantitative studies.