确保来自 ChatGPT 和 CDC 的疫苗接种信息的准确性和公平性:混合方法跨语言评估。

IF 2 Q3 HEALTH CARE SCIENCES & SERVICES
Saubhagya Joshi, Eunbin Ha, Andee Amaya, Melissa Mendoza, Yonaira Rivera, Vivek K Singh
{"title":"确保来自 ChatGPT 和 CDC 的疫苗接种信息的准确性和公平性:混合方法跨语言评估。","authors":"Saubhagya Joshi, Eunbin Ha, Andee Amaya, Melissa Mendoza, Yonaira Rivera, Vivek K Singh","doi":"10.2196/60939","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>In the digital age, large language models (LLMs) like ChatGPT have emerged as important sources of health care information. Their interactive capabilities offer promise for enhancing health access, particularly for groups facing traditional barriers such as insurance and language constraints. Despite their growing public health use, with millions of medical queries processed weekly, the quality of LLM-provided information remains inconsistent. Previous studies have predominantly assessed ChatGPT's English responses, overlooking the needs of non-English speakers in the United States. This study addresses this gap by evaluating the quality and linguistic parity of vaccination information from ChatGPT and the Centers for Disease Control and Prevention (CDC), emphasizing health equity.</p><p><strong>Objective: </strong>This study aims to assess the quality and language equity of vaccination information provided by ChatGPT and the CDC in English and Spanish. It highlights the critical need for cross-language evaluation to ensure equitable health information access for all linguistic groups.</p><p><strong>Methods: </strong>We conducted a comparative analysis of ChatGPT's and CDC's responses to frequently asked vaccination-related questions in both languages. The evaluation encompassed quantitative and qualitative assessments of accuracy, readability, and understandability. Accuracy was gauged by the perceived level of misinformation; readability, by the Flesch-Kincaid grade level and readability score; and understandability, by items from the National Institutes of Health's Patient Education Materials Assessment Tool (PEMAT) instrument.</p><p><strong>Results: </strong>The study found that both ChatGPT and CDC provided mostly accurate and understandable (eg, scores over 95 out of 100) responses. However, Flesch-Kincaid grade levels often exceeded the American Medical Association's recommended levels, particularly in English (eg, average grade level in English for ChatGPT=12.84, Spanish=7.93, recommended=6). CDC responses outperformed ChatGPT in readability across both languages. Notably, some Spanish responses appeared to be direct translations from English, leading to unnatural phrasing. The findings underscore the potential and challenges of using ChatGPT for health care access.</p><p><strong>Conclusions: </strong>ChatGPT holds potential as a health information resource but requires improvements in readability and linguistic equity to be truly effective for diverse populations. Crucially, the default user experience with ChatGPT, typically encountered by those without advanced language and prompting skills, can significantly shape health perceptions. This is vital from a public health standpoint, as the majority of users will interact with LLMs in their most accessible form. Ensuring that default responses are accurate, understandable, and equitable is imperative for fostering informed health decisions across diverse communities.</p>","PeriodicalId":14841,"journal":{"name":"JMIR Formative Research","volume":null,"pages":null},"PeriodicalIF":2.0000,"publicationDate":"2024-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Ensuring Accuracy and Equity in Vaccination Information From ChatGPT and CDC: Mixed-Methods Cross-Language Evaluation.\",\"authors\":\"Saubhagya Joshi, Eunbin Ha, Andee Amaya, Melissa Mendoza, Yonaira Rivera, Vivek K Singh\",\"doi\":\"10.2196/60939\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Background: </strong>In the digital age, large language models (LLMs) like ChatGPT have emerged as important sources of health care information. Their interactive capabilities offer promise for enhancing health access, particularly for groups facing traditional barriers such as insurance and language constraints. Despite their growing public health use, with millions of medical queries processed weekly, the quality of LLM-provided information remains inconsistent. Previous studies have predominantly assessed ChatGPT's English responses, overlooking the needs of non-English speakers in the United States. This study addresses this gap by evaluating the quality and linguistic parity of vaccination information from ChatGPT and the Centers for Disease Control and Prevention (CDC), emphasizing health equity.</p><p><strong>Objective: </strong>This study aims to assess the quality and language equity of vaccination information provided by ChatGPT and the CDC in English and Spanish. It highlights the critical need for cross-language evaluation to ensure equitable health information access for all linguistic groups.</p><p><strong>Methods: </strong>We conducted a comparative analysis of ChatGPT's and CDC's responses to frequently asked vaccination-related questions in both languages. The evaluation encompassed quantitative and qualitative assessments of accuracy, readability, and understandability. Accuracy was gauged by the perceived level of misinformation; readability, by the Flesch-Kincaid grade level and readability score; and understandability, by items from the National Institutes of Health's Patient Education Materials Assessment Tool (PEMAT) instrument.</p><p><strong>Results: </strong>The study found that both ChatGPT and CDC provided mostly accurate and understandable (eg, scores over 95 out of 100) responses. However, Flesch-Kincaid grade levels often exceeded the American Medical Association's recommended levels, particularly in English (eg, average grade level in English for ChatGPT=12.84, Spanish=7.93, recommended=6). CDC responses outperformed ChatGPT in readability across both languages. Notably, some Spanish responses appeared to be direct translations from English, leading to unnatural phrasing. The findings underscore the potential and challenges of using ChatGPT for health care access.</p><p><strong>Conclusions: </strong>ChatGPT holds potential as a health information resource but requires improvements in readability and linguistic equity to be truly effective for diverse populations. Crucially, the default user experience with ChatGPT, typically encountered by those without advanced language and prompting skills, can significantly shape health perceptions. This is vital from a public health standpoint, as the majority of users will interact with LLMs in their most accessible form. Ensuring that default responses are accurate, understandable, and equitable is imperative for fostering informed health decisions across diverse communities.</p>\",\"PeriodicalId\":14841,\"journal\":{\"name\":\"JMIR Formative Research\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":2.0000,\"publicationDate\":\"2024-10-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"JMIR Formative Research\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.2196/60939\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"HEALTH CARE SCIENCES & SERVICES\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"JMIR Formative Research","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.2196/60939","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"HEALTH CARE SCIENCES & SERVICES","Score":null,"Total":0}
引用次数: 0

摘要

背景:在数字时代,像 ChatGPT 这样的大型语言模型(LLMs)已成为医疗保健信息的重要来源。它们的交互能力为提高医疗服务的可及性带来了希望,特别是对于面临保险和语言限制等传统障碍的群体。尽管其在公共卫生领域的应用日益广泛,每周处理的医疗查询达数百万次,但 LLM 所提供信息的质量仍然参差不齐。以往的研究主要评估 ChatGPT 的英语回复,忽略了美国非英语使用者的需求。本研究通过评估 ChatGPT 和美国疾病控制与预防中心 (CDC) 提供的疫苗接种信息的质量和语言平等性来弥补这一不足,同时强调健康公平:本研究旨在评估 ChatGPT 和 CDC 用英语和西班牙语提供的疫苗接种信息的质量和语言平等性。本研究旨在评估 ChatGPT 和 CDC 用英语和西班牙语提供的疫苗接种信息的质量和语言公平性,强调跨语言评估的关键需求,以确保所有语言群体都能公平地获取健康信息:我们用两种语言对 ChatGPT 和疾病预防控制中心对疫苗接种相关常见问题的回答进行了比较分析。评估包括对准确性、可读性和可理解性的定量和定性评估。准确性通过错误信息的感知程度来衡量;可读性通过弗莱什-金凯德等级和可读性评分来衡量;可理解性通过美国国立卫生研究院的患者教育材料评估工具 (PEMAT) 工具中的项目来衡量:研究发现,ChatGPT 和 CDC 提供的回答大多准确易懂(例如,满分 95 分以上)。但是,Flesch-Kincaid 等级往往超过美国医学会的建议等级,尤其是英语(例如,ChatGPT 的英语平均等级=12.84,西班牙语=7.93,建议等级=6)。在两种语言的可读性方面,疾病防治中心的回复都优于 ChatGPT。值得注意的是,一些西班牙语回复似乎是从英语直接翻译过来的,导致措辞不自然。这些发现强调了将 ChatGPT 用于医疗保健访问的潜力和挑战:结论:ChatGPT 具有作为健康信息资源的潜力,但需要在可读性和语言公平性方面加以改进,才能对不同人群真正有效。最重要的是,ChatGPT 的默认用户体验(通常是那些没有高级语言和提示技能的用户)会极大地影响人们的健康观念。从公共卫生的角度来看,这一点至关重要,因为大多数用户将以最容易理解的形式与 LLM 互动。确保默认回复的准确性、可理解性和公平性对于促进不同社区做出明智的健康决定至关重要。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Ensuring Accuracy and Equity in Vaccination Information From ChatGPT and CDC: Mixed-Methods Cross-Language Evaluation.

Background: In the digital age, large language models (LLMs) like ChatGPT have emerged as important sources of health care information. Their interactive capabilities offer promise for enhancing health access, particularly for groups facing traditional barriers such as insurance and language constraints. Despite their growing public health use, with millions of medical queries processed weekly, the quality of LLM-provided information remains inconsistent. Previous studies have predominantly assessed ChatGPT's English responses, overlooking the needs of non-English speakers in the United States. This study addresses this gap by evaluating the quality and linguistic parity of vaccination information from ChatGPT and the Centers for Disease Control and Prevention (CDC), emphasizing health equity.

Objective: This study aims to assess the quality and language equity of vaccination information provided by ChatGPT and the CDC in English and Spanish. It highlights the critical need for cross-language evaluation to ensure equitable health information access for all linguistic groups.

Methods: We conducted a comparative analysis of ChatGPT's and CDC's responses to frequently asked vaccination-related questions in both languages. The evaluation encompassed quantitative and qualitative assessments of accuracy, readability, and understandability. Accuracy was gauged by the perceived level of misinformation; readability, by the Flesch-Kincaid grade level and readability score; and understandability, by items from the National Institutes of Health's Patient Education Materials Assessment Tool (PEMAT) instrument.

Results: The study found that both ChatGPT and CDC provided mostly accurate and understandable (eg, scores over 95 out of 100) responses. However, Flesch-Kincaid grade levels often exceeded the American Medical Association's recommended levels, particularly in English (eg, average grade level in English for ChatGPT=12.84, Spanish=7.93, recommended=6). CDC responses outperformed ChatGPT in readability across both languages. Notably, some Spanish responses appeared to be direct translations from English, leading to unnatural phrasing. The findings underscore the potential and challenges of using ChatGPT for health care access.

Conclusions: ChatGPT holds potential as a health information resource but requires improvements in readability and linguistic equity to be truly effective for diverse populations. Crucially, the default user experience with ChatGPT, typically encountered by those without advanced language and prompting skills, can significantly shape health perceptions. This is vital from a public health standpoint, as the majority of users will interact with LLMs in their most accessible form. Ensuring that default responses are accurate, understandable, and equitable is imperative for fostering informed health decisions across diverse communities.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
JMIR Formative Research
JMIR Formative Research Medicine-Medicine (miscellaneous)
CiteScore
2.70
自引率
9.10%
发文量
579
审稿时长
12 weeks
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信