Is Information About Musculoskeletal Malignancies From Large Language Models or Web Resources at a Suitable Reading Level for Patients?

Impact factor 4.2 · JCR Q1 (Orthopedics) · CAS Tier 2 (Medicine)
Paul G Guirguis, Mark P Youssef, Ankit Punreddy, Mina Botros, Mattie Raiford, Susan McDowell
{"title":"Is Information About Musculoskeletal Malignancies From Large Language Models or Web Resources at a Suitable Reading Level for Patients?","authors":"Paul G Guirguis,Mark P Youssef,Ankit Punreddy,Mina Botros,Mattie Raiford,Susan McDowell","doi":"10.1097/corr.0000000000003263","DOIUrl":null,"url":null,"abstract":"BACKGROUND\r\nPatients and caregivers may experience immense distress when receiving the diagnosis of a primary musculoskeletal malignancy and subsequently turn to internet resources for more information. It is not clear whether these resources, including Google and ChatGPT, offer patients information that is readable, a measure of how easy text is to understand. Since many patients turn to Google and artificial intelligence resources for healthcare information, we thought it was important to ascertain whether the information they find is readable and easy to understand. The objective of this study was to compare readability of Google search results and ChatGPT answers to frequently asked questions and assess whether these sources meet NIH recommendations for readability.\r\n\r\nQUESTIONS/PURPOSES\r\n(1) What is the readability of ChatGPT-3.5 as a source of patient information for the three most common primary bone malignancies compared with top online resources from Google search? (2) Do ChatGPT-3.5 responses and online resources meet NIH readability guidelines for patient education materials?\r\n\r\nMETHODS\r\nThis was a cross-sectional analysis of the 12 most common online questions about osteosarcoma, chondrosarcoma, and Ewing sarcoma. To be consistent with other studies of similar design that utilized national society frequently asked questions lists, questions were selected from the American Cancer Society and categorized based on content, including diagnosis, treatment, and recovery and prognosis. Google was queried using all 36 questions, and top responses were recorded. Author types, such as hospital systems, national health organizations, or independent researchers, were recorded. ChatGPT-3.5 was provided each question in independent queries without further prompting. Responses were assessed with validated reading indices to determine readability by grade level. An independent t-test was performed with significance set at p < 0.05.\r\n\r\nRESULTS\r\nGoogle (n = 36) and ChatGPT-3.5 (n = 36) answers were recorded, 12 for each of the three cancer types. Reading grade levels based on mean readability scores were 11.0 ± 2.9 and 16.1 ± 3.6, respectively. This corresponds to the eleventh grade reading level for Google and a fourth-year undergraduate student level for ChatGPT-3.5. Google answers were more readable across all individual indices, without differences in word count. No difference in readability was present across author type, question category, or cancer type. Of 72 total responses across both search modalities, none met NIH readability criteria at the sixth-grade level.\r\n\r\nCONCLUSION\r\nGoogle material was presented at a high school reading level, whereas ChatGPT-3.5 was at an undergraduate reading level. The readability of both resources was inadequate based on NIH recommendations. Improving readability is crucial for better patient understanding during cancer treatment. 
Physicians should assess patients' needs, offer them tailored materials, and guide them to reliable resources to prevent reliance on online information that is hard to understand.\r\n\r\nLEVEL OF EVIDENCE\r\nLevel III, prognostic study.","PeriodicalId":10404,"journal":{"name":"Clinical Orthopaedics and Related Research®","volume":"56 1","pages":""},"PeriodicalIF":4.2000,"publicationDate":"2024-09-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Clinical Orthopaedics and Related Research®","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1097/corr.0000000000003263","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ORTHOPEDICS","Score":null,"Total":0}
引用次数: 0

Abstract

BACKGROUND
Patients and caregivers may experience immense distress when receiving the diagnosis of a primary musculoskeletal malignancy and subsequently turn to internet resources for more information. It is not clear whether these resources, including Google and ChatGPT, offer patients information that is readable; readability is a measure of how easy text is to understand. Because many patients turn to Google and artificial intelligence resources for healthcare information, we thought it important to ascertain whether the information they find is readable and easy to understand. The objective of this study was to compare the readability of Google search results and ChatGPT answers to frequently asked questions and to assess whether these sources meet NIH recommendations for readability.

QUESTIONS/PURPOSES
(1) What is the readability of ChatGPT-3.5 as a source of patient information for the three most common primary bone malignancies compared with top online resources from Google search? (2) Do ChatGPT-3.5 responses and online resources meet NIH readability guidelines for patient education materials?

METHODS
This was a cross-sectional analysis of the 12 most common online questions about osteosarcoma, chondrosarcoma, and Ewing sarcoma. To be consistent with other studies of similar design that used national society frequently asked questions lists, questions were selected from the American Cancer Society and categorized by content: diagnosis, treatment, and recovery and prognosis. Google was queried with all 36 questions, and the top responses were recorded. Author types, such as hospital systems, national health organizations, or independent researchers, were recorded. ChatGPT-3.5 was given each question in an independent query without further prompting. Responses were assessed with validated reading indices to determine readability by grade level. An independent t-test was performed with significance set at p < 0.05.

RESULTS
Google (n = 36) and ChatGPT-3.5 (n = 36) answers were recorded, 12 for each of the three cancer types. Reading grade levels based on mean readability scores were 11.0 ± 2.9 and 16.1 ± 3.6, respectively, corresponding to an eleventh-grade reading level for Google and a fourth-year undergraduate level for ChatGPT-3.5. Google answers were more readable across all individual indices, with no difference in word count. No difference in readability was present across author type, question category, or cancer type. Of the 72 total responses across both search modalities, none met the NIH readability criterion of a sixth-grade level.

CONCLUSION
Google material was presented at a high school reading level, whereas ChatGPT-3.5 material was at an undergraduate reading level. The readability of both resources was inadequate based on NIH recommendations. Improving readability is crucial for better patient understanding during cancer treatment. Physicians should assess patients' needs, offer them tailored materials, and guide them to reliable resources to prevent reliance on online information that is hard to understand.

LEVEL OF EVIDENCE
Level III, prognostic study.
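The abstract describes scoring each response with validated reading indices and comparing the two sources with an independent t-test, but it does not name a specific toolchain. The following is a minimal Python sketch of that kind of analysis, assuming the Flesch-Kincaid grade level (one commonly used validated index; the study's exact index set is not specified in the abstract) computed with the textstat package and an independent t-test from SciPy. The answer strings are hypothetical placeholders, not the study's data.

from scipy import stats
import textstat

# Hypothetical placeholder answers; the study used 36 Google and 36 ChatGPT-3.5
# responses to questions derived from the American Cancer Society.
google_answers = [
    "Osteosarcoma is a bone cancer. It is most common in teens and young adults.",
    "Doctors usually treat it with chemotherapy and surgery to remove the tumor.",
]
chatgpt_answers = [
    "Osteosarcoma is a malignant osseous neoplasm characterized by production of immature osteoid matrix by proliferating mesenchymal cells.",
    "Management typically entails neoadjuvant chemotherapy, wide surgical resection, and adjuvant systemic therapy.",
]

def grade_levels(texts):
    # One validated readability index per response; a mean across several
    # indices, as reported in the study, could be substituted here.
    return [textstat.flesch_kincaid_grade(t) for t in texts]

google_scores = grade_levels(google_answers)
chatgpt_scores = grade_levels(chatgpt_answers)

# Independent two-sample t-test on per-response grade levels (p < 0.05 threshold).
t_stat, p_value = stats.ttest_ind(google_scores, chatgpt_scores)
print(f"Google mean grade level: {sum(google_scores) / len(google_scores):.1f}")
print(f"ChatGPT-3.5 mean grade level: {sum(chatgpt_scores) / len(chatgpt_scores):.1f}")
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

# NIH recommendation: patient materials at or below a sixth-grade reading level.
n_meeting_nih = sum(s <= 6 for s in google_scores + chatgpt_scores)
print(f"Responses meeting the sixth-grade threshold: {n_meeting_nih}")

This shows only the shape of the computation; the study itself applied several indices across all 72 responses and also compared readability by author type, question category, and cancer type.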
Source Journal
CiteScore: 7.00
Self-citation rate: 11.90%
Articles published: 722
Review time: 2.5 months
Journal description: Clinical Orthopaedics and Related Research® is a leading peer-reviewed journal devoted to the dissemination of new and important orthopaedic knowledge. CORR® brings readers the latest clinical and basic research, along with columns, commentaries, and interviews with authors.