Evaluating the Performance of Artificial Intelligence for Improving Readability of Online English- and Spanish-Language Orthopaedic Patient Educational Material: Challenges in Bridging the Digital Divide.

IF 4.4, Zone 1 (Medicine), JCR Q1 ORTHOPEDICS

Journal of Bone and Joint Surgery, American Volume Pub Date : 2025-04-16 Epub Date: 2025-02-28 DOI:10.2106/JBJS.24.01078

Carrie N Reaver, Daniel E Pereira, Elisa V Carrillo, Carolena Rojas Marcos, Charles A Goldfarb


Citations: 0

Abstract

Background: The readability of most online patient educational materials (OPEMs) in orthopaedic surgery is above the American Medical Association/National Institutes of Health recommended reading level of sixth grade for both English- and Spanish-language content. The current project evaluates ChatGPT's performance across English- and Spanish-language orthopaedic OPEMs when prompted to rewrite the material at a sixth-grade reading level.

Methods: We performed a cross-sectional study evaluating the readability of 57 English- and 56 Spanish-language publicly available OPEMs found by querying online in both English and Spanish for 6 common orthopaedic procedures. Five distinct, validated readability tests were used to score the OPEMs before and after ChatGPT 4.0 was prompted to rewrite the OPEMs at a sixth-grade reading level. We compared the averages of each readability test, the cumulative average reading grade level, average total word count, average number of complex words (defined as ≥3 syllables), and average number of long sentences (defined as >22 words) between original content and ChatGPT-rewritten content for both languages using paired t tests.
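The two count-based measures defined in the Methods (complex words with ≥3 syllables, long sentences with >22 words) can be illustrated with a minimal Python sketch. This is not the authors' tooling: the sentence splitter, the word pattern, and the English-only syllable heuristic are assumptions made for illustration, and the function names are hypothetical. Spanish text would need its own syllable rules and accented-letter handling.

```python
import re

def split_sentences(text):
    # Naive splitter (assumption): break after ., !, or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def words_in(text):
    # ASCII letters and apostrophes only; accented Spanish letters are not handled.
    return re.findall(r"[A-Za-z']+", text)

def count_syllables(word):
    # Rough English heuristic (assumption): count vowel groups and
    # discount a trailing silent "e" (but not "-le" or "-ee").
    w = word.lower()
    n = len(re.findall(r"[aeiouy]+", w))
    if w.endswith("e") and not w.endswith(("le", "ee")) and n > 1:
        n -= 1
    return max(n, 1)

def readability_counts(text, long_sentence_words=22, complex_syllables=3):
    # Thresholds follow the abstract: long sentence is >22 words,
    # complex word is >=3 syllables.
    sentences = split_sentences(text)
    words = words_in(text)
    return {
        "words": len(words),
        "sentences": len(sentences),
        "long_sentences": sum(
            1 for s in sentences if len(words_in(s)) > long_sentence_words
        ),
        "complex_words": sum(
            1 for w in words if count_syllables(w) >= complex_syllables
        ),
    }

# Made-up sample text, not taken from any studied OPEM.
sample = "The surgeon repairs the ligament. You can go home the same day."
counts = readability_counts(sample)
```

Running the same function on an original passage and on its rewritten version gives the before/after pairs that a paired comparison needs. A validated readability tool would use a dictionary-based or language-specific syllable counter rather than this heuristic.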

Results: The cumulative average reading grade level of original English- and Spanish-language OPEMs was 9.6 ± 2.6 and 9.5 ± 1.5, respectively. ChatGPT significantly lowered the reading grade level (improved comprehension) to 7.7 ± 1.9 (95% CI of difference, 1.68 to 2.15; p < 0.05) for English-language content and 8.3 ± 1.3 (95% CI, 1.17 to 1.45; p < 0.05) for Spanish-language content. English-language OPEMs saw a reduction of 2.0 ± 1.8 grade levels, whereas Spanish-language OPEMs saw a reduction of 1.5 ± 1.2 grade levels. Word count, use of complex words, and long sentences were also reduced significantly in both languages while still maintaining high accuracy and similarity compared with original content.
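For readers unfamiliar with the paired comparison behind these figures, the following Python sketch shows how a mean reduction and its paired t statistic are computed from before/after scores of the same documents. It uses only the standard library, the function name `paired_t` is hypothetical, and the input values are made up for illustration; they are not study data. Obtaining the p value and confidence interval additionally requires the t distribution, which a statistics package would supply.

```python
import math
import statistics

def paired_t(before, after):
    # Paired t test on per-document differences (before minus after).
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [b - a for b, a in zip(before, after)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD of the differences
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    se = sd_diff / math.sqrt(len(diffs))  # standard error of the mean difference
    return {
        "mean_diff": mean_diff,
        "sd_diff": sd_diff,
        "t": mean_diff / se,
        "df": len(diffs) - 1,
    }

# Made-up grade levels for four hypothetical documents, not study data.
original_grades = [10.0, 9.0, 11.0, 8.0]
rewritten_grades = [8.0, 8.0, 9.0, 7.0]
result = paired_t(original_grades, rewritten_grades)
```

The "reduction ± SD" figures in the Results correspond to `mean_diff` and `sd_diff` in this sketch, computed over all documents in one language.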

Conclusions: Our study supports the potential of artificial intelligence as a low-cost, accessible tool to assist health professionals in improving the readability of orthopaedic OPEMs in both English and Spanish.




Source journal

Journal of Bone and Joint Surgery, American Volume (Medicine: Surgery)

CiteScore: 8.90

Self-citation rate: 7.50%

Articles published: 660

Review time: 1 month

About the journal: The Journal of Bone & Joint Surgery (JBJS) has been the most valued source of information for orthopaedic surgeons and researchers for over 125 years and is the gold standard in peer-reviewed scientific information in the field. A core journal and essential reading for general as well as specialist orthopaedic surgeons worldwide, The Journal publishes evidence-based research to enhance the quality of care for orthopaedic patients. Standards of excellence and high quality are maintained in everything we do, from the science of the content published to the customer service we provide. JBJS is an independent, non-profit journal.