Global variations in artificial intelligence-generated information on juvenile idiopathic arthritis.

IF 4.7 | Zone 2, Medicine | JCR Q1, Rheumatology
Saverio La Bella, Deniz Bayraktar, Annamaria Porreca, Linda C Li, Marina Attanasi, Emil Aliyev, Angela Nyangore Migowa, Christiaan Scott, Darpan R Thakare, Yagmur Bayindir, Alessandro Consolaro, Brian M Feldman, Seza Ozen
Rheumatology. Journal Article. Published 2025-06-12. DOI: 10.1093/rheumatology/keaf329 (https://doi.org/10.1093/rheumatology/keaf329)

Abstract

Objectives: We aimed to evaluate similarities and variations in the information provided by large language models (LLMs) across diverse world regions by analyzing responses to validated questions on oligoarticular juvenile idiopathic arthritis (oJIA).

Methods: The ten PICO questions on oJIA treatment from the 2021 American College of Rheumatology recommendations were submitted simultaneously, in English, to ChatGPT 4o from five countries (Canada, India, Italy, Kenya and Türkiye). Readability was assessed with the Flesch Reading Ease Score (FRES), and the distinctiveness of terms with Term Frequency-Inverse Document Frequency (TF-IDF) analysis. Co-occurrence networks (CONs) mapped the relationships between terms. Three experts rated the adherence of the responses to the recommendations on a Likert-like scale.
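For readers unfamiliar with the readability metric named in the Methods, the following is a minimal Python sketch of the standard Flesch Reading Ease formula, 206.835 - 1.015 × (words per sentence) - 84.6 × (syllables per word). It is not the authors' pipeline: the abstract does not say which tool computed the FRES, and the vowel-group syllable counter used here (`count_syllables`) is a crude assumption made purely for illustration. On the conventional Flesch scale, lower scores mean harder text.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): one syllable per run of vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words)."""
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Treat runs of letters (with an optional apostrophe part) as words.
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


if __name__ == "__main__":
    # Hypothetical sample sentences, not taken from the study's responses.
    simple = "The cat sat."
    clinical = "Intraarticular glucocorticoid injections are recommended."
    print(flesch_reading_ease(simple))
    print(flesch_reading_ease(clinical))
```

Long sentences and polysyllabic clinical vocabulary both pull the score down, which is consistent with guideline-style answers landing in the difficult range.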

Results: All responses were difficult or very difficult to read, with a median FRES of 30 [24-34]. Depending on the expert, 52% to 84% of responses were mostly or fully adherent to the recommendations, with similar adherence rates across countries. No response was completely non-adherent. Inter-rater agreement on the adherence of LLM-generated responses was generally weak (Kappa values mostly below 0.40), highlighting the challenges of consistently evaluating AI-generated medical information. The TF-IDF analysis showed that the distinctiveness of terminology in LLM-generated responses varied across countries, with scores ranging from 0.60 to 0.85. CONs revealed a strong focus on intra-articular corticosteroid treatments in Italy and an emphasis on short- and long-term outcomes in Kenya.
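To make the agreement figure concrete, here is a small Python sketch of a kappa statistic. The abstract does not state which kappa variant the authors used for their three raters, so this assumes unweighted Cohen's kappa computed for one pair of raters; the function name `cohen_kappa` and the example ratings are hypothetical and do not come from the study.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa: (p_observed - p_expected) / (1 - p_expected)."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters gave the same score.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical category; kappa is undefined.
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Made-up Likert-like adherence scores for ten responses (illustration only).
    expert_1 = [4, 4, 3, 4, 2, 3, 4, 3, 4, 4]
    expert_2 = [4, 3, 3, 4, 3, 3, 4, 2, 3, 4]
    print(cohen_kappa(expert_1, expert_2))
```

Because kappa subtracts chance agreement, two raters can agree on most items and still obtain a low value when both use one category for the majority of responses.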

Conclusion: LLM-generated content should be critically evaluated in clinical practice, especially in the context of regional differences.

Source journal
Rheumatology (Medicine: Rheumatology)
CiteScore: 9.40
Self-citation rate: 7.30%
Articles published: 1091
Review time: 2 months
About the journal: Rheumatology strives to support research and discovery by publishing the highest quality original scientific papers with a focus on basic, clinical and translational research. The journal's subject areas cover a wide range of paediatric and adult rheumatological conditions from an international perspective. It is an official journal of the British Society for Rheumatology, published by Oxford University Press. Rheumatology publishes original articles, reviews, editorials, guidelines, concise reports, meta-analyses, original case reports, clinical vignettes, letters and matters arising from published material. The journal takes pride in serving the global rheumatology community, with a focus on high societal impact in the form of podcasts, videos and extended social media presence, and utilizing metrics such as Altmetric. Keep up to date by following the journal on Twitter @RheumJnl.