Accessing information provided via artificial intelligence regarding reverse and anatomic total shoulder arthroplasty
Suhasini Gupta BS, Brett D. Haislup MD, Alayna K. Vaughan MD, Ryan A. Hoffman MD, Anand M. Murthi MD
Seminars in Arthroplasty, 35(1), Pages 56-61. Journal Article. Published 2024-09-27. DOI: 10.1053/j.sart.2024.09.001
https://www.sciencedirect.com/science/article/pii/S1045452724001007
Abstract
Background
The purpose of this study was to analyze the quality, accuracy, and readability of information provided by the artificial intelligence (AI) interface ChatGPT (OpenAI, San Francisco). We queried ChatGPT with questions commonly asked by patients regarding anatomic total shoulder arthroplasty (aTSA) and reverse total shoulder arthroplasty (rTSA).
Methods
ChatGPT was used to answer 30 questions commonly asked by patients regarding aTSA and rTSA, entered as “total shoulder replacement” and “reverse shoulder replacement”. These questions were categorized according to the Rothwell criteria into Fact, Policy, and Value. The generated answers were analyzed for quality, accuracy, and readability using the DISCERN scale, the JAMA benchmark criteria, the Flesch-Kincaid Reading Ease Score (FRES), and the Flesch-Kincaid Grade Level (FKGL).
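For readers unfamiliar with the two readability measures, the following is a minimal sketch of the standard published Flesch Reading Ease and Flesch-Kincaid Grade Level formulas. The abstract does not state which tool was used to compute the scores, so the function names and the example counts below are illustrative assumptions, not the study's method. The functions take pre-computed word, sentence, and syllable counts, because syllable counting varies between tools.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch Reading Ease formula; higher scores mean easier text."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch-Kincaid Grade Level formula; approximates a US school grade."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not data from the study).
    words, sentences, syllables = 100, 5, 150
    print(round(flesch_reading_ease(words, sentences, syllables), 2))
    print(round(flesch_kincaid_grade(words, sentences, syllables), 2))
```

Both formulas depend only on average sentence length and average syllables per word, which is why long sentences and polysyllabic medical terms push the Reading Ease score down and the Grade Level up.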
Results
For both rTSA and aTSA, the DISCERN score was 57 for Fact questions, 61 for Policy questions, and 58 for Value questions (all were considered “good”). The JAMA benchmark criteria score was 0, the lowest possible score, for Fact, Policy, and Value questions for both rTSA and aTSA. The FRES for the aTSA answers was 15.15 for Fact, 11.14 for Policy, and 10.95 for Value questions. The FRES for the rTSA answers was 48.02 for Fact, 12.51 for Policy, and 17.22 for Value questions. The FKGL for the aTSA answers was 17.48 for Fact, 17.72 for Policy, and 17.96 for Value questions. The FKGL for the rTSA answers was 8.10 for Fact, 17.27 for Policy, and 16.56 for Value questions.
Conclusion
Overall, the quality of the answers provided by the open AI model ChatGPT was considered “good.” The information provided had lower reliability and lacked information regarding funding and disclosures. Most of the information generated by ChatGPT was also found to have the readability of “academic level text”, while Fact-related information on reverse shoulder arthroplasty was found to have the readability of a 9th grade level, which may be too complex for most patients. Overall, these results indicate that ChatGPT can provide correct answers to questions about aTSA and rTSA, although we would caution patients against relying on this resource because of the lack of citations and the complexity of the output that ChatGPT provides. Importantly, all answers provided by the AI suggested reaching out to physicians for more accurate and personalized advice, to factor into the shared decision-making model.
About the journal:
Each issue of Seminars in Arthroplasty provides a comprehensive, current overview of a single topic in arthroplasty. The journal addresses orthopedic surgeons, providing authoritative reviews with emphasis on new developments relevant to their practice.