Adequacy of ChatGPT responses to frequently asked questions about shoulder arthroplasty: is it an appropriate adjunct for patient education?

JCR quartile: Q2 (Medicine)
Christopher K. Johnson MD, MS , Krishna Mandalia BS , Jason Corban FRCSC , Kaley E. Beall MPH , Sarav S. Shah MD
{"title":"ChatGPT对有关肩关节置换术的常见问题的充分回应:它是患者教育的适当辅助手段吗?","authors":"Christopher K. Johnson MD, MS ,&nbsp;Krishna Mandalia BS ,&nbsp;Jason Corban FRCSC ,&nbsp;Kaley E. Beall MPH ,&nbsp;Sarav S. Shah MD","doi":"10.1016/j.jseint.2025.01.008","DOIUrl":null,"url":null,"abstract":"<div><h3>Background</h3><div>Artificial intelligence (AI) large language models, such as ChatGPT, have numerous novel applications in medicine, one of which is patient education. Several studies in other specialties have investigated the adequacy of ChatGPT-generated responses to frequently asked questions (FAQs) by patients, with largely positive results. The purpose of this study is to evaluate the accuracy and clarity of ChatGPT-generated responses to website-derived FAQs relating to shoulder arthroplasty.</div></div><div><h3>Methods</h3><div>Ten questions regarding shoulder arthroplasty were compiled from the websites of 5 leading academic institutions. These questions were rated on a scale from 1 to 4, corresponding to “excellent response not requiring clarification,” “satisfactory requiring minimal clarification,” “satisfactory requiring moderate clarification,” and “unsatisfactory requiring substantial clarification,” respectively, by 2 orthopedic surgeons. A senior shoulder arthroplasty surgeon arbitrated disagreements. Cohen’s Kappa coefficient was utilized to assess inter-rater agreement.</div></div><div><h3>Results</h3><div>After arbitration, only one response was rated as “excellent response not requiring clarification.” Nine of 10 responses required clarification. Four were rated as a “satisfactory requiring minimal clarification,” 5 were rated as a “satisfactory requiring moderate clarification,” and none were rated as “unsatisfactory requiring substantial clarification”. The Kappa coefficient was 0.516 (<em>P</em> = .027), indicating moderate agreement between reviewers.</div></div><div><h3>Conclusion</h3><div>When queried with FAQs regarding shoulder arthroplasty, ChatGPT’s responses were all deemed ‘satisfactory’, but most required clarification. This may be due to the nuances of anatomic vs. reverse shoulder replacement. Thus, patients may find benefit in using ChatGPT to guide whether or not they should seek medical attention, but are limited in the detail and accuracy of treatment-related questions. While a helpful tool to start provider–patient conversations, it does not appear that ChatGPT provides quality, verified, data-driven answers at this time, and should be used cautiously in conjunction to provider–patient discussions. Although the use of ChatGPT in answering FAQs is limited at the moment, orthopedic surgeons should continue to monitor the use of ChatGPT as a patient education tool, as well as the expanding use of AI as a possible adjunct in clinical decision-making.</div></div>","PeriodicalId":34444,"journal":{"name":"JSES International","volume":"9 3","pages":"Pages 830-836"},"PeriodicalIF":0.0000,"publicationDate":"2025-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Adequacy of ChatGPT responses to frequently asked questions about shoulder arthroplasty: is it an appropriate adjunct for patient education?\",\"authors\":\"Christopher K. Johnson MD, MS ,&nbsp;Krishna Mandalia BS ,&nbsp;Jason Corban FRCSC ,&nbsp;Kaley E. Beall MPH ,&nbsp;Sarav S. 
Shah MD\",\"doi\":\"10.1016/j.jseint.2025.01.008\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><h3>Background</h3><div>Artificial intelligence (AI) large language models, such as ChatGPT, have numerous novel applications in medicine, one of which is patient education. Several studies in other specialties have investigated the adequacy of ChatGPT-generated responses to frequently asked questions (FAQs) by patients, with largely positive results. The purpose of this study is to evaluate the accuracy and clarity of ChatGPT-generated responses to website-derived FAQs relating to shoulder arthroplasty.</div></div><div><h3>Methods</h3><div>Ten questions regarding shoulder arthroplasty were compiled from the websites of 5 leading academic institutions. These questions were rated on a scale from 1 to 4, corresponding to “excellent response not requiring clarification,” “satisfactory requiring minimal clarification,” “satisfactory requiring moderate clarification,” and “unsatisfactory requiring substantial clarification,” respectively, by 2 orthopedic surgeons. A senior shoulder arthroplasty surgeon arbitrated disagreements. Cohen’s Kappa coefficient was utilized to assess inter-rater agreement.</div></div><div><h3>Results</h3><div>After arbitration, only one response was rated as “excellent response not requiring clarification.” Nine of 10 responses required clarification. Four were rated as a “satisfactory requiring minimal clarification,” 5 were rated as a “satisfactory requiring moderate clarification,” and none were rated as “unsatisfactory requiring substantial clarification”. The Kappa coefficient was 0.516 (<em>P</em> = .027), indicating moderate agreement between reviewers.</div></div><div><h3>Conclusion</h3><div>When queried with FAQs regarding shoulder arthroplasty, ChatGPT’s responses were all deemed ‘satisfactory’, but most required clarification. This may be due to the nuances of anatomic vs. reverse shoulder replacement. Thus, patients may find benefit in using ChatGPT to guide whether or not they should seek medical attention, but are limited in the detail and accuracy of treatment-related questions. While a helpful tool to start provider–patient conversations, it does not appear that ChatGPT provides quality, verified, data-driven answers at this time, and should be used cautiously in conjunction to provider–patient discussions. 
Although the use of ChatGPT in answering FAQs is limited at the moment, orthopedic surgeons should continue to monitor the use of ChatGPT as a patient education tool, as well as the expanding use of AI as a possible adjunct in clinical decision-making.</div></div>\",\"PeriodicalId\":34444,\"journal\":{\"name\":\"JSES International\",\"volume\":\"9 3\",\"pages\":\"Pages 830-836\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2025-05-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"JSES International\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S2666638325000337\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"Medicine\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"JSES International","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2666638325000337","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"Medicine","Score":null,"Total":0}
Citations: 0

Abstract


Background

Artificial intelligence (AI) large language models, such as ChatGPT, have numerous novel applications in medicine, one of which is patient education. Several studies in other specialties have investigated the adequacy of ChatGPT-generated responses to frequently asked questions (FAQs) by patients, with largely positive results. The purpose of this study is to evaluate the accuracy and clarity of ChatGPT-generated responses to website-derived FAQs relating to shoulder arthroplasty.

Methods

Ten questions regarding shoulder arthroplasty were compiled from the websites of 5 leading academic institutions. ChatGPT's responses to these questions were rated by 2 orthopedic surgeons on a scale from 1 to 4, corresponding to "excellent response not requiring clarification," "satisfactory requiring minimal clarification," "satisfactory requiring moderate clarification," and "unsatisfactory requiring substantial clarification," respectively. A senior shoulder arthroplasty surgeon arbitrated disagreements, and Cohen's Kappa coefficient was used to assess inter-rater agreement.
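To make the agreement analysis concrete, the sketch below computes Cohen's Kappa with scikit-learn. This is a minimal illustration assuming Python; the two rating vectors are hypothetical placeholders on the paper's 1–4 scale, not the study's data, so the printed value will not match the reported statistic.

```python
# Hedged sketch: Cohen's Kappa for two raters, as used for the study's
# inter-rater agreement analysis. The rating vectors are hypothetical.
from sklearn.metrics import cohen_kappa_score

# Placeholder scores from two raters for 10 ChatGPT responses
# (1 = excellent, no clarification needed ... 4 = unsatisfactory).
surgeon_1 = [1, 2, 2, 3, 3, 2, 3, 3, 2, 3]
surgeon_2 = [1, 2, 3, 3, 3, 2, 2, 3, 2, 2]

kappa = cohen_kappa_score(surgeon_1, surgeon_2)
print(f"Cohen's Kappa: {kappa:.3f}")
```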

Results

After arbitration, only one response was rated as "excellent response not requiring clarification." Nine of 10 responses required clarification: four were rated "satisfactory requiring minimal clarification," five were rated "satisfactory requiring moderate clarification," and none were rated "unsatisfactory requiring substantial clarification." The Kappa coefficient was 0.516 (P = .027), indicating moderate agreement between reviewers.
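For context on the "moderate" reading, recall the standard definition of Cohen's Kappa (background fact, not taken from the paper):

\[ \kappa = \frac{p_o - p_e}{1 - p_e} \]

where p_o is the observed proportion of agreement between the two raters and p_e is the agreement expected by chance given each rater's marginal rating frequencies. Under the commonly cited Landis and Koch benchmarks, values of 0.41–0.60 indicate "moderate" agreement, which is where the reported κ = 0.516 falls.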

Conclusion

When queried with FAQs regarding shoulder arthroplasty, ChatGPT produced responses that were all deemed "satisfactory," but most required clarification. This may be due to the nuances of anatomic vs. reverse shoulder replacement. Thus, patients may benefit from using ChatGPT to guide whether or not they should seek medical attention, but the detail and accuracy of its answers to treatment-related questions are limited. While ChatGPT is a helpful tool for starting provider–patient conversations, it does not appear to provide quality, verified, data-driven answers at this time and should be used cautiously, in conjunction with provider–patient discussions. Although the use of ChatGPT in answering FAQs is limited at the moment, orthopedic surgeons should continue to monitor the use of ChatGPT as a patient education tool, as well as the expanding use of AI as a possible adjunct in clinical decision-making.
Source journal: JSES International (Medicine–Surgery)
CiteScore: 2.80
Self-citation rate: 0.00%
Articles published: 174
Review turnaround: 14 weeks