Muhammad Anees, Fareed Ahmed Shaikh, Hafsah Shaikh, Nadeem Ahmed Siddiqui, Zia Ur Rehman
Journal of Vascular Surgery: Venous and Lymphatic Disorders · DOI: 10.1016/j.jvsv.2024.101985 · Published 2024-09-25 · Impact factor 2.8 · JCR Q2 (Peripheral Vascular Disease)
Assessing the quality of ChatGPT's responses to questions related to radiofrequency ablation for varicose veins.
Objective: This study aimed to evaluate the accuracy and reproducibility of information provided by ChatGPT in response to frequently asked questions about radiofrequency ablation (RFA) for varicose veins.
Methods: This cross-sectional study was conducted at The Aga Khan University Hospital, Karachi, Pakistan. A set of 18 frequently asked questions regarding RFA for varicose veins was compiled from credible online sources and presented to ChatGPT twice, each time in a separate new chat session. Twelve experienced vascular surgeons (each with >2 years of experience and ≥20 RFA procedures performed annually) independently evaluated the accuracy of the responses using a 4-point Likert scale and assessed their reproducibility.
Results: Most evaluators were male (n = 10/12 [83.3%]), with an average of 12.3 ± 6.2 years of experience as a vascular surgeon. Six evaluators (50.0%) were from the United Kingdom, followed by three from Saudi Arabia (25.0%), two from Pakistan (16.7%), and one from the United States (8.3%). Among the 216 accuracy grades, most evaluators graded the responses as comprehensive (n = 87/216 [40.3%]) or accurate but insufficient (n = 70/216 [32.4%]), whereas only 17.1% (n = 37/216) were graded as a mixture of accurate and inaccurate information and 10.2% (n = 22/216) as entirely inaccurate. Overall, 89.8% of the responses (n = 194/216) were deemed reproducible. Of the total responses, 70.4% (n = 152/216) were classified as good quality and reproducible. The remaining responses were of poor quality, with 19.4% reproducible (n = 42/216) and 10.2% nonreproducible (n = 22/216). Inter-rater disagreement among the vascular surgeons for overall responses was nonsignificant (Fleiss' kappa, -0.028; P = .131).
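The inter-rater agreement statistic reported above, Fleiss' kappa, is computed from a subjects-by-categories count matrix (here, 18 questions × 4 Likert categories, each rated by 12 surgeons). The following is a minimal illustrative sketch, not the authors' analysis code; the small input matrix below is hypothetical, not the study's data:

```python
from typing import Sequence

def fleiss_kappa(ratings: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a subjects x categories count matrix.

    ratings[i][j] = number of raters who assigned subject i to category j.
    Every row is assumed to sum to the same number of raters n.
    """
    N = len(ratings)           # number of subjects (e.g., 18 questions)
    n = sum(ratings[0])        # raters per subject (e.g., 12 surgeons)
    k = len(ratings[0])        # number of categories (e.g., 4 Likert grades)
    # Proportion of all assignments falling in each category.
    p = [sum(row[j] for row in ratings) / (N * n) for j in range(k)]
    # Observed agreement for each subject.
    P = [(sum(c * c for c in row) - n) / (n * (n - 1)) for row in ratings]
    P_bar = sum(P) / N              # mean observed agreement
    P_e = sum(pj * pj for pj in p)  # agreement expected by chance
    return (P_bar - P_e) / (1 - P_e)

# Hypothetical toy example: 3 subjects, 2 categories, 3 raters each.
kappa = fleiss_kappa([[3, 0], [0, 3], [3, 0]])  # perfect agreement -> 1.0
```

A kappa near zero, as reported in the study (-0.028), indicates agreement no better than chance across raters, which is consistent with the "limited inter-rater reliability" noted in the conclusions.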
Conclusions: ChatGPT provided generally accurate and reproducible information on RFA for varicose veins; however, variability in response quality and limited inter-rater reliability highlight the need for further improvements. Although it has the potential to enhance patient education and support healthcare decision-making, improvements in its training, validation, transparency, and mechanisms to address inaccurate or incomplete information are essential.
Journal introduction:
Journal of Vascular Surgery: Venous and Lymphatic Disorders is one of a series of specialist journals launched by the Journal of Vascular Surgery. It aims to be the premier international journal for the medical, endovascular, and surgical management of venous and lymphatic disorders. It publishes high-quality clinical and research articles, case reports, techniques, and practice manuscripts related to all aspects of venous and lymphatic disorders, including malformations and wound care, with an emphasis on the practicing clinician. The journal seeks to provide novel and timely information to vascular surgeons, interventionalists, phlebologists, wound care specialists, and allied health professionals who treat patients presenting with vascular and lymphatic disorders. As the official publication of the Society for Vascular Surgery and the American Venous Forum, the journal publishes, after peer review, selected papers presented at the annual meetings of these organizations and affiliated vascular societies, as well as original articles from members and non-members.