The Role of Claude 3.5 Sonet and ChatGPT-4 in Posterior Cervical Fusion Patient Guidance.

IF 1.9, Zone 4 (Medicine), JCR Q3 (CLINICAL NEUROLOGY)

World Neurosurgery, Pub Date: 2025-03-11, DOI: 10.1016/j.wneu.2025.123889

Rauf Nasirov


Citations: 0

Abstract

Background: This study evaluates the role of ChatGPT-4 and Claude 3.5 Sonnet in postoperative management for patients undergoing posterior cervical fusion. It focuses on their ability to provide accurate, clear, and relevant responses to patient concerns, highlighting their potential as supplementary tools in surgical aftercare.

Methods: Ten common postoperative questions were selected and posed to ChatGPT-4 and Claude 3.5 Sonnet. Ten independent neurosurgeons evaluated the responses using a structured framework that assessed accuracy, response time, clarity, and relevance. A 5-point Likert scale was also used to measure satisfaction, quality, performance, and importance. The two AI platforms were compared using statistical analyses that included sensitivity, specificity, p-values, confidence intervals, and Cohen's d.
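
As an illustration of the effect-size measure named in the Methods, the following is a minimal sketch of Cohen's d for two independent groups of ratings, using the pooled standard deviation. The abstract does not say which variant of Cohen's d was used, and the ratings below are hypothetical values invented for this example; they are not data from the study.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled standard deviation.

    Assumption: the pooled-SD form is used here for illustration only;
    the abstract does not state which form the study applied.
    """
    n_a, n_b = len(a), len(b)
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


# Hypothetical 5-point Likert ratings from ten raters (NOT study data).
model_a = [5, 5, 4, 5, 4, 5, 5, 4, 5, 5]
model_b = [4, 3, 4, 3, 4, 4, 3, 4, 3, 4]

d = cohens_d(model_a, model_b)
print(f"Cohen's d = {d:.2f}")
```

By the common rule of thumb, values of d around 0.8 or above are described as large effects, which is the sense in which the Results refer to "large effect sizes".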

Results: Claude 3.5 Sonnet outperformed ChatGPT-4 across all metrics, particularly in accuracy (96.5% vs. 80.6%), response time (92.9% vs. 76.4%), clarity (94.6% vs. 75.4%), and relevance (95.5% vs. 74.0%). Likert scale evaluations showed significant differences (p < 0.001) in satisfaction, quality, and performance, with Claude achieving higher ratings. Statistical analyses confirmed large effect sizes, high inter-rater reliability (kappa: 0.85-0.92 for Claude), and narrower confidence intervals, reinforcing Claude's consistency and superior performance.
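
To make the inter-rater reliability figure concrete, here is a minimal sketch of Cohen's kappa for a single pair of raters. This rests on assumptions: the abstract reports a kappa range but does not say whether it is pairwise Cohen's kappa, Fleiss' kappa across all ten raters, or another variant. The labels below are hypothetical and are not data from the study.

```python
from collections import Counter


def cohens_kappa(rater_1, rater_2):
    """Cohen's kappa: agreement between two raters, corrected for chance.

    Assumption: a two-rater, unweighted kappa, shown for illustration only.
    Undefined (division by zero) when chance agreement equals 1.
    """
    n = len(rater_1)
    # Observed agreement: fraction of items both raters labeled identically.
    observed = sum(x == y for x, y in zip(rater_1, rater_2)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_1, counts_2 = Counter(rater_1), Counter(rater_2)
    labels = counts_1.keys() | counts_2.keys()
    expected = sum(counts_1[k] * counts_2[k] for k in labels) / n**2
    return (observed - expected) / (1 - expected)


# Hypothetical judgments on ten responses (NOT study data).
rater_1 = ["ok", "ok", "ok", "ok", "ok", "ok", "bad", "bad", "bad", "bad"]
rater_2 = ["ok", "ok", "ok", "ok", "ok", "bad", "bad", "bad", "bad", "bad"]

kappa = cohens_kappa(rater_1, rater_2)
print(f"kappa = {kappa:.2f}")
```

Kappa equals 1 for perfect agreement and 0 for agreement no better than chance, so values in the reported 0.85-0.92 range indicate raters who agreed closely.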

Conclusion: Claude 3.5 Sonnet demonstrated exceptional capability in addressing postoperative concerns for posterior cervical fusion patients, surpassing ChatGPT-4 in accuracy, clarity, and practical relevance. These findings underscore its potential as a reliable AI tool for enhancing patient care and satisfaction in surgical aftercare.


Source journal

World Neurosurgery (CLINICAL NEUROLOGY-SURGERY)

CiteScore: 3.90
Self-citation rate: 15.00%
Articles published: 1765
Review time: 47 days

About the journal: World Neurosurgery has an open access mirror journal, World Neurosurgery: X, sharing the same aims and scope, editorial team, submission system, and rigorous peer review. The journal's mission is:
- To provide a first-class international forum and a two-way conduit for dialogue that is relevant to neurosurgeons and providers who care for neurosurgery patients. The categories of exchanged information include clinical and basic science, as well as global information that provides social, political, educational, economic, cultural, or societal insights and knowledge of significance and relevance to worldwide neurosurgery patient care.
- To act as a primary intellectual catalyst for the stimulation of creativity, the creation of new knowledge, and the enhancement of quality neurosurgical care worldwide.
- To provide a forum for communication that enriches the lives of all neurosurgeons and their colleagues and, in so doing, enriches the lives of their patients.
Topics to be addressed in World Neurosurgery include: EDUCATION, ECONOMICS, RESEARCH, POLITICS, HISTORY, CULTURE, CLINICAL SCIENCE, LABORATORY SCIENCE, TECHNOLOGY, OPERATIVE TECHNIQUES, CLINICAL IMAGES, VIDEOS