Comparison of hand surgery certification exams in Europe and the United States using ChatGPT 4.0.

Journal of Hand and Microsurgery (IF 0.3, Q4, Surgery)
Pub Date: 2025-05-05 | eCollection Date: 2025-07-01 | DOI: 10.1016/j.jham.2025.100258
Salman Hasan, Kyros Ipaktchi, Nicolas Meyer, Philippe Liverneaux
{"title":"Comparison of hand surgery certification exams in Europe and the United States using ChatGPT 4.0.","authors":"Salman Hasan, Kyros Ipaktchi, Nicolas Meyer, Philippe Liverneaux","doi":"10.1016/j.jham.2025.100258","DOIUrl":null,"url":null,"abstract":"<p><p>Certification in hand surgery in Europe (EBHS) and the United States (HSE) requires a subspecialty examination. These exams differ in format, and practice exams, such as those published by the Journal of Hand Surgery (European Volume) and the ASSH, are used for preparation. This study aimed to compare the difficulty of the multiple-choice questions (MCQs) for the EBHS and HSE practice exams under the assumption that European MCQs are more challenging. ChatGPT 4.0 answered 94 MCQs (34 EBHS and 60 HSE practice exams) across five attempts. We excluded MCQs with visual aids. Performance was analyzed both quantitatively (overall and by section) and qualitatively. ChatGPT's scores improved after being provided with correct answers, from 59 % to 71 % for EBHS and 97 % for HSE practice exams by the 5th attempt. The European MCQs proved more difficult, with limited progress (<50 % accuracy up to the 5th attempt), while ChatGPT demonstrated better learning with the HSE questions. The complexity of the European MCQs raises questions about the harmonization of certification standards. ChatGPT can help standardize evaluations, though its performance remains inferior to that of humans. The findings confirm the hypothesis that EBHS MCQs are more challenging than the HSE practice exam.</p><p><strong>Level of evidence: </strong>Exploratory study, level of evidence IV.</p>","PeriodicalId":45368,"journal":{"name":"Journal of Hand and Microsurgery","volume":"17 4","pages":"100258"},"PeriodicalIF":0.3000,"publicationDate":"2025-05-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12133689/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Hand and Microsurgery","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1016/j.jham.2025.100258","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/7/1 0:00:00","PubModel":"eCollection","JCR":"Q4","JCRName":"SURGERY","Score":null,"Total":0}
引用次数: 0

Abstract

Certification in hand surgery in Europe (EBHS) and the United States (HSE) requires passing a subspecialty examination. These exams differ in format, and practice exams, such as those published by the Journal of Hand Surgery (European Volume) and the ASSH, are used for preparation. This study compared the difficulty of the multiple-choice questions (MCQs) in the EBHS and HSE practice exams, under the hypothesis that the European MCQs are more challenging. ChatGPT 4.0 answered 94 text-only MCQs (34 from the EBHS and 60 from the HSE practice exams) across five attempts; MCQs with visual aids were excluded. Performance was analyzed both quantitatively (overall and by section) and qualitatively. ChatGPT's scores improved after it was provided with the correct answers, reaching 71 % (from 59 %) on the EBHS practice exam and 97 % on the HSE practice exam by the 5th attempt. The European MCQs proved more difficult, with limited progress (<50 % accuracy up to the 5th attempt), while ChatGPT demonstrated better learning on the HSE questions. The difficulty of the European MCQs raises questions about the harmonization of certification standards. ChatGPT can help standardize evaluations, though its performance remains inferior to that of humans. The findings confirm the hypothesis that the EBHS MCQs are more challenging than those of the HSE practice exam.
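
The abstract does not publish the exact prompting and scoring protocol. As a rough illustration of the study's design (text-only MCQs, five attempts, correct answers revealed between attempts), the following Python sketch scores a question bank over repeated attempts. All names here (MCQ, run_attempts, ask_model) and the feedback format are assumptions for illustration, not the authors' method; a real run would plug a chat-model API call into ask_model.

from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class MCQ:
    qid: str
    stem: str      # question text plus lettered options
    correct: str   # e.g. "B"

def run_attempts(
    bank: List[MCQ],
    ask_model: Callable[[str], str],   # returns a single letter, e.g. "C"
    n_attempts: int = 5,
) -> List[float]:
    """Score a question bank over repeated attempts.

    After each attempt, the correct answer for every missed question is
    appended to that question's stem, mirroring the study's design of
    re-testing after the model has been shown the correct answers.
    """
    feedback: Dict[str, str] = {}   # qid -> revealed correct answer
    accuracies: List[float] = []
    for _ in range(n_attempts):
        correct_count = 0
        for q in bank:
            prompt = q.stem
            if q.qid in feedback:
                prompt += f"\n(Previously revealed correct answer: {feedback[q.qid]})"
            answer = ask_model(prompt).strip().upper()[:1]
            if answer == q.correct:
                correct_count += 1
            else:
                feedback[q.qid] = q.correct
        accuracies.append(correct_count / len(bank))
    return accuracies

if __name__ == "__main__":
    # Stub backend for demonstration; a real run would call a chat-model API here.
    demo_bank = [MCQ("q1", "Which nerve ...? A) Ulnar B) Median C) Radial D) AIN", "B")]
    print(run_attempts(demo_bank, ask_model=lambda p: "B"))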

Level of evidence: Exploratory study, level of evidence IV.
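
The abstract reports quantitative analysis overall and by section but does not name the statistical test used. For two question banks of this size (34 vs. 60 items), one standard way to compare the proportions of correctly answered questions is Fisher's exact test; the counts below are placeholders for illustration, not the paper's data.

# Illustrative only: these counts are hypothetical, not the paper's results.
from scipy.stats import fisher_exact

ebhs_correct, ebhs_total = 17, 34   # hypothetical EBHS result
hse_correct, hse_total = 45, 60     # hypothetical HSE result

table = [
    [ebhs_correct, ebhs_total - ebhs_correct],
    [hse_correct, hse_total - hse_correct],
]
odds_ratio, p_value = fisher_exact(table)
print(f"odds ratio = {odds_ratio:.2f}, p = {p_value:.4f}")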
