Salman Hasan, Kyros Ipaktchi, Nicolas Meyer, Philippe Liverneaux
{"title":"使用ChatGPT 4.0的欧美手外科认证考试比较","authors":"Salman Hasan, Kyros Ipaktchi, Nicolas Meyer, Philippe Liverneaux","doi":"10.1016/j.jham.2025.100258","DOIUrl":null,"url":null,"abstract":"<p><p>Certification in hand surgery in Europe (EBHS) and the United States (HSE) requires a subspecialty examination. These exams differ in format, and practice exams, such as those published by the Journal of Hand Surgery (European Volume) and the ASSH, are used for preparation. This study aimed to compare the difficulty of the multiple-choice questions (MCQs) for the EBHS and HSE practice exams under the assumption that European MCQs are more challenging. ChatGPT 4.0 answered 94 MCQs (34 EBHS and 60 HSE practice exams) across five attempts. We excluded MCQs with visual aids. Performance was analyzed both quantitatively (overall and by section) and qualitatively. ChatGPT's scores improved after being provided with correct answers, from 59 % to 71 % for EBHS and 97 % for HSE practice exams by the 5th attempt. The European MCQs proved more difficult, with limited progress (<50 % accuracy up to the 5th attempt), while ChatGPT demonstrated better learning with the HSE questions. The complexity of the European MCQs raises questions about the harmonization of certification standards. ChatGPT can help standardize evaluations, though its performance remains inferior to that of humans. The findings confirm the hypothesis that EBHS MCQs are more challenging than the HSE practice exam.</p><p><strong>Level of evidence: </strong>Exploratory study, level of evidence IV.</p>","PeriodicalId":45368,"journal":{"name":"Journal of Hand and Microsurgery","volume":"17 4","pages":"100258"},"PeriodicalIF":0.3000,"publicationDate":"2025-05-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12133689/pdf/","citationCount":"0","resultStr":"{\"title\":\"Comparison of hand surgery certification exams in Europe and the United States using ChatGPT 4.0.\",\"authors\":\"Salman Hasan, Kyros Ipaktchi, Nicolas Meyer, Philippe Liverneaux\",\"doi\":\"10.1016/j.jham.2025.100258\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Certification in hand surgery in Europe (EBHS) and the United States (HSE) requires a subspecialty examination. These exams differ in format, and practice exams, such as those published by the Journal of Hand Surgery (European Volume) and the ASSH, are used for preparation. This study aimed to compare the difficulty of the multiple-choice questions (MCQs) for the EBHS and HSE practice exams under the assumption that European MCQs are more challenging. ChatGPT 4.0 answered 94 MCQs (34 EBHS and 60 HSE practice exams) across five attempts. We excluded MCQs with visual aids. Performance was analyzed both quantitatively (overall and by section) and qualitatively. ChatGPT's scores improved after being provided with correct answers, from 59 % to 71 % for EBHS and 97 % for HSE practice exams by the 5th attempt. The European MCQs proved more difficult, with limited progress (<50 % accuracy up to the 5th attempt), while ChatGPT demonstrated better learning with the HSE questions. The complexity of the European MCQs raises questions about the harmonization of certification standards. ChatGPT can help standardize evaluations, though its performance remains inferior to that of humans. 
The findings confirm the hypothesis that EBHS MCQs are more challenging than the HSE practice exam.</p><p><strong>Level of evidence: </strong>Exploratory study, level of evidence IV.</p>\",\"PeriodicalId\":45368,\"journal\":{\"name\":\"Journal of Hand and Microsurgery\",\"volume\":\"17 4\",\"pages\":\"100258\"},\"PeriodicalIF\":0.3000,\"publicationDate\":\"2025-05-05\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12133689/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Hand and Microsurgery\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1016/j.jham.2025.100258\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2025/7/1 0:00:00\",\"PubModel\":\"eCollection\",\"JCR\":\"Q4\",\"JCRName\":\"SURGERY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Hand and Microsurgery","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1016/j.jham.2025.100258","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/7/1 0:00:00","PubModel":"eCollection","JCR":"Q4","JCRName":"SURGERY","Score":null,"Total":0}
Comparison of hand surgery certification exams in Europe and the United States using ChatGPT 4.0.
Certification in hand surgery in Europe (EBHS) and the United States (HSE) requires a subspecialty examination. These exams differ in format, and practice exams, such as those published by the Journal of Hand Surgery (European Volume) and the ASSH, are used for preparation. This study aimed to compare the difficulty of the multiple-choice questions (MCQs) on the EBHS and HSE practice exams, under the assumption that the European MCQs are more challenging. ChatGPT 4.0 answered 94 MCQs (34 from the EBHS and 60 from the HSE practice exams) across five attempts. We excluded MCQs with visual aids. Performance was analyzed both quantitatively (overall and by section) and qualitatively. After being provided with the correct answers between attempts, ChatGPT improved its scores, reaching 71 % (from 59 %) on the EBHS and 97 % on the HSE practice exams by the fifth attempt. The European MCQs proved more difficult, with limited progress (<50 % accuracy up to the fifth attempt), while ChatGPT demonstrated better learning on the HSE questions. The complexity of the European MCQs raises questions about the harmonization of certification standards. ChatGPT can help standardize evaluations, though its performance remains inferior to that of humans. The findings confirm the hypothesis that the EBHS MCQs are more challenging than the HSE practice exam.
Level of evidence: Exploratory study, level of evidence IV.
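The abstract describes the protocol only at a high level: five attempts per MCQ, correct answers fed back between attempts, and accuracy reported overall and by section. As a rough illustration of how such repeated-attempt scoring could be tabulated, here is a minimal Python sketch; it is not the authors' code, and the names (MCQ, accuracy_by_attempt) and sample data are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class MCQ:
    exam: str            # source practice exam: "EBHS" or "HSE"
    correct: str         # correct option letter, e.g. "B"
    answers: list[str]   # option chosen by the model on attempts 1..5

def accuracy_by_attempt(questions: list[MCQ]) -> dict[str, list[float]]:
    """For each exam, return the fraction of MCQs answered correctly on each attempt."""
    per_exam: dict[str, list[MCQ]] = defaultdict(list)
    for q in questions:
        per_exam[q.exam].append(q)
    results: dict[str, list[float]] = {}
    for exam, qs in per_exam.items():
        n_attempts = len(qs[0].answers)
        results[exam] = [
            sum(q.answers[i] == q.correct for q in qs) / len(qs)
            for i in range(n_attempts)
        ]
    return results

# Hypothetical usage (the study itself used 34 EBHS and 60 HSE MCQs, excluding
# questions with visual aids, over five attempts):
sample = [
    MCQ("EBHS", "B", ["A", "A", "B", "B", "B"]),
    MCQ("HSE",  "C", ["C", "C", "C", "C", "C"]),
]
print(accuracy_by_attempt(sample))
# {'EBHS': [0.0, 0.0, 1.0, 1.0, 1.0], 'HSE': [1.0, 1.0, 1.0, 1.0, 1.0]}
```

With the real data, the per-attempt accuracies for the two question sets would correspond to the progression the abstract reports (59 % to 71 % for the EBHS and 97 % for the HSE practice exams by the fifth attempt).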