A Comparison of Psychometric Properties of the American Board of Anesthesiology's In-Person and Virtual Standardized Oral Examinations

Mark T Keegan, Ann E Harman, Stacie G Deiner, Huaping Sun

Academic Medicine, 2025, pp. 86-93 (Epub 2024 Jun 7). DOI: 10.1097/ACM.0000000000005782
Citations: 0
Abstract
Purpose: The COVID-19 pandemic prompted training institutions and national credentialing organizations to administer examinations virtually. This study compared task difficulty, examiner grading, candidate performance, and other psychometric properties between in-person and virtual standardized oral examinations (SOEs) administered by the American Board of Anesthesiology.
Method: This retrospective study included SOEs administered in person from March 2018 to March 2020 and virtually from December 2020 to November 2021. The in-person and virtual SOEs share the same structure, comprising 4 tasks: preoperative evaluation, intraoperative management, postoperative care, and additional topics. The Many-Facet Rasch Model was used to estimate candidate performance, examiner grading severity, and task difficulty for the in-person and virtual SOEs separately; the virtual SOE was equated to the in-person SOE through common examiners and all tasks. The independent-samples t test and the partially overlapping-samples t test were used to compare candidate performance and examiner grading severity, respectively, between the 2 formats.
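The Many-Facet Rasch Model places candidates, examiners, and tasks on a common logit scale, so that an observed rating reflects candidate ability net of examiner severity and task difficulty. As a minimal, hedged illustration only (the operational SOE uses polytomous ratings and dedicated calibration software, not this simplified form), a dichotomous three-facet Rasch probability can be sketched in Python:

```python
import math

def rasch_facets_prob(theta: float, severity: float, difficulty: float) -> float:
    """Probability of a favorable rating under a simplified dichotomous
    three-facet Rasch model: candidate ability (theta) minus examiner
    grading severity minus task difficulty, all expressed in logits.
    Illustrative sketch only; the actual SOE scoring is polytomous."""
    logit = theta - severity - difficulty
    return 1.0 / (1.0 + math.exp(-logit))

# Example: a candidate at 2.9 logits rated by an average-severity examiner
# (0.0) on an average-difficulty task (0.0).
print(round(rasch_facets_prob(2.9, 0.0, 0.0), 3))  # ~0.948
```

Because the facets are additive on the logit scale, equating through common examiners and tasks lets candidate measures from the two administration formats be compared directly.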
Results: In-person (n = 3,462) and virtual (n = 2,959) first-time candidates were comparable in age, sex, race and ethnicity, and whether they were U.S. medical school graduates. The mean (standard deviation [SD]) candidate performance was 2.96 (1.76) logits for the virtual SOE, which was statistically significantly better than that for the in-person SOE (mean [SD], 2.86 [1.75]; Welch independent-samples t test, P = .02); however, the effect size was negligible (Cohen d = 0.06). The difference in the grading severity of examiners who rated the in-person (n = 398; mean [SD], 0.00 [0.73]) versus virtual (n = 341; mean [SD], 0.07 [0.77]) SOE was not statistically significant (Welch partially overlapping-samples t test, P = .07).
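The headline comparison can be reproduced from the summary statistics reported above. The following sketch (assuming only the means, SDs, and sample sizes given in the abstract, and using SciPy's summary-statistics form of the Welch t test) recovers the reported P value and Cohen d; the partially overlapping-samples test for examiners is less standard and is not reproduced here.

```python
import math
from scipy.stats import ttest_ind_from_stats

# Summary statistics as reported in the abstract (logits)
m_virtual, sd_virtual, n_virtual = 2.96, 1.76, 2959
m_inperson, sd_inperson, n_inperson = 2.86, 1.75, 3462

# Welch independent-samples t test (unequal variances)
t_stat, p_value = ttest_ind_from_stats(
    m_virtual, sd_virtual, n_virtual,
    m_inperson, sd_inperson, n_inperson,
    equal_var=False,
)

# Cohen's d using the pooled standard deviation
pooled_sd = math.sqrt(
    ((n_virtual - 1) * sd_virtual**2 + (n_inperson - 1) * sd_inperson**2)
    / (n_virtual + n_inperson - 2)
)
cohen_d = (m_virtual - m_inperson) / pooled_sd

print(f"t = {t_stat:.2f}, p = {p_value:.3f}, d = {cohen_d:.2f}")
# Output consistent with the abstract: p ≈ .02, d ≈ 0.06
```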
Conclusions: Candidate performance and examiner grading severity were comparable between the in-person and virtual SOEs, supporting the reliability and validity of the virtual oral exam in this large-volume, high-stakes setting.
About the Journal
Academic Medicine, the official peer-reviewed journal of the Association of American Medical Colleges, acts as an international forum for exchanging ideas, information, and strategies to address the significant challenges in academic medicine. The journal covers areas such as research, education, clinical care, community collaboration, and leadership, with a commitment to serving the public interest.