Applying generalized theory to optimize the quality of high-stakes objective structured clinical examinations for undergraduate medical students: experience from the French medical school.
{"title":"Applying generalized theory to optimize the quality of high-stakes objective structured clinical examinations for undergraduate medical students: experience from the French medical school.","authors":"Eva Feigerlova","doi":"10.1186/s12909-025-07255-y","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>The national OSCE examination has recently been adopted in France as a prerequisite for medical students to enter accredited graduate education programs. However, the reliability and generalizability of OSCE scores are not well explored taking into account the national examination blueprint.</p><p><strong>Method: </strong>To obtain complementary information for monitoring and improving the quality of the OSCE we performed a pilot study applying generalizability (G-)theory on a sample of 6th-year undergraduate medical students (n = 73) who were assessed by 24 examiner pairs at three stations. Based on the national blueprint, three different scoring subunits (a dichotomous task-specific checklist evaluating clinical skills and behaviorally anchored scales evaluating generic skills and a global performance scale) were used to evaluate students and combined into a station score. A variance component analysis was performed using mixed modelling to identify the impact of different facets (station, student and student x station interactions) on the scoring subunits. The generalizability and dependability statistics were calculated.</p><p><strong>Results: </strong>There was no significant difference between mean scores attributable to different examiner pairs across the data. The examiner variance component was greater for the clinical skills score (14.4%) than for the generic skills (5.6%) and global performance scores (5.1%). The station variance component was largest for the clinical skills score, accounting for 22.9% of the total score variance, compared to 3% for the generic skills and 13.9% for global performance scores. The variance component related to student represented 12% of the total variance for clinicals skills, 17.4% for generic skills and 14.3% for global performance ratings. The combined generalizability coefficients across all the data were 0.59 for the clinical skills score, 0.93 for the generic skills score and 0.75 for global performance.</p><p><strong>Conclusions: </strong>The combined estimates of relative reliability across all data are greater for generic skills scores and global performance ratings than for clinical skills scores. This is likely explained by the fact that content-specific tasks evaluated using checklists produce greater variability in scores than scales evaluating broader competencies. 
This work can be valuable to other teaching institutions, as monitoring the sources of errors is a principal quality control strategy to ensure valid interpretations of the students' scores.</p>","PeriodicalId":51234,"journal":{"name":"BMC Medical Education","volume":"25 1","pages":"643"},"PeriodicalIF":2.7000,"publicationDate":"2025-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12046744/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"BMC Medical Education","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1186/s12909-025-07255-y","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"EDUCATION & EDUCATIONAL RESEARCH","Score":null,"Total":0}
Citations: 0
Abstract
Background: The national OSCE examination has recently been adopted in France as a prerequisite for medical students to enter accredited graduate education programs. However, the reliability and generalizability of OSCE scores have not been well explored in the context of the national examination blueprint.
Method: To obtain complementary information for monitoring and improving the quality of the OSCE, we performed a pilot study applying generalizability (G-)theory to a sample of 6th-year undergraduate medical students (n = 73) who were assessed by 24 examiner pairs at three stations. Based on the national blueprint, three scoring subunits (a dichotomous task-specific checklist evaluating clinical skills, behaviorally anchored scales evaluating generic skills, and a global performance scale) were used to evaluate students and were combined into a station score. A variance component analysis was performed using mixed modelling to identify the impact of different facets (station, student, and student × station interaction) on the scoring subunits. Generalizability and dependability statistics were then calculated.
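For readers unfamiliar with the mechanics of a G-study, the sketch below illustrates the general approach, not the authors' analysis: it assumes a simple students × stations crossed design with one score per cell, synthetic data, and a plain ANOVA estimator of the variance components, whereas the study itself used mixed modelling with examiner pairs and a blueprint-based scoring scheme. All variable names and numbers in the code are illustrative assumptions.

```python
# Minimal G-study sketch for a students x stations (p x s) crossed design.
# Synthetic data only; estimator is the classical expected-mean-squares approach.
import numpy as np

rng = np.random.default_rng(0)
n_students, n_stations = 73, 3                      # sizes reported in the study
scores = rng.normal(70, 10, size=(n_students, n_stations))  # illustrative station scores

grand = scores.mean()
student_means = scores.mean(axis=1)
station_means = scores.mean(axis=0)

# Mean squares for the fully crossed design (one observation per cell)
ms_student = n_stations * np.sum((student_means - grand) ** 2) / (n_students - 1)
ms_station = n_students * np.sum((station_means - grand) ** 2) / (n_stations - 1)
residual = scores - student_means[:, None] - station_means[None, :] + grand
ms_resid = np.sum(residual ** 2) / ((n_students - 1) * (n_stations - 1))

# Variance components from expected mean squares (negative estimates truncated at 0)
var_ps = ms_resid                                        # student x station + error
var_student = max((ms_student - ms_resid) / n_stations, 0.0)
var_station = max((ms_station - ms_resid) / n_students, 0.0)

# D-study coefficients for a test built from n_stations stations
g_rel = var_student / (var_student + var_ps / n_stations)                     # generalizability
phi_abs = var_student / (var_student + (var_station + var_ps) / n_stations)   # dependability

print(f"G (relative) = {g_rel:.2f}, Phi (absolute) = {phi_abs:.2f}")
```

With real station scores in place of the synthetic matrix, the same computation yields the relative and absolute reliability estimates that a D-study would then project for other numbers of stations.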
Results: There was no significant difference between mean scores attributable to different examiner pairs across the data. The examiner variance component was greater for the clinical skills score (14.4%) than for the generic skills (5.6%) and global performance scores (5.1%). The station variance component was largest for the clinical skills score, accounting for 22.9% of the total score variance, compared with 3% for the generic skills and 13.9% for the global performance scores. The variance component related to students represented 12% of the total variance for clinical skills, 17.4% for generic skills and 14.3% for global performance ratings. The combined generalizability coefficients across all the data were 0.59 for the clinical skills score, 0.93 for the generic skills score and 0.75 for global performance.
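As a point of reference, in a standard persons-crossed-with-stations (p × s) design these coefficients are functions of the reported variance components; the notation below is standard G-theory and is not taken from the paper itself:

$$E\rho^2 = \frac{\sigma^2_p}{\sigma^2_p + \dfrac{\sigma^2_{ps,e}}{n_s}}, \qquad \Phi = \frac{\sigma^2_p}{\sigma^2_p + \dfrac{\sigma^2_s + \sigma^2_{ps,e}}{n_s}}$$

where $\sigma^2_p$ is the student (person) component, $\sigma^2_s$ the station component, $\sigma^2_{ps,e}$ the student × station interaction confounded with residual error, and $n_s$ the number of stations. Larger station and interaction components relative to the student component, as observed for the clinical skills checklist, therefore translate directly into a lower generalizability coefficient.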
Conclusions: The combined estimates of relative reliability across all data are greater for generic skills scores and global performance ratings than for clinical skills scores. This is likely explained by the fact that content-specific tasks evaluated using checklists produce greater variability in scores than scales evaluating broader competencies. This work can be valuable to other teaching institutions, as monitoring the sources of errors is a principal quality control strategy to ensure valid interpretations of the students' scores.
Journal Description:
BMC Medical Education is an open access journal publishing original peer-reviewed research articles in relation to the training of healthcare professionals, including undergraduate, postgraduate, and continuing education. The journal has a special focus on curriculum development, evaluations of performance, assessment of training needs and evidence-based medicine.