Determining intra-standard-setter inconsistency in the Angoff method using the three-parameter item response theory
Mohsen Tavakol, David O'Brien, Claire Stewart
International Journal of Medical Education, volume 14, pages 123-130. Published 2023-09-07.
DOI: https://doi.org/10.5116/ijme.64ed.e296
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10693949/pdf/
Abstract
Objectives: To measure intra-standard-setter variability and to assess the variation between the pass marks obtained from Angoff ratings, using latent trait theory as the theoretical model.
Methods: A non-experimental cross-sectional study was conducted. Two knowledge-based tests were administered to 358 final-year medical students (223 females and 135 males) as part of their normal summative programme of assessments. Ratings from judgmental standard setting with the Angoff method, which is widely used in medical schools, were analysed with the three-parameter item response theory (IRT) model to determine intra-standard-setter inconsistency. Permission for this study was granted by the local Research Ethics Committee of the University of Nottingham. To ensure anonymity and confidentiality, all student-level identifiers were removed before the data were analysed.
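To illustrate the kind of comparison the Methods describe, the minimal sketch below computes the three-parameter logistic (3PL) probability of a correct response and compares it with one standard setter's Angoff ratings. The item parameters, the ratings, the borderline ability `theta_cut`, and the function name `p_3pl` are all hypothetical. The scaling constant D = 1.7 that some formulations include is omitted, and this is not necessarily the authors' exact procedure.

```python
import math


def p_3pl(theta, a, b, c):
    """3PL probability of a correct response at ability theta.

    a: discrimination, b: difficulty, c: pseudo-guessing (lower asymptote).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item parameters (a, b, c); real values would come from
# fitting the 3PL model to the students' responses.
items = [(1.2, -0.5, 0.20), (0.8, 0.3, 0.25), (1.5, 1.0, 0.20)]

# Hypothetical Angoff ratings from one standard setter: the judged
# probability that a borderline candidate answers each item correctly.
ratings = [0.70, 0.55, 0.40]

# Assumed ability of the borderline candidate on the latent scale.
theta_cut = 0.0

# Model-implied probability of success on each item at the cut ability.
expected = [p_3pl(theta_cut, a, b, c) for a, b, c in items]

# Item-level gap between the judge's rating and the model's probability;
# a large mean absolute gap suggests an internally inconsistent judge.
deviations = [r - p for r, p in zip(ratings, expected)]
mean_abs_dev = sum(abs(d) for d in deviations) / len(deviations)

print(f"mean absolute deviation: {mean_abs_dev:.3f}")
```

Repeating the calculation for each standard setter gives one consistency index per judge, which can then be compared across judges and across the two tests.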
Results: The results confirm that the three-parameter IRT model can be used to analyse the ratings of individual standard setters. Overall, standard setters behaved fairly consistently in both tests. The mean Angoff ratings and the conditional probabilities were strongly positively correlated, a finding that bears on inter-standard-setter validity.
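The correlation reported in the Results can be sketched as a Pearson coefficient between the item-level mean Angoff ratings and the model-based conditional probabilities. The data below are invented for illustration, `pearson_r` is a hypothetical helper, and the abstract does not say which correlation coefficient the authors used.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-item values, not data from the study.
mean_angoff = [0.72, 0.58, 0.41, 0.65, 0.50]  # mean rating across judges
cond_prob = [0.70, 0.52, 0.38, 0.69, 0.47]    # 3PL probability at the cut ability

r = pearson_r(mean_angoff, cond_prob)
print(f"r = {r:.2f}")
```

A coefficient close to 1 would indicate that, on average, the panel ranks items in the same order of difficulty as the fitted model does.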
Conclusions: We recommend that assessment providers adopt the methodology used in this study to help identify inter- and intra-judgmental inconsistencies across standard setters, and so minimise the number of false positive and false negative decisions.