{"title":"Do Subject Matter Experts’ Judgments of Multiple-Choice Format Suitability Predict Item Quality?","authors":"Rebecca F. Berenbon, Bridget C. McHugh","doi":"10.1111/emip.12570","DOIUrl":null,"url":null,"abstract":"<p>To assemble a high-quality test, psychometricians rely on subject matter experts (SMEs) to write high-quality items. However, SMEs are not typically given the opportunity to provide input on which content standards are most suitable for multiple-choice questions (MCQs). In the present study, we explored the relationship between perceived MCQ suitability for a given content standard and the associated item characteristics. Prior to item writing, we surveyed SMEs on MCQ suitability for each content standard. Following field testing, we then used SMEs’ average ratings for each content standard to predict item characteristics for the tests. We analyzed multilevel models predicting item difficulty (<i>p</i> value), discrimination, and nonfunctioning distractor presence. Items were nested within courses and content standards. There was a curvilinear relationship between SMEs’ ratings and item difficulty such that very low MCQ suitability ratings were predictive of easier items. After controlling for item difficulty, items with higher MCQ suitability ratings had higher discrimination and were less likely to have one or more nonfunctioning distractors. This research has practical implications for optimizing test blueprints. Additionally, psychometricians may use these ratings to better prepare for coaching SMEs during item writing.</p>","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":null,"pages":null},"PeriodicalIF":2.7000,"publicationDate":"2023-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/emip.12570","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Educational Measurement-Issues and Practice","FirstCategoryId":"95","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1111/emip.12570","RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"EDUCATION & EDUCATIONAL RESEARCH","Score":null,"Total":0}
Citations: 0
Abstract
To assemble a high-quality test, psychometricians rely on subject matter experts (SMEs) to write high-quality items. However, SMEs are not typically given the opportunity to provide input on which content standards are most suitable for multiple-choice questions (MCQs). In the present study, we explored the relationship between perceived MCQ suitability for a given content standard and the associated item characteristics. Prior to item writing, we surveyed SMEs on MCQ suitability for each content standard. Following field testing, we then used SMEs’ average ratings for each content standard to predict item characteristics for the tests. We analyzed multilevel models predicting item difficulty (p value), discrimination, and nonfunctioning distractor presence. Items were nested within courses and content standards. There was a curvilinear relationship between SMEs’ ratings and item difficulty such that very low MCQ suitability ratings were predictive of easier items. After controlling for item difficulty, items with higher MCQ suitability ratings had higher discrimination and were less likely to have one or more nonfunctioning distractors. This research has practical implications for optimizing test blueprints. Additionally, psychometricians may use these ratings to better prepare for coaching SMEs during item writing.
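For readers who want a concrete picture of the modeling approach described above, the sketch below shows one way a multilevel model of this general form could be specified in Python with statsmodels. The variable names (suitability, p_value, discrimination, course, standard), the simulated data, and the choice to treat content standards as a variance component within courses are assumptions for illustration only; they are not the authors' data, code, or exact specification.

```python
# Hypothetical sketch of a multilevel model like the one described in the abstract:
# item discrimination predicted by SMEs' MCQ-suitability ratings (with a quadratic
# term for the curvilinear relationship) while controlling for item difficulty,
# with items grouped by course and content standard. All names and data are
# illustrative assumptions, not the authors' actual analysis.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_items = 400
df = pd.DataFrame({
    "course": rng.integers(0, 8, n_items).astype(str),       # course each item belongs to
    "standard": rng.integers(0, 40, n_items).astype(str),    # content standard for the item
    "suitability": rng.uniform(1, 5, n_items),                # SMEs' mean MCQ-suitability rating
    "p_value": rng.uniform(0.2, 0.95, n_items),               # classical item difficulty (p value)
})
# Simulated outcome: item discrimination with a modest suitability effect plus noise.
df["discrimination"] = (0.10 + 0.05 * df["suitability"]
                        + rng.normal(0, 0.10, n_items))

# Random intercepts for course; content standards modeled as a variance
# component within course, one way to approximate the nesting described above.
model = smf.mixedlm(
    "discrimination ~ suitability + I(suitability**2) + p_value",
    data=df,
    groups="course",
    vc_formula={"standard": "0 + C(standard)"},
)
result = model.fit()
print(result.summary())
```

The quadratic term `I(suitability**2)` is included because the abstract reports a curvilinear relationship between suitability ratings and item characteristics; analogous models could be fit for the p value outcome, or a mixed-effects logistic model for the presence of one or more nonfunctioning distractors.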