{"title":"An Investigation of the Nature and Consequence of the Relationship between IRT Difficulty and Discrimination","authors":"Sandra M. Sweeney, Sandip Sinharay, Matthew S. Johnson, Eric W. Steinhauer","doi":"10.1111/emip.12522","DOIUrl":"10.1111/emip.12522","url":null,"abstract":"<p>The focus of this paper is on the empirical relationship between item difficulty and item discrimination. Two studies—an empirical investigation and a simulation study—were conducted to examine the association between item difficulty and item discrimination under classical test theory and item response theory (IRT), and the effects of the association on various quantities of interest. Results from the empirical investigation show that item difficulty and item discrimination are negatively correlated under classical test theory, mostly negatively correlated under the two-parameter logistic model, and mostly positively correlated under the three-parameter logistic model; the magnitude of the correlation varied over the different data sets. Results from the simulation study reveal that a failure to incorporate the correlation between item difficulty and item discrimination in IRT simulations may provide the investigator with inaccurate values of important quantities of interest, and may lead to incorrect operational decisions. Implications to practice and future directions are discussed.</p>","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"41 4","pages":"50-67"},"PeriodicalIF":2.0,"publicationDate":"2022-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45419783","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Digital Module 29: Multidimensional Item Response Theory Equating","authors":"Stella Y. Kim","doi":"10.1111/emip.12525","DOIUrl":"10.1111/emip.12525","url":null,"abstract":"<p>In this digital ITEMS module, Dr. Stella Kim provides an overview of multidimensional item response theory (MIRT) equating. Traditional unidimensional item response theory (IRT) equating methods impose the sometimes untenable restriction on data that only a single ability is assessed. This module discusses potential sources of multidimensionality and presents potential consequences of multidimensionality on equating. To remedy these effects, MIRT equating can be used as a viable alternative to traditional methods of IRT equating. In conducting MIRT equating, the choice of an appropriate MIRT model is necessary, and thus the module describes several existing MIRT models and illustrates each using hypothetical examples. After a brief description of MIRT models, an extensive review of the current literature is presented to identify gaps in the literature on MIRT equating. Then, the steps for conducting MIRT observed-score equating are described. Finally, the module discusses practical considerations in applying MIRT equating to testing practices and suggests potential areas of research for future studies.</p>","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"41 3","pages":"85-86"},"PeriodicalIF":2.0,"publicationDate":"2022-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/emip.12525","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44828478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the Cover: Person Infit Density Contour","authors":"Yuan-Ling Liaw","doi":"10.1111/emip.12526","DOIUrl":"10.1111/emip.12526","url":null,"abstract":"","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"41 3","pages":"4"},"PeriodicalIF":2.0,"publicationDate":"2022-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43771777","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Special Case of Brennan's Index for Tests That Aim to Select a Limited Number of Students: A Monte Carlo Simulation Study","authors":"Serkan Arikan, Eren Can Aybek","doi":"10.1111/emip.12528","DOIUrl":"10.1111/emip.12528","url":null,"abstract":"<p>Many scholars compared various item discrimination indices in real or simulated data. Item discrimination indices, such as item-total correlation, item-rest correlation, and IRT item discrimination parameter, provide information about individual differences among all participants. However, there are tests that aim to select a very limited number of students, examinees, or candidates for allocated schools and job positions. Thus, there is a need to evaluate the performances of CTT and IRT item discrimination indices when the test purpose is to select a limited number of students. The purpose of the current Monte Carlo study is to evaluate item discrimination indices in the case of selecting a limited number of high-achieving students. The results showed that a special case of Brennan's index, <i>B</i><sub>10–90</sub>, provided more accurate information for this specific test purpose. Additionally, the effects of various factors, such as test length, ability distributions of examinees, and item difficulty variance on item discrimination indices were investigated. The performance of each item discrimination index is discussed in detail.</p>","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"41 4","pages":"35-49"},"PeriodicalIF":2.0,"publicationDate":"2022-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43556492","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Supporting the Interpretive Validity of Student-Level Claims in Science Assessment with Tiered Claim Structures","authors":"Sanford R. Student, Brian Gong","doi":"10.1111/emip.12523","DOIUrl":"10.1111/emip.12523","url":null,"abstract":"<p>We address two persistent challenges in large-scale assessments of the Next Generation Science Standards: (a) the validity of score interpretations that target the standards broadly and (b) how to structure claims for assessments of this complex domain. The NGSS pose a particular challenge for specifying claims about students that evidence from summative assessments can support. As a solution, we propose tiered claims, which explicitly distinguish between claims about what students have done or can do on test items—which are typically easier to support under current test designs—and claims about what students could do in the broader domain of performances described by the standards, for which novel evidence is likely required. We discuss the positive implications of tiered claims for test construction, validation, and reporting of results.</p>","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"41 4","pages":"68-78"},"PeriodicalIF":2.0,"publicationDate":"2022-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49048947","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Average Rank and Adjusted Rank Are Better Measures of College Student Success than GPA","authors":"Donald Wittman","doi":"10.1111/emip.12521","DOIUrl":"10.1111/emip.12521","url":null,"abstract":"<p>I show that there are better measures of student college performance than grade point average (GPA) by undertaking a fine-grained empirical investigation of grading within a large public university. The value of using GPA as a measure of comparative performance is undermined by academically weaker students taking courses where the grading is more generous. In fact, college courses composed of <i>weaker</i> performing students (whether measured by their relative performance in other classes, SAT scores, or high school GPA) have <i>higher</i> average grades. To partially correct for idiosyncratic grading across classes, alternative measures, student class rank and the student's average class rank, are introduced. In comparison to a student's lower-division grade, the student's lower-division <i>rank</i> is a better predictor of the student's grade in the upper-division course. Course rank and course grade are adjusted to account for different levels of academic competitiveness across courses (more precisely, student fixed-effects are derived). SAT scores and high school GPA are then used to predict college performance. Higher explained variation (<i>R</i><sup>2</sup>) is obtained when the dependent variable is average class rank rather than GPA. Still higher explained variation occurs when the dependent variable is adjusted rank.</p>","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"41 4","pages":"23-34"},"PeriodicalIF":2.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/emip.12521","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45018605","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reconceptualization of Coefficient Alpha Reliability for Test Summed and Scaled Scores","authors":"Rashid S. Almehrizi","doi":"10.1111/emip.12520","DOIUrl":"10.1111/emip.12520","url":null,"abstract":"<p>Coefficient alpha reliability persists as the most common reliability coefficient reported in research. The assumptions for its use are, however, not well-understood. The current paper challenges the commonly used expressions of coefficient alpha and argues that while these expressions are correct when estimating reliability for summed scores, they are not appropriate to extend coefficient alpha to correctly estimate the reliability for nonlinearly transformed scaled scores such as percentile ranks and stanines. The current paper reconceptualizes coefficient alpha as a complement of the ratio of two unbiased estimates of the summed score variance. These include conditional summed score variance assuming uncorrelated item scores (gives the error score variance) and unconditional summed score variance incorporating intercorrelated item scores (gives the observed score variance). Using this reconceptualization, a new equation of coefficient generalized alpha is introduced for scaled scores. Coefficient alpha is a special case of this new equation since the latter reduces to coefficinet alpha if the scaled scores are the summed scores themselves. Two applications (cognitive and psychological assessments) are used to compare the performance (estimation and bootstrap confidence interval) of the reliability coefficients for different scaled scores. Results support the new equation of coefficient generalized alpha and compare it to coefficient generalized beta for parallel test forms. Coefficient generalized alpha produced different reliability values, which were larger than coefficient generalized beta for different scaled scores.</p>","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"41 3","pages":"38-47"},"PeriodicalIF":2.0,"publicationDate":"2022-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45740678","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Communicating Measurement Outcomes with (Better) Graphics","authors":"J. Carl Setzer, Zhongmin Cui","doi":"10.1111/emip.12519","DOIUrl":"10.1111/emip.12519","url":null,"abstract":"<p>Data visualization is a core tenet of communicating measurement research and outcomes. Measurement professionals utilize data visualization in various phases of research, including exploration and communication. However, data visualization has not received enough attention in the measurement field. While it is true that many measurement graphics are relatively standard, many others are not and there is a wide variety of visualization quality and effectiveness seen in measurement journals. This article provides an overview of the current data visualization trends in measurement and provides some general tips for effective data visualization, with examples. This article is not a comprehensive treatise on data visualization. Therefore, we provide some resources for additional reading. Finally, we call on the measurement community to pay greater attention to the details of data visualization. We also call on measurement training programs to emphasize statistical reasoning through data visualization.</p>","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"41 3","pages":"5-13"},"PeriodicalIF":2.0,"publicationDate":"2022-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43990870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the Cover: Indicators for Item Preknowledge","authors":"Yuan-Ling Liaw","doi":"10.1111/emip.12507","DOIUrl":"10.1111/emip.12507","url":null,"abstract":"","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"41 2","pages":"6"},"PeriodicalIF":2.0,"publicationDate":"2022-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/emip.12507","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46252186","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}