Journal of Educational Measurement — Latest Articles

Using Simulated Retests to Estimate the Reliability of Diagnostic Assessment Systems
IF 1.3 | CAS Tier 4 | Psychology
Journal of Educational Measurement | Pub Date: 2023-02-19 | DOI: 10.1111/jedm.12359
W. Jake Thompson, Brooke Nash, Amy K. Clark, Jeffrey C. Hoover
Abstract: As diagnostic classification models become more widely used in large-scale operational assessments, consideration must be given to the methods for estimating and reporting reliability. Researchers must explore alternatives to traditional reliability methods that are consistent with the design, scoring, and reporting levels of diagnostic assessment systems. In this article, we describe and evaluate a method for simulating retests to summarize reliability evidence at multiple reporting levels. We evaluate how reliability estimates from simulated retests compare to other measures of classification consistency and accuracy for diagnostic assessments that have previously been described in the literature but that limit the level at which reliability can be reported. Overall, the findings show that reliability estimates from simulated retests are an accurate measure of reliability and are consistent with other reliability measures for diagnostic assessments. We then apply the method to real data from the Examination for the Certificate of Proficiency in English to demonstrate it in practice and to compare against reliability estimates from observed data. Finally, we discuss implications for the field and possible next directions.
Volume 60, Issue 3, pp. 455-475. Citations: 0
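The simulated-retest idea can be sketched in a few lines. This is a deliberately simplified, hypothetical illustration (a single binary master/non-master classification driven directly by a mastery probability), not the authors' full diagnostic-classification-model procedure: redraw each examinee's classification from the fitted model twice and report the agreement rate.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical model-implied probabilities of a "master" classification
# for 1,000 simulees (Beta(2, 2) is an arbitrary choice for illustration).
p_master = rng.beta(2, 2, size=1_000)

# Two independent simulated retests: each simulee is reclassified by
# redrawing from the model-implied classification distribution.
retest_1 = rng.random(1_000) < p_master
retest_2 = rng.random(1_000) < p_master

# Reliability summary: proportion of simulees receiving the same
# classification on both simulated administrations.
consistency = np.mean(retest_1 == retest_2)
print(f"simulated retest classification consistency: {consistency:.3f}")
```

A full implementation would redraw item responses from the fitted model and rescore them, so that consistency can be summarized at every reporting level the assessment uses.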
An Exploration of an Improved Aggregate Student Growth Measure Using Data from Two States
IF 1.3 | CAS Tier 4 | Psychology
Journal of Educational Measurement | Pub Date: 2023-01-31 | DOI: 10.1111/jedm.12354
Katherine E. Castellano, Daniel F. McCaffrey, J. R. Lockwood
Abstract: The simple average of student growth scores is often used in accountability systems, but it can be problematic for decision making. When computed from a small or moderate number of students, it can be sensitive to the sample, resulting in inaccurate representations of student growth, low year-to-year stability, and inequities for low-incidence groups. An alternative designed to address these issues is the Empirical Best Linear Prediction (EBLP), a weighted average of growth score data from other years and/or subjects. We apply both approaches to two statewide datasets to answer empirical questions about their performance. The EBLP outperforms the simple average in accuracy and cross-year stability, with the exception that accuracy was not necessarily improved for very large districts in one of the states. In such cases, we show that a beneficial alternative may be a hybrid approach in which very large districts receive the simple average and all others receive the EBLP. We find that adding more growth score data to the computation of the EBLP can improve accuracy, but not necessarily for larger schools and districts. We review key decision points in aggregate growth reporting and in specifying an EBLP weighted average in practice.
Volume 60, Issue 2, pp. 173-201. Citations: 0
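The intuition behind a best-linear-prediction blend can be shown with a toy two-source version. This is an illustrative sketch only — the paper's EBLP is derived from a full multivariate model across years and subjects — but it shows why small districts are shrunk toward more stable auxiliary data while large districts are not:

```python
def eblp_estimate(current_mean, current_var, other_mean, other_var):
    """Toy best-linear-prediction blend of two noisy estimates of the
    same growth parameter. Weights are inverse error variances, so the
    noisier source (e.g., a small district's current-year mean) is
    shrunk toward the more stable auxiliary estimate (e.g., a
    prediction from prior years). Hypothetical simplification of EBLP.
    """
    w_current = (1 / current_var) / (1 / current_var + 1 / other_var)
    return w_current * current_mean + (1 - w_current) * other_mean

# Small district: a noisy current-year mean (variance 0.04) is pulled
# strongly toward the stable prior-year prediction (variance 0.01).
blended = eblp_estimate(current_mean=0.30, current_var=0.04,
                        other_mean=0.10, other_var=0.01)
print(blended)  # ~0.14: weight 0.2 on the current mean
```

As the district grows, `current_var` shrinks and the weight on the current-year simple average approaches 1, which matches the paper's finding that EBLP gains fade for very large districts.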
Classification Accuracy and Consistency of Compensatory Composite Test Scores
IF 1.3 | CAS Tier 4 | Psychology
Journal of Educational Measurement | Pub Date: 2023-01-28 | DOI: 10.1111/jedm.12357
J. Carl Setzer, Ying Cheng, Cheng Liu
Abstract: Test scores are often used to make decisions about examinees, such as in licensure and certification testing, as well as in many educational contexts. In some cases, these decisions are based on compensatory scores, such as those from multiple sections or components of an exam. Classification accuracy and classification consistency are two psychometric characteristics of test scores that are often reported when decisions are based on those scores, and several techniques exist for estimating both. However, research on the classification accuracy and consistency of compensatory test scores is scarce. This study demonstrates two techniques for estimating classification accuracy and consistency when test scores are used in a compensatory manner. First, a simulation study shows that both methods provide very similar results under the studied conditions. Second, we demonstrate how the two methods could be used with a high-stakes licensure exam.
Volume 60, Issue 3, pp. 501-519. Citations: 0
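The distinction between the two reported quantities can be illustrated with a generic Monte Carlo sketch (hypothetical setup, not either of the paper's estimation techniques): accuracy compares the observed pass/fail decision to the decision the true score would give, while consistency compares decisions from two parallel noisy administrations.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000
cut = 0.0

# Hypothetical compensatory setting: one composite score per examinee,
# a single cut, so strength in one section can offset weakness in another.
true_score = rng.normal(0.0, 1.0, n)
form_a = true_score + rng.normal(0.0, 0.5, n)  # composite with error
form_b = true_score + rng.normal(0.0, 0.5, n)  # parallel composite

# Accuracy: agreement between observed and true classifications.
accuracy = np.mean((form_a > cut) == (true_score > cut))
# Consistency: agreement between two parallel observed classifications.
consistency = np.mean((form_a > cut) == (form_b > cut))
print(f"accuracy={accuracy:.3f}, consistency={consistency:.3f}")
```

Consistency comes out below accuracy here because it compounds two error draws instead of one, which is the usual relationship between the two indices.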
Editorial for JEM issue 59-4
IF 1.3 | CAS Tier 4 | Psychology
Journal of Educational Measurement | Pub Date: 2023-01-06 | DOI: 10.1111/jedm.12356
Sandip Sinharay
Volume 59, Issue 4, p. 397. Citations: 0
Specifying the Three Ws in Educational Measurement: Who Uses Which Scores for What Purpose?
IF 1.3 | CAS Tier 4 | Psychology
Journal of Educational Measurement | Pub Date: 2022-12-25 | DOI: 10.1111/jedm.12355
Andrew Ho
Abstract: I argue that understanding and improving educational measurement requires specificity about actors, scores, and purposes: Who uses which scores for what purpose? I show how this specificity complements the frameworks for educational measurement that Briggs presented in his 2022 address as president of the National Council on Measurement in Education.
Volume 59, Issue 4, pp. 418-422. Citations: 1
Online Calibration in Multidimensional Computerized Adaptive Testing with Polytomously Scored Items
IF 1.3 | CAS Tier 4 | Psychology
Journal of Educational Measurement | Pub Date: 2022-12-15 | DOI: 10.1111/jedm.12353
Lu Yuan, Yingshi Huang, Shuhang Li, Ping Chen
Abstract: Online calibration is a key technology for item calibration in computerized adaptive testing (CAT) and has been widely used in various forms of CAT, including unidimensional CAT, multidimensional CAT (MCAT), CAT with polytomously scored items, and cognitive diagnostic CAT. However, although multidimensional and polytomous assessment data are becoming more common, only a few published reports focus on online calibration in MCAT with polytomously scored items (P-MCAT). Building on existing online calibration methods and designs, this study therefore proposes four new P-MCAT online calibration methods and two new P-MCAT online calibration designs, and conducts two simulation studies to evaluate their performance under varying conditions (different calibration sample sizes and correlations between dimensions). Results show that all of the newly proposed methods accurately recover item parameters, and the adaptive designs outperform the random design in most cases. The paper closes with practical guidance based on the simulation results.
Volume 60, Issue 3, pp. 476-500. Citations: 0
Measuring the Uncertainty of Imputed Scores
IF 1.3 | CAS Tier 4 | Psychology
Journal of Educational Measurement | Pub Date: 2022-12-14 | DOI: 10.1111/jedm.12352
Sandip Sinharay
Abstract: Technical difficulties and other unforeseen events occasionally lead to incomplete data on educational tests, which necessitates reporting imputed scores to some examinees. While several approaches exist for reporting imputed scores, there is no guidance on reporting the uncertainty of those scores. In this paper, several approaches are suggested for quantifying the uncertainty of imputed scores, using measures similar in spirit to estimates of reliability and the standard error of measurement. A simulation study examines the properties of the approaches. The approaches are then applied to data from a state test on which some examinees' scores had to be imputed following computer problems. Several recommendations are made for practice.
Volume 60, Issue 2, pp. 351-375. Citations: 1
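As a generic reference point for quantifying imputation uncertainty (not necessarily any of the paper's specific measures), a Rubin-style multiple-imputation combination reports the mean of m imputed scores together with the between-imputation variance inflated by (1 + 1/m). The sketch below is hypothetical and, for brevity, omits the within-imputation measurement-error component of Rubin's total variance.

```python
import statistics

def imputation_uncertainty(imputed_scores):
    """Combine m imputed scores for one examinee: the point estimate is
    their mean, and (1 + 1/m) times the between-imputation variance
    quantifies the extra uncertainty due to imputation. Illustrative
    sketch; ignores within-imputation measurement variance."""
    m = len(imputed_scores)
    point = statistics.fmean(imputed_scores)
    between = statistics.variance(imputed_scores)  # sample variance
    return point, (1 + 1 / m) * between

# Five plausible imputed scores for an examinee whose test was interrupted.
point, extra_var = imputation_uncertainty([24, 26, 25, 27, 23])
print(point, extra_var)
```

Reporting something like the square root of this inflated variance alongside the imputed score would play the role that the standard error of measurement plays for observed scores.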
An Exponentially Weighted Moving Average Procedure for Detecting Back Random Responding Behavior
IF 1.3 | CAS Tier 4 | Psychology
Journal of Educational Measurement | Pub Date: 2022-12-09 | DOI: 10.1111/jedm.12351
Yinhong He
Abstract: Back random responding (BRR) is a commonly observed careless response behavior, and detecting it accurately can improve test validity. Yu and Cheng (2019) showed that a change point analysis (CPA) procedure based on weighted residuals (CPA-WR) performs well in detecting BRR. Compared with the CPA procedure, the exponentially weighted moving average (EWMA) captures more detailed information. This study equips the weighted residual statistic with EWMA and proposes the EWMA-WR method for detecting BRR. To make the critical values adaptive to ability levels, the study also proposes a Monte Carlo simulation with ability stratification (MC-stratification) for computing critical values, which produced more satisfactory results than the original Monte Carlo simulation (MC) method. The performance of CPA-WR and EWMA-WR was evaluated under conditions varying in test length, proportion of aberrant responses, critical values, and the smoothing constant used in EWMA-WR. The results showed that EWMA-WR is more powerful than CPA-WR in detecting BRR. An empirical study illustrates the utility of EWMA-WR for detecting BRR.
Volume 60, Issue 2, pp. 282-317. Citations: 1
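The EWMA recursion itself is standard: Z_t = λ·x_t + (1 − λ)·Z_{t−1}, where λ is the smoothing constant. The sketch below applies it to a made-up residual sequence; in the paper's setting x_t would be the weighted residual for item t, and the critical value would come from the (stratified) Monte Carlo procedure rather than the fixed threshold used here.

```python
def ewma(residuals, lam=0.2, start=0.0):
    """Standard EWMA recursion Z_t = lam * x_t + (1 - lam) * Z_{t-1}.
    A sustained run of extreme Z_t values past a critical value flags
    the point where aberrant responding begins. Generic sketch, not
    the authors' exact EWMA-WR statistic."""
    z = start
    trace = []
    for x in residuals:
        z = lam * x + (1 - lam) * z
        trace.append(z)
    return trace

# Hypothetical weighted residuals that turn sharply negative midway,
# as when an examinee starts responding randomly on the back of a test.
trace = ewma([0.1, -0.2, 0.0, -2.5, -2.8, -3.0])
flagged = [z < -1.0 for z in trace]  # -1.0 is a made-up critical value
print(trace)
print(flagged)
```

Because each Z_t pools all residuals so far (with geometrically decaying weights), the statistic reacts to a sustained shift while smoothing over one-off blips, which is the advantage over a single change point comparison.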
Multiple-Group Joint Modeling of Item Responses, Response Times, and Action Counts with the Conway-Maxwell-Poisson Distribution
IF 1.3 | CAS Tier 4 | Psychology
Journal of Educational Measurement | Pub Date: 2022-12-07 | DOI: 10.1111/jedm.12349
Xin Qiao, Hong Jiao, Qiwei He
Abstract: Multiple-group modeling is one way to address measurement noninvariance. Traditional multiple-group studies have focused mainly on item responses. In computer-based assessments, jointly modeling response times and action counts with item responses helps estimate latent speed and action levels in addition to latent ability, and these two additional data sources can further address measurement noninvariance. One challenge, however, is correctly modeling action counts, which can be underdispersed, overdispersed, or equidispersed in real data sets. To address this, we adopted the Conway-Maxwell-Poisson distribution, which accommodates different types of dispersion in action counts, and incorporated it into a multiple-group joint model of item responses, response times, and action counts. A Bayesian Markov chain Monte Carlo method was used for model parameter estimation. To illustrate an application of the proposed model, we analyzed the Programme for International Student Assessment (PISA) 2015 collaborative problem-solving items, where potential measurement noninvariance existed between gender groups. The Conway-Maxwell-Poisson model yielded better fit than alternative count models such as the negative binomial and Poisson models. In addition, response times and action counts provided further information on performance differences between groups.
Volume 60, Issue 2, pp. 255-281. Citations: 1
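The distribution doing the work here has a simple form: P(K = k) ∝ λ^k / (k!)^ν, where ν = 1 recovers the Poisson, ν > 1 gives underdispersion, and ν < 1 gives overdispersion — exactly the flexibility action counts need. A minimal sketch of the pmf (with a truncated normalizing constant, since Z(λ, ν) has no closed form in general):

```python
import math

def cmp_pmf(k, lam, nu, kmax=50):
    """Conway-Maxwell-Poisson pmf: P(K = k) proportional to
    lam**k / (k!)**nu. The normalizing constant is an infinite sum,
    truncated at kmax here for illustration (adequate for small lam)."""
    z = sum(lam**j / math.factorial(j)**nu for j in range(kmax + 1))
    return (lam**k / math.factorial(k)**nu) / z

# Sanity check: with nu = 1 the CMP pmf reduces to the Poisson pmf.
poisson_p2 = math.exp(-3.0) * 3.0**2 / math.factorial(2)
print(abs(cmp_pmf(2, lam=3.0, nu=1.0) - poisson_p2))  # effectively 0
```

In the joint model, an item-level ν lets each item's action-count distribution be tighter or more spread out than a Poisson with the same rate, which is why it outperformed the Poisson and negative binomial (the latter only handles overdispersion).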
NCME Presidential Address 2022: Turning the Page to the Next Chapter of Educational Measurement
IF 1.3 | CAS Tier 4 | Psychology
Journal of Educational Measurement | Pub Date: 2022-11-09 | DOI: 10.1111/jedm.12350
Derek C. Briggs
Volume 59, Issue 4, pp. 398-417. Citations: 0