{"title":"Computational Strategies and Estimation Performance With Bayesian Semiparametric Item Response Theory Models","authors":"S. Paganin, C. Paciorek, Claudia Wehrhahn, Abel Rodríguez, S. Rabe-Hesketh, P. de Valpine","doi":"10.3102/10769986221136105","DOIUrl":"https://doi.org/10.3102/10769986221136105","url":null,"abstract":"Item response theory (IRT) models typically rely on a normality assumption for subject-specific latent traits, which is often unrealistic in practice. Semiparametric extensions based on Dirichlet process mixtures (DPMs) offer a more flexible representation of the unknown distribution of the latent trait. However, the use of such models in the IRT literature has been extremely limited, in good part because of the lack of comprehensive studies and accessible software tools. This article provides guidance for practitioners on semiparametric IRT models and their implementation. In particular, we rely on NIMBLE, a flexible software system for hierarchical models that enables the use of DPMs. We highlight efficient sampling strategies for model estimation and compare inferential results under parametric and semiparametric models.","PeriodicalId":48001,"journal":{"name":"Journal of Educational and Behavioral Statistics","volume":"48 1","pages":"147 - 188"},"PeriodicalIF":2.4,"publicationDate":"2021-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43423727","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Regression Discontinuity Designs With an Ordinal Running Variable: Evaluating the Effects of Extended Time Accommodations for English-Language Learners","authors":"Youmi Suk, Peter M Steiner, Jee-Seon Kim, Hyunseung Kang","doi":"10.3102/10769986221090275","DOIUrl":"https://doi.org/10.3102/10769986221090275","url":null,"abstract":"Regression discontinuity (RD) designs are commonly used for program evaluation with continuous treatment assignment variables. But in practice, treatment assignment is frequently based on ordinal variables. In this study, we propose an RD design with an ordinal running variable to assess the effects of extended time accommodations (ETA) for English-language learners (ELLs). ETA eligibility is determined by ordinal ELL English-proficiency categories of National Assessment of Educational Progress data. We discuss the identification and estimation of the average treatment effect (ATE), intent-to-treat effect, and the local ATE at the cutoff. We also propose a series of sensitivity analyses to probe the effect estimates’ robustness to the choices of scaling functions and cutoff scores and remaining confounding.","PeriodicalId":48001,"journal":{"name":"Journal of Educational and Behavioral Statistics","volume":"47 1","pages":"459 - 484"},"PeriodicalIF":2.4,"publicationDate":"2021-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44101445","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cross-Classified Random Effects Modeling for Moderated Item Calibration","authors":"Seungwon Chung, Li Cai","doi":"10.3102/1076998620983908","DOIUrl":"https://doi.org/10.3102/1076998620983908","url":null,"abstract":"In the research reported here, we propose a new method for scale alignment and test scoring in the context of supporting students with disabilities. In educational assessment, students from these special populations take modified tests because of a demonstrated disability that requires more assistance than standard testing accommodation. Updated federal education legislation and guidance require that these students be assessed and included in state education accountability systems, and their achievement reported with respect to the same rigorous content and achievement standards that the state adopted. Routine item calibration and linking methods are not feasible because the size of these special populations tends to be small. We develop a unified cross-classified random effects model that utilizes item response data from the general population as well as judge-provided data from subject matter experts in order to obtain revised item parameter estimates for use in scoring modified tests. We extend the Metropolis–Hastings Robbins–Monro algorithm to estimate the parameters of this model. The proposed method is applied to Braille test forms in a large operational multistate English language proficiency assessment program. Our work not only allows a broader range of modifications that is routinely considered in large-scale educational assessments but also directly incorporates the input from subject matter experts who work directly with the students needing support. Their structured and informed feedback deserves more attention from the psychometric community.","PeriodicalId":48001,"journal":{"name":"Journal of Educational and Behavioral Statistics","volume":"46 1","pages":"651 - 681"},"PeriodicalIF":2.4,"publicationDate":"2021-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45633091","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Practical Guide for Analyzing Large-Scale Assessment Data Using Mplus: A Case Demonstration Using the Program for International Assessment of Adult Competencies Data","authors":"T. Yamashita, Thomas J. Smith, P. Cummins","doi":"10.3102/1076998620978554","DOIUrl":"https://doi.org/10.3102/1076998620978554","url":null,"abstract":"In order to promote the use of increasingly available large-scale assessment data in education and expand the scope of analytic capabilities among applied researchers, this study provides step-by-step guidance, and practical examples of syntax and data analysis using Mplus. Concise overview and key unique aspects of large-scale assessment data from the 2012/2014 Program for International Assessment of Adult Competencies (PIAAC) are described. Using commonly-used statistical software including SAS and R, a simple macro program and syntax are developed to streamline the data preparation process. Then, two examples of structural equation models are demonstrated using Mplus. The suggested data preparation and analytic approaches can be immediately applicable to existing large-scale assessment data.","PeriodicalId":48001,"journal":{"name":"Journal of Educational and Behavioral Statistics","volume":"46 1","pages":"501 - 518"},"PeriodicalIF":2.4,"publicationDate":"2020-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.3102/1076998620978554","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43159243","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive Weight Estimation of Latent Ability: Application to Computerized Adaptive Testing With Response Revision","authors":"Shiyu Wang, Houping Xiao, A. Cohen","doi":"10.3102/1076998620972800","DOIUrl":"https://doi.org/10.3102/1076998620972800","url":null,"abstract":"An adaptive weight estimation approach is proposed to provide robust latent ability estimation in computerized adaptive testing (CAT) with response revision. This approach assigns different weights to each distinct response to the same item when response revision is allowed in CAT. Two types of weight estimation procedures, nonfunctional and functional weight, are proposed to determine the weight adaptively based on the compatibility of each revised response with the assumed statistical model in relation to remaining observations. The application of this estimation approach to a data set collected from a large-scale multistage adaptive testing demonstrates the capability of this method to reveal more information regarding the test taker’s latent ability by using the valid response path compared with only using the very last response. Limited simulation studies were concluded to evaluate the proposed ability estimation method and to compare it with several other estimation procedures in literature. Results indicate that the proposed ability estimation approach is able to provide robust estimation results in two test-taking scenarios.","PeriodicalId":48001,"journal":{"name":"Journal of Educational and Behavioral Statistics","volume":"46 1","pages":"560 - 591"},"PeriodicalIF":2.4,"publicationDate":"2020-11-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.3102/1076998620972800","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45694185","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ordinal Approaches to Decomposing Between-Group Test Score Disparities","authors":"David M. Quinn, Andrew D. Ho","doi":"10.3102/1076998620967726","DOIUrl":"https://doi.org/10.3102/1076998620967726","url":null,"abstract":"The estimation of test score “gaps” and gap trends plays an important role in monitoring educational inequality. Researchers decompose gaps and gap changes into within- and between-school portions to generate evidence on the role schools play in shaping these inequalities. However, existing decomposition methods assume an equal-interval test scale and are a poor fit to coarsened data such as proficiency categories. This leaves many potential data sources ill-suited for decomposition applications. We develop two decomposition approaches that overcome these limitations: an extension of V, an ordinal gap statistic, and an extension of ordered probit models. Simulations show V decompositions have negligible bias with small within-school samples. Ordered probit decompositions have negligible bias with large within-school samples but more serious bias with small within-school samples. More broadly, our methods enable analysts to (1) decompose the difference between two groups on any ordinal outcome into portions within- and between some third categorical variable and (2) estimate scale-invariant between-group differences that adjust for a categorical covariate.","PeriodicalId":48001,"journal":{"name":"Journal of Educational and Behavioral Statistics","volume":"46 1","pages":"466 - 500"},"PeriodicalIF":2.4,"publicationDate":"2020-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.3102/1076998620967726","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42874649","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the Treatment of Missing Data in Background Questionnaires in Educational Large-Scale Assessments: An Evaluation of Different Procedures","authors":"S. Grund, O. Lüdtke, A. Robitzsch","doi":"10.3102/1076998620959058","DOIUrl":"https://doi.org/10.3102/1076998620959058","url":null,"abstract":"Large-scale assessments (LSAs) use Mislevy’s “plausible value” (PV) approach to relate student proficiency to noncognitive variables administered in a background questionnaire. This method requires background variables to be completely observed, a requirement that is seldom fulfilled. In this article, we evaluate and compare the properties of methods used in current practice for dealing with missing data in background variables in educational LSAs, which rely on the missing indicator method (MIM), with other methods based on multiple imputation. In this context, we present a fully conditional specification (FCS) approach that allows for a joint treatment of PVs and missing data. Using theoretical arguments and two simulation studies, we illustrate under what conditions the MIM provides biased or unbiased estimates of population parameters and provide evidence that methods such as FCS can provide an effective alternative to the MIM. We discuss the strengths and weaknesses of the approaches and outline potential consequences for operational practice in educational LSAs. An illustration is provided using data from the PISA 2015 study.","PeriodicalId":48001,"journal":{"name":"Journal of Educational and Behavioral Statistics","volume":"46 1","pages":"430 - 465"},"PeriodicalIF":2.4,"publicationDate":"2020-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49151735","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Block What You Can, Except When You Shouldn’t","authors":"Nicole E. Pashley, Luke W. Miratrix","doi":"10.3102/10769986211027240","DOIUrl":"https://doi.org/10.3102/10769986211027240","url":null,"abstract":"Several branches of the potential outcome causal inference literature have discussed the merits of blocking versus complete randomization. Some have concluded it can never hurt the precision of estimates, and some have concluded it can hurt. In this article, we reconcile these apparently conflicting views, give a more thorough discussion of what guarantees no harm, and discuss how other aspects of a blocked design can cost, all in terms of estimator precision. We discuss how the different findings are due to different sampling models and assumptions of how the blocks were formed. We also connect these ideas to common misconceptions; for instance, we show that analyzing a blocked experiment as if it were completely randomized, a seemingly conservative method, can actually backfire in some cases. Overall, we find that blocking can have a price but that this price is usually small and the potential for gain can be large. It is hard to go too far wrong with blocking.","PeriodicalId":48001,"journal":{"name":"Journal of Educational and Behavioral Statistics","volume":"47 1","pages":"69 - 100"},"PeriodicalIF":2.4,"publicationDate":"2020-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48111783","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}