{"title":"Bayesian Analysis Methods for Two-Level Diagnosis Classification Models","authors":"K. Yamaguchi","doi":"10.3102/10769986231173594","DOIUrl":"https://doi.org/10.3102/10769986231173594","url":null,"abstract":"Understanding whether or not different types of students master various attributes can aid future learning remediation. In this study, two-level diagnostic classification models (DCMs) were developed to represent the probabilistic relationship between external latent classes and attribute mastery patterns. Furthermore, variational Bayesian (VB) inference and Gibbs sampling Markov chain Monte Carlo methods were developed for parameter estimation of the two-level DCMs. The results of a parameter recovery simulation study show that both techniques appropriately recovered the true parameters; Gibbs sampling in particular was slightly more accurate than VB, whereas VB performed estimation much faster than Gibbs sampling. The two-level DCMs with the proposed Bayesian estimation methods were further applied to fourth-grade data obtained from the Trends in International Mathematics and Science Study 2007 and indicated that mathematical activities in the classroom could be organized into four latent classes, with each latent class connected to different attribute mastery patterns. This information can be employed in educational intervention to focus on specific latent classes and elucidate attribute patterns.","PeriodicalId":48001,"journal":{"name":"Journal of Educational and Behavioral Statistics","volume":" ","pages":""},"PeriodicalIF":2.4,"publicationDate":"2023-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44040378","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Psychometric Framework for Evaluating Fairness in Algorithmic Decision Making: Differential Algorithmic Functioning","authors":"Youmi Suk, K. T. Han","doi":"10.3102/10769986231171711","DOIUrl":"https://doi.org/10.3102/10769986231171711","url":null,"abstract":"As algorithmic decision making is increasingly deployed in every walk of life, many researchers have raised concerns about fairness-related bias from such algorithms. But there is little research on harnessing psychometric methods to uncover potential discriminatory bias inside decision-making algorithms. The main goal of this article is to propose a new framework for algorithmic fairness based on differential item functioning (DIF), which has been commonly used to measure item fairness in psychometrics. Our fairness notion, which we call differential algorithmic functioning (DAF), is defined based on three pieces of information: a decision variable, a “fair” variable, and a protected variable such as race or gender. Under the DAF framework, an algorithm can exhibit uniform DAF, nonuniform DAF, or neither (i.e., non-DAF). For detecting DAF, we provide modifications of well-established DIF methods: Mantel–Haenszel test, logistic regression, and residual-based DIF. We demonstrate our framework through a real dataset concerning decision-making algorithms for grade retention in K–12 education in the United States.","PeriodicalId":48001,"journal":{"name":"Journal of Educational and Behavioral Statistics","volume":" ","pages":""},"PeriodicalIF":2.4,"publicationDate":"2023-05-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43676822","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Model Misspecification and Robustness of Observed-Score Test Equating Using Propensity Scores","authors":"G. Wallin, M. Wiberg","doi":"10.3102/10769986231161575","DOIUrl":"https://doi.org/10.3102/10769986231161575","url":null,"abstract":"This study explores the usefulness of covariates on equating test scores from nonequivalent test groups. The covariates are captured by an estimated propensity score, which is used as a proxy for latent ability to balance the test groups. The objective is to assess the sensitivity of the equated scores to various misspecifications in the propensity score model. The study assumes a parametric form of the propensity score and evaluates the effects of various misspecification scenarios on equating error. The results, based on both simulated and real testing data, show that (1) omitting an important covariate leads to biased estimates of the equated scores, (2) misspecifying a nonlinear relationship between the covariates and test scores increases the equating standard error in the tails of the score distributions, and (3) the equating estimators are robust against omitting a second-order term as well as using an incorrect link function in the propensity score estimation model. The findings demonstrate that auxiliary information is beneficial for test score equating in complex settings. However, it also sheds light on the challenge of making fair comparisons between nonequivalent test groups in the absence of common items. The study identifies scenarios, where equating performance is acceptable and problematic, provides practical guidelines, and identifies areas for further investigation.","PeriodicalId":48001,"journal":{"name":"Journal of Educational and Behavioral Statistics","volume":"48 1","pages":"603 - 635"},"PeriodicalIF":2.4,"publicationDate":"2023-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43284852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modeling Item-Level Heterogeneous Treatment Effects With the Explanatory Item Response Model: Leveraging Large-Scale Online Assessments to Pinpoint the Impact of Educational Interventions","authors":"Josh Gilbert, James S. Kim, Luke W. Miratrix","doi":"10.3102/10769986231171710","DOIUrl":"https://doi.org/10.3102/10769986231171710","url":null,"abstract":"Analyses that reveal how treatment effects vary allow researchers, practitioners, and policymakers to better understand the efficacy of educational interventions. In practice, however, standard statistical methods for addressing heterogeneous treatment effects (HTE) fail to address the HTE that may exist within outcome measures. In this study, we present a novel application of the explanatory item response model (EIRM) for assessing what we term “item-level” HTE (IL-HTE), in which a unique treatment effect is estimated for each item in an assessment. Results from data simulation reveal that when IL-HTE is present but ignored in the model, standard errors can be underestimated and false positive rates can increase. We then apply the EIRM to assess the impact of a literacy intervention focused on promoting transfer in reading comprehension on a digital assessment delivered online to approximately 8,000 third-grade students. We demonstrate that allowing for IL-HTE can reveal treatment effects at the item-level masked by a null average treatment effect, and the EIRM can thus provide fine-grained information for researchers and policymakers on the potentially heterogeneous causal effects of educational interventions.","PeriodicalId":48001,"journal":{"name":"Journal of Educational and Behavioral Statistics","volume":" ","pages":""},"PeriodicalIF":2.4,"publicationDate":"2023-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43185120","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cognitive Diagnosis Testlet Model for Multiple-Choice Items","authors":"Lei Guo, Wenjie Zhou, Xiao Li","doi":"10.3102/10769986231165622","DOIUrl":"https://doi.org/10.3102/10769986231165622","url":null,"abstract":"The testlet design is very popular in educational and psychological assessments. This article proposes a new cognitive diagnosis model, the multiple-choice cognitive diagnostic testlet (MC-CDT) model for tests using testlets consisting of MC items. The MC-CDT model uses the original examinees’ responses to MC items instead of dichotomously scored data (i.e., correct or incorrect) to retain information of different distractors and thus enhance the MC items’ diagnostic power. The Markov chain Monte Carlo algorithm was adopted to calibrate the model using the WinBUGS software. Then, a thorough simulation study was conducted to evaluate the estimation accuracy for both item and examinee parameters in the MC-CDT model under various conditions. The results showed that the proposed MC-CDT model outperformed the traditional MC cognitive diagnostic model. Specifically, the MC-CDT model fits the testlet data better than the traditional model, while also fitting the data without testlets well. The findings of this empirical study show that the MC-CDT model fits real data better than the traditional model and that it can also provide testlet information.","PeriodicalId":48001,"journal":{"name":"Journal of Educational and Behavioral Statistics","volume":" ","pages":""},"PeriodicalIF":2.4,"publicationDate":"2023-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47191461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Within-Group Approach to Ensemble Machine Learning Methods for Causal Inference in Multilevel Studies","authors":"Youmi Suk","doi":"10.3102/10769986231162096","DOIUrl":"https://doi.org/10.3102/10769986231162096","url":null,"abstract":"Machine learning (ML) methods for causal inference have gained popularity due to their flexibility to predict the outcome model and the propensity score. In this article, we provide a within-group approach for ML-based causal inference methods in order to robustly estimate average treatment effects in multilevel studies when there is cluster-level unmeasured confounding. We focus on one particular ML-based causal inference method based on the targeted maximum likelihood estimation (TMLE) with an ensemble learner called SuperLearner. Through our simulation studies, we observe that training TMLE within groups of similar clusters helps remove bias from cluster-level unmeasured confounders. Also, using within-group propensity scores estimated from fixed effects logistic regression increases the robustness of the proposed within-group TMLE method. Even if the propensity scores are partially misspecified, the within-group TMLE still produces robust ATE estimates due to double robustness with flexible modeling, unlike parametric-based inverse propensity weighting methods. We demonstrate our proposed methods and conduct sensitivity analyses against the number of groups and individual-level unmeasured confounding to evaluate the effect of taking an eighth-grade algebra course on math achievement in the Early Childhood Longitudinal Study.","PeriodicalId":48001,"journal":{"name":"Journal of Educational and Behavioral Statistics","volume":" ","pages":""},"PeriodicalIF":2.4,"publicationDate":"2023-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48737730","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Latent Transition Cognitive Diagnosis Model With Covariates: A Three-Step Approach","authors":"Qianru Liang, Jimmy de la Torre, N. Law","doi":"10.3102/10769986231163320","DOIUrl":"https://doi.org/10.3102/10769986231163320","url":null,"abstract":"To expand the use of cognitive diagnosis models (CDMs) to longitudinal assessments, this study proposes a bias-corrected three-step estimation approach for latent transition CDMs with covariates by integrating a general CDM and a latent transition model. The proposed method can be used to assess changes in attribute mastery status and attribute profiles and to evaluate the covariate effects on both the initial state and transition probabilities over time using latent (multinomial) logistic regression. Because stepwise approaches generally yield biased estimates, correction for classification error probabilities is considered in this study. The results of the simulation study showed that the proposed method yielded more accurate parameter estimates than the uncorrected approach. The use of the proposed method is also illustrated using a set of real data.","PeriodicalId":48001,"journal":{"name":"Journal of Educational and Behavioral Statistics","volume":" ","pages":""},"PeriodicalIF":2.4,"publicationDate":"2023-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47630907","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Diagnosing Primary Students’ Reading Progression: Is Cognitive Diagnostic Computerized Adaptive Testing the Way Forward?","authors":"Yan Li, Chao-Hsien Huang, Jia Liu","doi":"10.3102/10769986231160668","DOIUrl":"https://doi.org/10.3102/10769986231160668","url":null,"abstract":"Cognitive diagnostic computerized adaptive testing (CD-CAT) is a cutting-edge technology in educational measurement that targets at providing feedback on examinees’ strengths and weaknesses while increasing test accuracy and efficiency. To date, most CD-CAT studies have made methodological progress under simulated conditions, but little has applied CD-CAT to real educational assessment. The present study developed a Chinese reading comprehension item bank tapping into six validated reading attributes, with 195 items calibrated using data of 28,485 second to sixth graders and the item-level cognitive diagnostic models (CDMs). The measurement precision and efficiency of the reading CD-CAT system were compared and optimized in terms of crucial CD-CAT settings, including the CDMs for calibration, item selection methods, and termination rules. The study identified seven dominant reading attribute mastery profiles that stably exist across grades. These major clusters of readers and their variety with grade indicated some sort of reading developmental mechanisms that advance and deepen step by step at the primary school level. Results also suggested that compared to traditional linear tests, CD-CAT significantly improved the classification accuracy without imposing much testing burden. These findings may elucidate the multifaceted nature and possible learning paths of reading and raise the question of whether CD-CAT is applicable to other educational domains where there is a need to provide formative and fine-grained feedback but where there is a limited amount of test time.","PeriodicalId":48001,"journal":{"name":"Journal of Educational and Behavioral Statistics","volume":"1 1","pages":""},"PeriodicalIF":2.4,"publicationDate":"2023-04-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41708221","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using Item Scores and Distractors to Detect Item Compromise and Preknowledge","authors":"Kylie Gorney, James A. Wollack, S. Sinharay, Carol Eckerly","doi":"10.3102/10769986231159923","DOIUrl":"https://doi.org/10.3102/10769986231159923","url":null,"abstract":"Any time examinees have had access to items and/or answers prior to taking a test, the fairness of the test and validity of test score interpretations are threatened. Therefore, there is a high demand for procedures to detect both compromised items (CI) and examinees with preknowledge (EWP). In this article, we develop a procedure that uses item scores and distractors to simultaneously detect CI and EWP. The false positive rate and true positive rate are evaluated for both items and examinees using detailed simulations. A real data example is also provided using data from an information technology certification exam.","PeriodicalId":48001,"journal":{"name":"Journal of Educational and Behavioral Statistics","volume":"48 1","pages":"636 - 660"},"PeriodicalIF":2.4,"publicationDate":"2023-04-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49258978","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Explicit Form With Continuous Attribute Profile of the Partial Mastery DINA Model","authors":"Tian Shu, Guanzhong Luo, Zhaosheng Luo, Xiaofeng Yu, Xiaojun Guo, Yujun Li","doi":"10.3102/10769986231159436","DOIUrl":"https://doi.org/10.3102/10769986231159436","url":null,"abstract":"Cognitive diagnosis models (CDMs) are the statistical framework for cognitive diagnostic assessment in education and psychology. They generally assume that subjects’ latent attributes are dichotomous—mastery or nonmastery, which seems quite deterministic. As an alternative to dichotomous attribute mastery, attention is drawn to the use of a continuous attribute mastery format in recent literature. To obtain subjects’ finer-grained attribute mastery for more precise diagnosis and guidance, an equivalent but more explicit form of the partial-mastery-deterministic inputs, noisy “and” gate (DINA) model (termed continuous attribute profile [CAP]-DINA form) is proposed in this article. Its parameters estimation algorithm based on this form using Bayesian techniques with Markov chain Monte Carlo algorithm is also presented. Two simulation studies are conducted then to explore its parameter recovery and model misspecification, and the results demonstrate that the CAP-DINA form performs robustly with satisfactory efficiency in these two aspects. A real data study of the English test also indicates it has a better model fit than DINA.","PeriodicalId":48001,"journal":{"name":"Journal of Educational and Behavioral Statistics","volume":"48 1","pages":"573 - 602"},"PeriodicalIF":2.4,"publicationDate":"2023-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42823878","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}