Psychometrika, Pub Date: 2024-12-01, Epub Date: 2024-05-30, DOI: 10.1007/s11336-024-09978-1
Sainan Xu, Jing Lu, Jiwei Zhang, Chun Wang, Gongjun Xu
{"title":"Optimizing Large-Scale Educational Assessment with a \"Divide-and-Conquer\" Strategy: Fast and Efficient Distributed Bayesian Inference in IRT Models.","authors":"Sainan Xu, Jing Lu, Jiwei Zhang, Chun Wang, Gongjun Xu","doi":"10.1007/s11336-024-09978-1","DOIUrl":"10.1007/s11336-024-09978-1","url":null,"abstract":"<p><p>With the growing attention on large-scale educational testing and assessment, the ability to process substantial volumes of response data becomes crucial. Current estimation methods within item response theory (IRT), despite their high precision, often pose considerable computational burdens with large-scale data, leading to reduced computational speed. This study introduces a novel \"divide-and-conquer\" parallel algorithm built on the Wasserstein posterior approximation concept, aiming to enhance computational speed while maintaining accurate parameter estimation. This algorithm enables drawing parameters from segmented data subsets in parallel, followed by an amalgamation of these parameters via Wasserstein posterior approximation. Theoretical support for the algorithm is established through asymptotic optimality under certain regularity assumptions. Practical validation is demonstrated using real-world data from the Programme for International Student Assessment. Ultimately, this research proposes a transformative approach to managing educational big data, offering a scalable, efficient, and precise alternative that promises to redefine traditional practices in educational assessments.</p>","PeriodicalId":54534,"journal":{"name":"Psychometrika","volume":" ","pages":"1119-1147"},"PeriodicalIF":2.9,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141176735","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
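The combination step described in the abstract above (draw from each data subset in parallel, then merge the draws) can be illustrated for a single scalar parameter. The sketch below is not the authors' implementation: it assumes a one-dimensional parameter, for which the Wasserstein-2 barycenter of several distributions has a quantile function equal to the average of their quantile functions. The function name `combine_subset_draws`, the quantile grid size, and the simulated subset draws are all hypothetical, and any adjustment of the subset posteriors made in the paper is not shown.

```python
import numpy as np


def combine_subset_draws(subset_draws, n_quantiles=999):
    """Average the empirical quantile functions of several sets of draws.

    For a scalar parameter this approximates the Wasserstein-2 barycenter
    of the subset posteriors on an evenly spaced probability grid.
    """
    # Interior probability grid (excludes 0 and 1 to avoid extreme draws).
    probs = np.linspace(0.0, 1.0, n_quantiles + 2)[1:-1]
    # One row of empirical quantiles per subset.
    quantiles = np.stack(
        [np.quantile(np.asarray(d, dtype=float), probs) for d in subset_draws]
    )
    # Averaging across subsets gives the barycenter's quantile function.
    return quantiles.mean(axis=0)


# Hypothetical posterior draws of one item parameter from three data subsets.
rng = np.random.default_rng(0)
subset_draws = [rng.normal(loc=m, scale=0.1, size=2000) for m in (0.95, 1.00, 1.05)]

combined = combine_subset_draws(subset_draws)
print(combined.mean(), np.quantile(combined, [0.025, 0.975]))
```

The returned array can be treated as an approximate sample from the combined posterior, so summaries such as its mean or an interval follow directly from it. Multivariate parameters need a general barycenter solver rather than this quantile shortcut.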
Psychometrika, Pub Date: 2024-12-01, Epub Date: 2024-10-22, DOI: 10.1007/s11336-024-10003-8
Robert J Mislevy
{"title":"Are Sum Scores a Great Accomplishment of Psychometrics or Intuitive Test Theory?","authors":"Robert J Mislevy","doi":"10.1007/s11336-024-10003-8","DOIUrl":"10.1007/s11336-024-10003-8","url":null,"abstract":"<p><p>Sijtsma, Ellis, and Borsboom (Psychometrika, 89:84-117, 2024. https://doi.org/10.1007/s11336-024-09964-7) provide a thoughtful treatment of the value and properties of sum scores and classical test theory, at a depth with which few practicing psychometricians are familiar. In this note, I offer comments on their article from the perspective of evidentiary reasoning.</p>","PeriodicalId":54534,"journal":{"name":"Psychometrika","volume":" ","pages":"1170-1174"},"PeriodicalIF":2.9,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142481089","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Psychometrika, Pub Date: 2024-12-01, Epub Date: 2024-07-05, DOI: 10.1007/s11336-024-09983-4
Seunghyun Lee, Yuqi Gu
{"title":"New Paradigm of Identifiable General-response Cognitive Diagnostic Models: Beyond Categorical Data.","authors":"Seunghyun Lee, Yuqi Gu","doi":"10.1007/s11336-024-09983-4","DOIUrl":"10.1007/s11336-024-09983-4","url":null,"abstract":"<p><p>Cognitive diagnostic models (CDMs) are a popular family of discrete latent variable models that model students' mastery or deficiency of multiple fine-grained skills. CDMs have been most widely used to model categorical item response data such as binary or polytomous responses. With advances in technology and the emergence of varying test formats in modern educational assessments, new response types, including continuous responses such as response times, and count-valued responses from tests with repetitive tasks or eye-tracking sensors, have also become available. Variants of CDMs have been proposed recently for modeling such responses. However, whether these extended CDMs are identifiable and estimable is entirely unknown. We propose a very general cognitive diagnostic modeling framework for arbitrary types of multivariate responses with minimal assumptions, and establish identifiability in this general setting. Surprisingly, we prove that our general-response CDMs are identifiable under $Q$-matrix-based conditions similar to those for traditional categorical-response CDMs. Our conclusions set up a new paradigm of identifiable general-response CDMs. We propose an EM algorithm to efficiently estimate a broad class of exponential family-based general-response CDMs. We conduct simulation studies under various response types. The simulation results not only corroborate our identifiability theory, but also demonstrate the superior empirical performance of our estimation algorithms. We illustrate our methodology by applying it to a TIMSS 2019 response time dataset.</p>","PeriodicalId":54534,"journal":{"name":"Psychometrika","volume":" ","pages":"1304-1336"},"PeriodicalIF":2.9,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141535981","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Psychometrika, Pub Date: 2024-12-01, Epub Date: 2024-06-11, DOI: 10.1007/s11336-024-09984-3
Teague R Henry, Lindley R Slipetz, Ami Falk, Jiaxing Qiu, Meng Chen
{"title":"Ordinal Outcome State-Space Models for Intensive Longitudinal Data.","authors":"Teague R Henry, Lindley R Slipetz, Ami Falk, Jiaxing Qiu, Meng Chen","doi":"10.1007/s11336-024-09984-3","DOIUrl":"10.1007/s11336-024-09984-3","url":null,"abstract":"<p><p>Intensive longitudinal (IL) data are increasingly prevalent in psychological science, coinciding with technological advancements that make it simple to deploy study designs such as daily diary and ecological momentary assessments. IL data are characterized by a rapid rate of data collection (1+ collections per day), over a period of time, allowing for the capture of the dynamics that underlie psychological and behavioral processes. One powerful framework for analyzing IL data is state-space modeling, where observed variables are considered measurements for underlying states (i.e., latent variables) that change together over time. However, state-space modeling has typically relied on continuous measurements, whereas psychological data often come in the form of ordinal measurements such as Likert scale items. In this manuscript, we develop a general estimation approach for state-space models with ordinal measurements, specifically focusing on a graded response model for Likert scale items. We evaluate the performance of our model and estimator against that of the commonly used \"linear approximation\" model, which treats ordinal measurements as though they are continuous. We find that our model resulted in unbiased estimates of the state dynamics, while the linear approximation resulted in strongly biased estimates of the state dynamics. Finally, we develop an approximate standard error, termed slice standard errors, and show that these approximate standard errors are more liberal than true standard errors (i.e., smaller) at a consistent bias.</p>","PeriodicalId":54534,"journal":{"name":"Psychometrika","volume":" ","pages":"1203-1229"},"PeriodicalIF":2.9,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11582181/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141302095","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Psychometrika, Pub Date: 2024-12-01, Epub Date: 2024-07-06, DOI: 10.1007/s11336-024-09985-2
Siliang Zhang, Yunxiao Chen
{"title":"A Note on Ising Network Analysis with Missing Data.","authors":"Siliang Zhang, Yunxiao Chen","doi":"10.1007/s11336-024-09985-2","DOIUrl":"10.1007/s11336-024-09985-2","url":null,"abstract":"<p><p>The Ising model has become a popular psychometric model for analyzing item response data. The statistical inference of the Ising model is typically carried out via a pseudo-likelihood, as the standard likelihood approach suffers from a high computational cost when there are many variables (i.e., items). Unfortunately, the presence of missing values can hinder the use of pseudo-likelihood, and a listwise deletion approach for missing data treatment may introduce a substantial bias into the estimation and sometimes yield misleading interpretations. This paper proposes a conditional Bayesian framework for Ising network analysis with missing data, which integrates a pseudo-likelihood approach with iterative data imputation. An asymptotic theory is established for the method. Furthermore, a computationally efficient Pólya-Gamma data augmentation procedure is proposed to streamline the sampling of model parameters. The method's performance is shown through simulations and a real-world application to data on major depressive and generalized anxiety disorders from the National Epidemiological Survey on Alcohol and Related Conditions (NESARC).</p>","PeriodicalId":54534,"journal":{"name":"Psychometrika","volume":" ","pages":"1186-1202"},"PeriodicalIF":2.9,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11582142/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141545557","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Psychometrika, Pub Date: 2024-12-01, Epub Date: 2024-07-20, DOI: 10.1007/s11336-024-09988-z
Daniel McNeish
{"title":"Practical Implications of Sum Scores Being Psychometrics' Greatest Accomplishment.","authors":"Daniel McNeish","doi":"10.1007/s11336-024-09988-z","DOIUrl":"10.1007/s11336-024-09988-z","url":null,"abstract":"<p><p>This paper reflects on some practical implications of the excellent treatment of sum scoring and classical test theory (CTT) by Sijtsma et al. (Psychometrika 89(1):84-117, 2024). I have no major disagreements about the content they present and found it to be an informative clarification of the properties and possible extensions of CTT. In this paper, I focus on whether sum scores, despite their mathematical justification, are positioned to improve psychometric practice in empirical studies in psychology, education, and adjacent areas. First, I summarize recent reviews of psychometric practice in empirical studies, subsequent calls for greater psychometric transparency and validity, and how sum scores may or may not be positioned to adhere to such calls. Second, I consider limitations of sum scores for prediction, especially in the presence of common features like ordinal or Likert response scales, multidimensional constructs, and moderated or heterogeneous associations. Third, I review previous research outlining potential limitations of using sum scores as outcomes in subsequent analyses where rank ordering is not always sufficient to successfully characterize group differences or change over time. Fourth, I cover potential challenges for providing validity evidence for whether sum scores represent a single construct, particularly if one wishes to maintain minimal CTT assumptions. I conclude with thoughts about whether sum scores, even if mathematically justified, are positioned to improve psychometric practice in empirical studies.</p>","PeriodicalId":54534,"journal":{"name":"Psychometrika","volume":" ","pages":"1148-1169"},"PeriodicalIF":2.9,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141731649","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Psychometrika, Pub Date: 2024-12-01, Epub Date: 2024-07-21, DOI: 10.1007/s11336-024-09982-5
Jules L Ellis, Klaas Sijtsma, Kristel de Groot, Patrick J F Groenen
{"title":"Reliability Theory for Measurements with Variable Test Length, Illustrated with ERN and Pe Collected in the Flanker Task.","authors":"Jules L Ellis, Klaas Sijtsma, Kristel de Groot, Patrick J F Groenen","doi":"10.1007/s11336-024-09982-5","DOIUrl":"10.1007/s11336-024-09982-5","url":null,"abstract":"<p><p>In psychophysiology, an interesting question is how to estimate the reliability of event-related potentials collected by means of the Eriksen Flanker Task or similar tests. A special problem presents itself if the data represent neurological reactions that are associated with some responses (in case of the Flanker Task, responding incorrectly on a trial) but not others (like when providing a correct response), inherently resulting in unequal numbers of observations per subject. The general trend in reliability research here is to use generalizability theory and Bayesian estimation. We show that a new approach based on classical test theory and frequentist estimation can do the job as well and in a simpler way, and even provides additional insight into matters that were unsolved in the generalizability method approach. One of our contributions is the definition of a single, overall reliability coefficient for an entire group of subjects with unequal numbers of observations. The two methods have slightly different objectives. We argue in favor of the classical approach but without rejecting the generalizability approach.</p>","PeriodicalId":54534,"journal":{"name":"Psychometrika","volume":" ","pages":"1280-1303"},"PeriodicalIF":2.9,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11582099/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141735703","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Psychometrika, Pub Date: 2024-12-01, Epub Date: 2024-08-10, DOI: 10.1007/s11336-024-09998-x
Na Shan, Ping-Feng Xu
{"title":"Bayesian Adaptive Lasso for Detecting Item-Trait Relationship and Differential Item Functioning in Multidimensional Item Response Theory Models.","authors":"Na Shan, Ping-Feng Xu","doi":"10.1007/s11336-024-09998-x","DOIUrl":"10.1007/s11336-024-09998-x","url":null,"abstract":"<p><p>In multidimensional tests, the identification of latent traits measured by each item is crucial. In addition to item-trait relationship, differential item functioning (DIF) is routinely evaluated to ensure valid comparison among different groups. The two problems are investigated separately in the literature. This paper uses a unified framework for detecting item-trait relationship and DIF in multidimensional item response theory (MIRT) models. By incorporating DIF effects in MIRT models, these problems can be considered as variable selection for latent/observed variables and their interactions. A Bayesian adaptive Lasso procedure is developed for variable selection, in which item-trait relationship and DIF effects can be obtained simultaneously. Simulation studies show the performance of our method for parameter estimation, the recovery of item-trait relationship and the detection of DIF effects. An application is presented using data from the Eysenck Personality Questionnaire.</p>","PeriodicalId":54534,"journal":{"name":"Psychometrika","volume":" ","pages":"1337-1365"},"PeriodicalIF":2.9,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141914581","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Psychometrika, Pub Date: 2024-12-01, Epub Date: 2024-08-30, DOI: 10.1007/s11336-024-10000-x
Khadiga H A Sayed, Maarten J L F Cruyff, Peter G M van der Heijden
{"title":"Modeling Evasive Response Bias in Randomized Response: Cheater Detection Versus Self-protective No-Saying.","authors":"Khadiga H A Sayed, Maarten J L F Cruyff, Peter G M van der Heijden","doi":"10.1007/s11336-024-10000-x","DOIUrl":"10.1007/s11336-024-10000-x","url":null,"abstract":"<p><p>Randomized response is an interview technique for sensitive questions designed to eliminate evasive response bias. Since this elimination is only partially successful, two models have been proposed for modeling evasive response bias: the cheater detection model for a design with two sub-samples with different randomization probabilities and the self-protective no sayers model for a design with multiple sensitive questions. This paper shows the correspondence between these models, and introduces models for the new, hybrid \"ever/last year\" design that account for self-protective no saying and cheating. The model for one set of ever/last year questions has a degree of freedom that can be used for the inclusion of a response bias parameter. Models with multiple degrees of freedom are introduced for extensions of the design with a third randomized response question and a second set of ever/last year questions. The models are illustrated with two surveys on doping use. We conclude with a discussion of the pros and cons of the ever/last year design and its potential for future research.</p>","PeriodicalId":54534,"journal":{"name":"Psychometrika","volume":" ","pages":"1261-1279"},"PeriodicalIF":2.9,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11582306/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142114830","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Psychometrika, Pub Date: 2024-12-01, Epub Date: 2024-08-17, DOI: 10.1007/s11336-024-09997-y
Zhongtian Lin, Tao Jiang, Frank Rijmen, Paul Van Wamelen
{"title":"Asymptotically Correct Person Fit z-Statistics For the Rasch Testlet Model.","authors":"Zhongtian Lin, Tao Jiang, Frank Rijmen, Paul Van Wamelen","doi":"10.1007/s11336-024-09997-y","DOIUrl":"10.1007/s11336-024-09997-y","url":null,"abstract":"<p><p>A well-known person fit statistic in the item response theory (IRT) literature is the $l_z$ statistic (Drasgow et al. in Br J Math Stat Psychol 38(1):67-86, 1985). Snijders (Psychometrika 66(3):331-342, 2001) derived $l_z^*$, which is the asymptotically correct version of $l_z$ when the ability parameter is estimated. However, both statistics and other extensions later developed concern either only the unidimensional IRT models or multidimensional models that require a joint estimate of latent traits across all the dimensions. Considering a marginalized maximum likelihood ability estimator, this paper proposes $l_{zt}$ and $l_{zt}^*$, which are extensions of $l_z$ and $l_z^*$, respectively, for the Rasch testlet model. The computation of $l_{zt}^*$ relies on several extensions of the Lord-Wingersky algorithm (1984) that are additional contributions of this paper. Simulation results show that $l_{zt}^*$ has close-to-nominal Type I error rates and satisfactory power for detecting aberrant responses. For unidimensional models, $l_{zt}$ and $l_{zt}^*$ reduce to $l_z$ and $l_z^*$, respectively, and therefore allow for the evaluation of person fit with a wider range of IRT models. A real data application is presented to show the utility of the proposed statistics for a test with an underlying structure that consists of both the traditional unidimensional component and the Rasch testlet component.</p>","PeriodicalId":54534,"journal":{"name":"Psychometrika","volume":" ","pages":"1230-1260"},"PeriodicalIF":2.9,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141996955","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
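For orientation, the classical person fit statistic that the abstract above takes as its starting point can be computed in a few lines for dichotomous items. This is a minimal sketch of the standardized log-likelihood statistic of Drasgow et al. with ability treated as known; it is not the testlet extension or the corrected version proposed in the paper. The function name `lz_statistic`, the Rasch item difficulties, and the two response patterns are hypothetical values chosen only for illustration.

```python
import numpy as np


def lz_statistic(responses, probs):
    """Standardized log-likelihood person fit statistic for 0/1 responses.

    responses: observed item scores (0 or 1).
    probs: model-implied probabilities of a correct response, in (0, 1).
    """
    u = np.asarray(responses, dtype=float)
    p = np.asarray(probs, dtype=float)
    # Observed log-likelihood of the response pattern.
    log_lik = np.sum(u * np.log(p) + (1.0 - u) * np.log(1.0 - p))
    # Its expectation and variance under the model.
    expected = np.sum(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))
    variance = np.sum(p * (1.0 - p) * np.log(p / (1.0 - p)) ** 2)
    return (log_lik - expected) / np.sqrt(variance)


# Hypothetical Rasch setup: one person with ability 0 and five items.
theta = 0.0
difficulty = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
probs = 1.0 / (1.0 + np.exp(-(theta - difficulty)))

lz_fit = lz_statistic([1, 1, 1, 0, 0], probs)     # easy items right, hard items wrong
lz_misfit = lz_statistic([0, 0, 1, 1, 1], probs)  # reversed, aberrant pattern
print(lz_fit, lz_misfit)
```

Large negative values flag response patterns that are unlikely under the model. In practice ability is estimated rather than known, which is the situation the corrected statistics in the abstract address.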