{"title":"Unravelling the small sample bias in AR(1) models: The pros and cons of available bias correction methods.","authors":"Zhiwei Dou, Sigert Ariens, Eva Ceulemans, Ginette Lafit","doi":"10.1111/bmsp.70038","DOIUrl":"https://doi.org/10.1111/bmsp.70038","url":null,"abstract":"<p><p>The first-order autoregressive [AR(1)] model is widely used to investigate psychological dynamics. This study focusses on the estimation and inference of the autoregressive (AR) effect in AR(1) models under a limited sample size-a common scenario in psychological research. State-of-the-art estimators of the autoregressive effect are known to be biased when sample sizes are small. We analytically demonstrate the causes and consequences of this small sample bias on the estimation of the AR effect, its variance and the AR(1) model's intercept, particularly when using OLS. In addition, we reviewed various bias correction methods proposed in the time-series literature. A simulation study compares the OLS estimator with these correction methods in terms of estimation accuracy and inference. The main result indicates that the small sample bias of the OLS estimator of the autoregressive effect is a consequence of limited information and correcting for this bias without more information always induces a bias-variance trade-off. Nevertheless, correction methods discussed in this research may offer improved statistical power under moderate sample sizes when the primary research goal is hypothesis testing.</p>","PeriodicalId":55322,"journal":{"name":"British Journal of Mathematical & Statistical Psychology","volume":" ","pages":""},"PeriodicalIF":1.8,"publicationDate":"2026-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147357621","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Shedding some light on the relationship between measurement error and statistical power in multilevel models applied to intensive longitudinal designs.","authors":"Ginette Lafit, Sigert Ariens, Richard Artner","doi":"10.1111/bmsp.70040","DOIUrl":"https://doi.org/10.1111/bmsp.70040","url":null,"abstract":"<p><p>We examine multilevel models applied to intensive longitudinal (IL) designs. Many measurements in IL research are influenced by measurement error, which can compromise the consistency of estimates obtained through maximum likelihood estimation (MLE). While previous research has addressed the impact of measurement error in broader multilevel contexts, its effects on statistical power-particularly concerning cross-level interaction effects-have not been thoroughly explored. This study aims to clarify the relationship between measurement error and statistical power by deriving analytical formulas for the asymptotic bias and precision matrix of the MLE of fixed effects in multilevel models that account for additive measurement errors and autoregressive [AR(1)] within-person errors. Furthermore, we analyse how different sources of measurement error affect the standard errors of fixed effects estimates and the overall statistical power, specifically when both time-varying and time-invariant predictors, as well as the response variable, are measured with error. Our findings show that measurement error in predictors leads to downward bias in the MLE of fixed effects, whereas MLE estimates of fixed effects remain unbiased when there is measurement error in the response variable. Finally, we explore how these errors influence statistical power for cross-level interaction effects between time-varying and time-invariant predictors.</p>","PeriodicalId":55322,"journal":{"name":"British Journal of Mathematical & Statistical Psychology","volume":" ","pages":""},"PeriodicalIF":1.8,"publicationDate":"2026-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147312733","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ReMoDe - Recursive modality detection in distributions of ordinal data.","authors":"Madlen Hoffstadt, Lourens Waldorp, Javier Garcia-Bernardo, Han van der Maas","doi":"10.1111/bmsp.70037","DOIUrl":"https://doi.org/10.1111/bmsp.70037","url":null,"abstract":"<p><p>The detection of the number of modes in distributions of ordinal data is relevant for applied researchers across disciplines, from uncovering polarization to detecting incidence groups in clinical symptom scales. Yet, established modality detection methods are either purely descriptive or not developed for ordinal data. In the present paper, we attempt to fill this gap by proposing a recursive modality detection method (ReMoDe) which detects modes in univariate distributions through recursive significance testing. We provide a comprehensive review of existing modality detection methods and outline their potential pitfalls when applied to ordinal scales. Based on a benchmark of 172 simulated ordinal samples of different sample sizes, we demonstrate that ReMoDe outperforms other established modality detection methods. We furthermore present a stability test for our method as well as p-values and approximated Bayes factors for each detected mode. To make our method easily applicable for researchers, we introduce open-source R and Python packages.</p>","PeriodicalId":55322,"journal":{"name":"British Journal of Mathematical & Statistical Psychology","volume":" ","pages":""},"PeriodicalIF":1.8,"publicationDate":"2026-02-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146230015","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Extending reliability to intensive longitudinal data with the Kalman filter.","authors":"Michael D Hunter","doi":"10.1111/bmsp.70039","DOIUrl":"https://doi.org/10.1111/bmsp.70039","url":null,"abstract":"<p><p>Reliability is central to how researchers approach measurement in standard, group-based analyses of single-time-point data, yet this critical aspect is often overlooked in the analysis of repeated observations. Since its inception, reliability has been a between-person concept, but we redevelop this notion for within-person designs by proposing a new coefficient <math> <semantics><mrow><mi>κ</mi></mrow> <annotation>$$ kappa $$</annotation></semantics> </math> of reliability for single-subject designs. This coefficient shares the same general definition of reliability as former coefficients-the ratio of the true score variance to the total variance-but applies to time-dependent within-person variability rather than independent between-person variability. Coefficient <math> <semantics><mrow><mi>κ</mi></mrow> <annotation>$$ kappa $$</annotation></semantics> </math> begins with a latent variable time series model called a state space model, and is then extended to a state space model for multiple subjects with continuous or discrete variation across people. Using analytic methods, we derive coefficient <math> <semantics><mrow><mi>κ</mi></mrow> <annotation>$$ kappa $$</annotation></semantics> </math> and prove its relations to other coefficients of reliability.</p>","PeriodicalId":55322,"journal":{"name":"British Journal of Mathematical & Statistical Psychology","volume":" ","pages":""},"PeriodicalIF":1.8,"publicationDate":"2026-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146222108","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An efficient MCMC-INLA algorithm for Bayesian inference of logistic graded response models.","authors":"Yu Zhou, Yincai Tang, Siliang Zhang","doi":"10.1111/bmsp.70033","DOIUrl":"https://doi.org/10.1111/bmsp.70033","url":null,"abstract":"<p><p>This paper proposes a Bayesian MCMC-INLA algorithm specifically designed for both unidimensional and multidimensional logistic graded response models (LGRMs). The algorithm incorporates a computationally efficient data augmentation approach by introducing Pólya-Gamma variables and latent variables, thereby addressing the limitations of traditional Bayesian MCMC methods in handling item response theory (IRT) models with logistic link functions. By integrating the advanced and efficient integrated nested Laplace approximation (INLA) framework, the MCMC-INLA algorithm achieves both high computational efficiency and estimation accuracy. The paper provides detailed derivations of the posterior and conditional distributions for IRT models, outlines the incorporation of Pólya-Gamma and latent variables within the Gibbs sampling procedure, and presents the implementation of the MCMC-INLA algorithm for both unidimensional and multidimensional cases. The performance of the proposed algorithm is evaluated through extensive simulation studies and an empirical application to the IPIP-NEO dataset. Potential extensions of the MCMC-INLA framework to other IRT models are also discussed.</p>","PeriodicalId":55322,"journal":{"name":"British Journal of Mathematical & Statistical Psychology","volume":" ","pages":""},"PeriodicalIF":1.8,"publicationDate":"2026-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146151406","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Variational Bayesian inference for sparse item response theory models.","authors":"Yemao Xia, Yu Xue, Depeng Jiang","doi":"10.1111/bmsp.70032","DOIUrl":"https://doi.org/10.1111/bmsp.70032","url":null,"abstract":"<p><p>Item response theory (IRT) model is a widely appreciated statistical method in exploring the relationship between individual latent traits and item responses. In this paper, a sparse IRT model is established to address the sparsity of factor loadings. A global and local shrinkage prior is imposed to penalize the factor loadings: the global parameter controls the amount of shrinkage at the column levels, while the local parameter adjusts the penalty of factor loadings within each column. We develop a variational Bayesian procedure to conduct posterior inference. By exploiting a stochastic representation for logistic function, we frame sparse IRT model as a mixture model mixing with Pólya-Gamma distribution. Such a strategy admits a conjugate posterior for the latent quantity, thus leading to a straightforward posterior computation. We assess the performance of the proposed method via a simulation study. A real example related to personality assessment is analysed to illustrate the usefulness of methodology.</p>","PeriodicalId":55322,"journal":{"name":"British Journal of Mathematical & Statistical Psychology","volume":" ","pages":""},"PeriodicalIF":1.8,"publicationDate":"2026-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146121229","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Latent Poisson count models for action count data from technology-enhanced assessments.","authors":"Gregory Arbet, Hyeon-Ah Kang","doi":"10.1111/bmsp.70036","DOIUrl":"10.1111/bmsp.70036","url":null,"abstract":"<p><p>Recent advances in computerized assessments have enabled the use of innovative item formats (e.g., drag-and-drop, scenario-based), necessitating a flexible model that can capture systematic influence of item types on action counts. In this study, we present a refinement scheme that can explicitly model common features of items and allows inference on the item-type effects. We apply multifaceted parameterization to characterize the common and unique features of items and implement the formulation in two existing models, the Rasch and Conway-Maxwell-Poisson count models. The inference procedures for the proposed models are presented using Stan and validated for estimation accuracy. Numerical experimentation with simulated data suggest that the proposed inferential scheme adequately recovers the underlying model parameters. Empirical application demonstrated that the proposed refinement holds practical relevance when data exhibit distinct item-type effects. Based on the findings from the empirical investigation, we discuss practical considerations in applying the Poisson models for analysing count data.</p>","PeriodicalId":55322,"journal":{"name":"British Journal of Mathematical & Statistical Psychology","volume":" ","pages":""},"PeriodicalIF":1.8,"publicationDate":"2026-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146114993","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Revisiting reliability with human and machine learning raters under scoring design and rater configuration in the many-facet Rasch model.","authors":"Xingyao Xiao, Richard J Patz, Mark R Wilson","doi":"10.1111/bmsp.70034","DOIUrl":"https://doi.org/10.1111/bmsp.70034","url":null,"abstract":"<p><p>Constructed-response (CR) items are widely used to assess higher order skills but require human scoring, which introduces variability and is costly at scale. Machine learning (ML)-based scoring offers a scalable alternative, yet its psychometric consequences in rater-mediated models remain underexplored. This study examines how scoring design, rater bias, ML inconsistency and model specification affect the reliability of ability estimation in polytomous CR assessments. Using Monte Carlo simulation, we manipulated human and ML rater bias, ML inconsistency and scoring density (complete, overlapping, isolated). Five estimation models were compared, including the Partial Credit Model (PCM) with fixed thresholds and the Many-Facet Partial Credit Model (MFPCM) with and without free calibration. Results showed that systematic bias, not random inconsistency, was the main source of error. Hybrid human-ML scoring improved estimation when raters were unbiased or exhibited opposing biases, but error compounded when biases aligned. Across designs, PCM with fixed thresholds consistently outperformed more complex alternatives, while anchoring CR items to selected-response metrics stabilized MFPCM estimation. The real data application replicated these patterns. Findings show that scoring design and bias structure, rather than model complexity, drive the benefits of hybrid scoring and that anchoring offers a practical strategy for stabilizing estimation.</p>","PeriodicalId":55322,"journal":{"name":"British Journal of Mathematical & Statistical Psychology","volume":" ","pages":""},"PeriodicalIF":1.8,"publicationDate":"2026-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146094849","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian inference for dynamic Q matrices and attribute trajectories in hidden Markov diagnostic classification models.","authors":"Chen-Wei Liu","doi":"10.1111/bmsp.70028","DOIUrl":"https://doi.org/10.1111/bmsp.70028","url":null,"abstract":"<p><p>Hidden Markov diagnostic classification models capture how students' cognitive attributes evolve over time. This paper introduces a Bayesian Markov chain Monte Carlo algorithm for diagnostic classification models that jointly estimates time-varying Q matrices, latent attributes, item parameters, attribute class proportions and transition matrices across multiple occasions. Using the R package hmdcm developed for this study, Monte Carlo simulations demonstrate accurate parameter recovery, and an empirical probability-concept assessment confirmed the algorithm's ability to trace attribute trajectories, supporting its value for longitudinal diagnostic classification in both research and instructional practice.</p>","PeriodicalId":55322,"journal":{"name":"British Journal of Mathematical & Statistical Psychology","volume":" ","pages":""},"PeriodicalIF":1.8,"publicationDate":"2026-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146013432","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing generalizability theory with mixed-effects models for heteroscedasticity in psychological measurement: A theoretical introduction with an application from EEG data.","authors":"Philippe Rast, Peter E Clayson","doi":"10.1111/bmsp.70026","DOIUrl":"10.1111/bmsp.70026","url":null,"abstract":"<p><p>Generalizability theory (G-theory) defines a statistical framework for assessing measurement reliability by decomposing observed variance into meaningful components attributable to persons, facets, and error. Classic G-theory assumes homoscedastic residual variances across measurement conditions, an assumption that is often violated in psychological and behavioural data. The main focus of this work is to extend G-theory using a mixed-effects location-scale model (MELSM) that allows residual error variance to vary systematically across conditions and persons. By modeling heteroscedasticity, we can extend the computation of condition-specific generalizability ( <math> <semantics> <mrow> <msub><mrow><mi>G</mi></mrow> <mrow><mi>t</mi></mrow> </msub> </mrow> <annotation>$$ {G}_t $$</annotation></semantics> </math> ) and dependability ( <math> <semantics> <mrow> <msub><mrow><mi>D</mi></mrow> <mrow><mi>t</mi></mrow> </msub> </mrow> <annotation>$$ {D}_t $$</annotation></semantics> </math> ) coefficients to reflect local reliability under varying degrees of measurement precision. As an illustration, we apply the model to empirical data from an EEG experiment and show that failing to account for variance heterogeneity can mask meaningful differences in measurement quality. A simulation-based decision study further demonstrates how targeted increases in measurement density can improve reliability for low-precision conditions or participants. The proposed framework retains the interpretative character of classical G-theory while enhancing its flexibility. We argue that it supports finer-grained insights on conditions that influence reliability and better-informed design decisions in psychological measurements. We discuss implications for individualized reliability assessment, adaptive measurement strategies, and future extensions to multi-facet designs.</p>","PeriodicalId":55322,"journal":{"name":"British Journal of Mathematical & Statistical Psychology","volume":" ","pages":""},"PeriodicalIF":1.8,"publicationDate":"2026-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145999604","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}