{"title":"Confidence interval-based sample size determination formulas and some mathematical properties for hierarchical data.","authors":"S. Usami","doi":"10.1111/bmsp.12181","DOIUrl":"https://doi.org/10.1111/bmsp.12181","url":null,"abstract":"The use of hierarchical data (also called multilevel data or clustered data) is common in behavioural and psychological research when data of lower-level units (e.g., students, clients, repeated measures) are nested within clusters or higher-level units (e.g., classes, hospitals, individuals). Over the past 25 years we have seen great advances in methods for computing the sample sizes needed to obtain the desired statistical properties for such data in experimental evaluations. The present research provides closed-form and iterative formulas for sample size determination that can be used to ensure the desired width of confidence intervals for hierarchical data. Formulas are provided for a four-level hierarchical linear model that assumes slope variances and inclusion of covariates under both balanced and unbalanced designs. In addition, we address several mathematical properties relating to sample size determination for hierarchical data via the standard errors of experimental effect estimates. These include the relative impact of several indices (e.g., random intercept or slope variance at each level) on standard errors, asymptotic standard errors, minimum required values at the highest level, and generalized expressions of standard errors for designs with any-level randomization under any number of levels. In particular, information on the minimum required values will help researchers to minimize the risk of conducting experiments that are statistically unlikely to show the presence of an experimental effect.","PeriodicalId":272649,"journal":{"name":"The British journal of mathematical and statistical psychology","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128607599","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Score‐based measurement invariance checks for Bayesian maximum‐a‐posteriori estimates in item response theory","authors":"Rudolf Debelak, Samuel Pawel, C. Strobl, E. Merkle","doi":"10.31234/osf.io/24a9g","DOIUrl":"https://doi.org/10.31234/osf.io/24a9g","url":null,"abstract":"A family of score‐based tests has been proposed in recent years for assessing the invariance of model parameters in several models of item response theory (IRT). These tests were originally developed in a maximum likelihood framework. This study discusses analogous tests for Bayesian maximum‐a‐posteriori estimates and multiple‐group IRT models. We propose two families of statistical tests, which are based on an approximation using a pooled variance method, or on a simulation approach based on asymptotic results. The resulting tests were evaluated by a simulation study, which investigated their sensitivity against differential item functioning with respect to a categorical or continuous person covariate in the two‐ and three‐parametric logistic models. Whereas the method based on pooled variance was found to be useful in practice with maximum likelihood as well as maximum‐a‐posteriori estimates, the simulation‐based approach was found to require large sample sizes to lead to satisfactory results.","PeriodicalId":272649,"journal":{"name":"The British journal of mathematical and statistical psychology","volume":"91 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134349107","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A hierarchical latent response model for inferences about examinee engagement in terms of guessing and item-level non-response.","authors":"Esther Ulitzsch, Matthias von Davier, S. Pohl","doi":"10.1111/bmsp.12188","DOIUrl":"https://doi.org/10.1111/bmsp.12188","url":null,"abstract":"In low-stakes assessments, test performance has few or no consequences for examinees themselves, so that examinees may not be fully engaged when answering the items. Instead of engaging in solution behaviour, disengaged examinees might randomly guess or generate no response at all. When ignored, examinee disengagement poses a severe threat to the validity of results obtained from low-stakes assessments. Statistical modelling approaches in educational measurement have been proposed that account for non-response or for guessing, but do not consider both types of disengaged behaviour simultaneously. We bring together research on modelling examinee engagement and research on missing values and present a hierarchical latent response model for identifying and modelling the processes associated with examinee disengagement jointly with the processes associated with engaged responses. To that end, we employ a mixture model that identifies disengagement at the item-by-examinee level by assuming different data-generating processes underlying item responses and omissions, respectively, as well as response times associated with engaged and disengaged behaviour. By modelling examinee engagement with a latent response framework, the model allows assessing how examinee engagement relates to ability and speed as well as to identify items that are likely to evoke disengaged test-taking behaviour. An illustration of the model by means of an application to real data is presented.","PeriodicalId":272649,"journal":{"name":"The British journal of mathematical and statistical psychology","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134349927","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian power equivalence in latent growth curve models","authors":"A. Stefan, Timo von Oertzen","doi":"10.1111/bmsp.12193","DOIUrl":"https://doi.org/10.1111/bmsp.12193","url":null,"abstract":"Longitudinal studies are the gold standard for research on time‐dependent phenomena in the social sciences. However, they often entail high costs due to multiple measurement occasions and a long overall study duration. It is therefore useful to optimize these design factors while maintaining a high informativeness of the design. Von Oertzen and Brandmaier (2013,Psychology and Aging, 28, 414) applied power equivalence to show that Latent Growth Curve Models (LGCMs) with different design factors can have the same power for likelihood‐ratio tests on the latent structure. In this paper, we show that the notion of power equivalence can be extended to Bayesian hypothesis tests of the latent structure constants. Specifically, we show that the results of a Bayes factor design analysis (BFDA; Schönbrodt & Wagenmakers (2018,Psychonomic Bulletin and Review, 25, 128) of two power equivalent LGCMs are equivalent. This will be useful for researchers who aim to plan for compelling evidence instead of frequentist power and provides a contribution towards more efficient procedures for BFDA.","PeriodicalId":272649,"journal":{"name":"The British journal of mathematical and statistical psychology","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125796480","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The use of item scores and response times to detect examinees who may have benefited from item preknowledge.","authors":"S. Sinharay, Matthew S. Johnson","doi":"10.1111/bmsp.12187","DOIUrl":"https://doi.org/10.1111/bmsp.12187","url":null,"abstract":"According to Wollack and Schoenig (2018, The Sage encyclopedia of educational research, measurement, and evaluation. Thousand Oaks, CA: Sage, 260), benefiting from item preknowledge is one of the three broad types of test fraud that occur in educational assessments. We use tools from constrained statistical inference to suggest a new statistic that is based on item scores and response times and can be used to detect examinees who may have benefited from item preknowledge for the case when the set of compromised items is known. The asymptotic distribution of the new statistic under no preknowledge is proved to be a simple mixture of two χ2 distributions. We perform a detailed simulation study to show that the Type I error rate of the new statistic is very close to the nominal level and that the power of the new statistic is satisfactory in comparison to that of the existing statistics for detecting item preknowledge based on both item scores and response times. We also include a real data example to demonstrate the usefulness of the suggested statistic.","PeriodicalId":272649,"journal":{"name":"The British journal of mathematical and statistical psychology","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-08-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115047333","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian Gaussian distributional regression models for more efficient norm estimation","authors":"Lieke Voncken, T. Kneib, C. Albers, Nikolaus Umlauf, M. Timmerman","doi":"10.31234/osf.io/7j8ym","DOIUrl":"https://doi.org/10.31234/osf.io/7j8ym","url":null,"abstract":"A test score on a psychological test is usually expressed as a normed score, representing its position relative to test scores in a reference population. These typically depend on predictor(s) such as age. The test score distribution conditional on predictors is estimated using regression, which may need large normative samples to estimate the relationships between the predictor(s) and the distribution characteristics properly. In this study, we examine to what extent this burden can be alleviated by using prior information in the estimation of new norms with Bayesian Gaussian distributional regression. In a simulation study, we investigate to what extent this norm estimation is more efficient and how robust it is to prior model deviations. We varied the prior type, prior misspecification and sample size. In our simulated conditions, using a fixed effects prior resulted in more efficient norm estimation than a weakly informative prior as long as the prior misspecification was not age dependent. With the proposed method and reasonable prior information, the same norm precision can be achieved with a smaller normative sample, at least in empirical problems similar to our simulated conditions. This may help test developers to achieve cost‐efficient high‐quality norms. The method is illustrated using empirical normative data from the IDS‐2 intelligence test.","PeriodicalId":272649,"journal":{"name":"The British journal of mathematical and statistical psychology","volume":"214 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128170495","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A note on monotonicity of item response functions for ordered polytomous item response theory models.","authors":"Hyeon-Ah Kang, Ya-Hui Su, Hua-Hua Chang","doi":"10.1111/bmsp.12131","DOIUrl":"https://doi.org/10.1111/bmsp.12131","url":null,"abstract":"<p><p>A monotone relationship between a true score (τ) and a latent trait level (θ) has been a key assumption for many psychometric applications. The monotonicity property in dichotomous response models is evident as a result of a transformation via a test characteristic curve. Monotonicity in polytomous models, in contrast, is not immediately obvious because item response functions are determined by a set of response category curves, which are conceivably non-monotonic in θ. The purpose of the present note is to demonstrate strict monotonicity in ordered polytomous item response models. Five models that are widely used in operational assessments are considered for proof: the generalized partial credit model (Muraki, 1992, Applied Psychological Measurement, 16, 159), the nominal model (Bock, 1972, Psychometrika, 37, 29), the partial credit model (Masters, 1982, Psychometrika, 47, 147), the rating scale model (Andrich, 1978, Psychometrika, 43, 561), and the graded response model (Samejima, 1972, A general model for free-response data (Psychometric Monograph no. 18). Psychometric Society, Richmond). The study asserts that the item response functions in these models strictly increase in θ and thus there exists strict monotonicity between τ and θ under certain specified conditions. This conclusion validates the practice of customarily using τ in place of θ in applied settings and provides theoretical grounds for one-to-one transformations between the two scales.</p>","PeriodicalId":272649,"journal":{"name":"The British journal of mathematical and statistical psychology","volume":"71 3","pages":"523-535"},"PeriodicalIF":2.6,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1111/bmsp.12131","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"35894261","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A penalized likelihood method for multi-group structural equation modelling.","authors":"Po-Hsien Huang","doi":"10.1111/bmsp.12130","DOIUrl":"https://doi.org/10.1111/bmsp.12130","url":null,"abstract":"<p><p>In the past two decades, statistical modelling with sparsity has become an active research topic in the fields of statistics and machine learning. Recently, Huang, Chen and Weng (2017, Psychometrika, 82, 329) and Jacobucci, Grimm, and McArdle (2016, Structural Equation Modeling: A Multidisciplinary Journal, 23, 555) both proposed sparse estimation methods for structural equation modelling (SEM). These methods, however, are restricted to performing single-group analysis. The aim of the present work is to establish a penalized likelihood (PL) method for multi-group SEM. Our proposed method decomposes each group model parameter into a common reference component and a group-specific increment component. By penalizing the increment components, the heterogeneity of parameter values across the population can be explored since the null group-specific effects are expected to diminish. We developed an expectation-conditional maximization algorithm to optimize the PL criteria. A numerical experiment and a real data example are presented to demonstrate the potential utility of the proposed method.</p>","PeriodicalId":272649,"journal":{"name":"The British journal of mathematical and statistical psychology","volume":"71 3","pages":"499-522"},"PeriodicalIF":2.6,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1111/bmsp.12130","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"35880118","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bias-corrected estimation of the Rudas-Clogg-Lindsay mixture index of fit.","authors":"Jenő Reiczigel, Márton Ispány, Gábor Tusnády, György Michaletzky, Marco Marozzi","doi":"10.1111/bmsp.12118","DOIUrl":"https://doi.org/10.1111/bmsp.12118","url":null,"abstract":"<p><p>Rudas, Clogg, and Lindsay (1994, J. R Stat Soc. Ser. B, 56, 623) introduced the so-called mixture index of fit, also known as pi-star (π*), for quantifying the goodness of fit of a model. It is the lowest proportion of 'contamination' which, if removed from the population or from the sample, makes the fit of the model perfect. The mixture index of fit has been widely used in psychometric studies. We show that the asymptotic confidence limits proposed by Rudas et al. (1994, J. R Stat Soc. Ser. B, 56, 623) as well as the jackknife confidence interval by Dayton (, Br. J. Math. Stat. Psychol., 56, 1) perform poorly, and propose a new bias-corrected point estimate, a bootstrap test and confidence limits for pi-star. The proposed confidence limits have coverage probability much closer to the nominal level than the other methods do. We illustrate the usefulness of the proposed method in practice by presenting some practical applications to log-linear models for contingency tables.</p>","PeriodicalId":272649,"journal":{"name":"The British journal of mathematical and statistical psychology","volume":"71 3","pages":"459-471"},"PeriodicalIF":2.6,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1111/bmsp.12118","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"35345110","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Numerical approximation of the observed information matrix with Oakes' identity.","authors":"R Philip Chalmers","doi":"10.1111/bmsp.12127","DOIUrl":"https://doi.org/10.1111/bmsp.12127","url":null,"abstract":"<p><p>An efficient and accurate numerical approximation methodology useful for obtaining the observed information matrix and subsequent asymptotic covariance matrix when fitting models with the EM algorithm is presented. The numerical approximation approach is compared to existing algorithms intended for the same purpose, and the computational benefits and accuracy of this new approach are highlighted. Instructive and real-world examples are included to demonstrate the methodology concretely, properties of the estimator are discussed in detail, and a Monte Carlo simulation study is included to investigate the behaviour of a multi-parameter item response theory model using three competing finite-difference algorithms.</p>","PeriodicalId":272649,"journal":{"name":"The British journal of mathematical and statistical psychology","volume":"71 3","pages":"415-436"},"PeriodicalIF":2.6,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1111/bmsp.12127","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"35721601","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}