Bayesian penalty methods for evaluating measurement invariance in moderated nonlinear factor analysis.
Authors: Holger Brandt, Siyuan Marco Chen, Daniel J Bauer
Journal: Psychological Methods (impact factor 7.6; JCR Q1, Psychology, Multidisciplinary)
Published: 2023-06-08
DOI: 10.1037/met0000552 (https://doi.org/10.1037/met0000552)
Citations: 1
Abstract
Measurement invariance (MI) is one of the main psychometric requirements for analyses that focus on potentially heterogeneous populations. MI allows researchers to compare latent factor scores across persons from different subgroups, whereas if a measure is not invariant across all items and persons, such comparisons may be misleading. If full MI does not hold, further testing may identify problematic items showing differential item functioning (DIF). Most methods developed to test DIF have focused on simple scenarios, often with comparisons across two groups. In practical applications, this is an oversimplification if many grouping variables (e.g., gender, race) or continuous covariates (e.g., age) exist that might influence the measurement properties of items; these variables are often correlated, making traditional tests that consider each variable separately less useful. Here, we propose the application of Bayesian Moderated Nonlinear Factor Analysis to overcome limitations of traditional approaches to detect DIF. We investigate how modern Bayesian shrinkage priors can be used to identify DIF items in situations with many groups and continuous covariates. We compare the performance of lasso-type, spike-and-slab, and global-local shrinkage priors (e.g., horseshoe) to standard normal and small variance priors. Results indicate that spike-and-slab and lasso priors outperform the other priors. Horseshoe priors provide slightly lower power compared to lasso and spike-and-slab priors. Small variance priors result in very low power to detect DIF with sample sizes below 800, and normal priors may produce severely inflated Type I error rates. We illustrate the approach with data from the PISA 2018 study. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
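The prior families named in the abstract differ mainly in how much probability they place near zero (shrinking negligible DIF effects) and how much they leave in the tails (letting real DIF effects through). The following is a minimal illustrative sketch, not the authors' implementation: it draws from textbook forms of each prior using only the Python standard library and compares those two quantities. All function names, hyperparameter values (`sd`, `scale`, `tau`, `p_slab`, `spike_sd`, `slab_sd`), and the thresholds 0.05 and 3.0 are assumptions chosen for illustration and are not taken from the paper.

```python
import math
import random


def draw_normal(rng, sd=1.0):
    """Standard normal prior on a DIF effect (sd is an illustrative choice)."""
    return rng.gauss(0.0, sd)


def draw_small_variance(rng, sd=0.05):
    """Small variance prior: a normal tightly concentrated around zero."""
    return rng.gauss(0.0, sd)


def draw_lasso(rng, scale=1.0):
    """Lasso-type (Laplace) prior.

    The difference of two i.i.d. exponentials with mean `scale`
    follows a Laplace(0, scale) distribution.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def draw_horseshoe(rng, tau=1.0):
    """Horseshoe prior: normal with a half-Cauchy(0, 1) local scale."""
    # Inverse-CDF draw from a standard Cauchy, folded to the positive half.
    local_scale = abs(math.tan(math.pi * (rng.random() - 0.5)))
    return rng.gauss(0.0, tau * local_scale)


def draw_spike_and_slab(rng, p_slab=0.5, spike_sd=0.01, slab_sd=1.0):
    """Spike-and-slab prior as a two-component normal mixture."""
    sd = slab_sd if rng.random() < p_slab else spike_sd
    return rng.gauss(0.0, sd)


PRIORS = {
    "normal": draw_normal,
    "small_variance": draw_small_variance,
    "lasso": draw_lasso,
    "horseshoe": draw_horseshoe,
    "spike_and_slab": draw_spike_and_slab,
}


def compare_priors(n=20000, seed=1, near=0.05, far=3.0):
    """Return {prior: (share of draws with |d| < near, share with |d| > far)}."""
    rng = random.Random(seed)
    summary = {}
    for name, draw in PRIORS.items():
        draws = [draw(rng) for _ in range(n)]
        summary[name] = (
            sum(abs(d) < near for d in draws) / n,
            sum(abs(d) > far for d in draws) / n,
        )
    return summary


if __name__ == "__main__":
    for name, (p_near, p_far) in compare_priors().items():
        print(f"{name:>15}: P(|d|<0.05)={p_near:.3f}  P(|d|>3)={p_far:.3f}")
```

Under these assumed settings, the small variance prior places essentially no mass on large effects, while the horseshoe combines more mass near zero with heavier tails than the standard normal. This prior-only comparison says nothing about power or Type I error rates, which the paper evaluates through simulation.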
About the journal:
Psychological Methods is devoted to the development and dissemination of methods for collecting, analyzing, understanding, and interpreting psychological data. Its purpose is the dissemination of innovations in research design, measurement, methodology, and quantitative and qualitative analysis to the psychological community; its further purpose is to promote effective communication about related substantive and methodological issues. The audience is expected to be diverse and to include those who develop new procedures, those who are responsible for undergraduate and graduate training in design, measurement, and statistics, as well as those who employ those procedures in research.