The effective sample size in Bayesian information criterion for level-specific fixed and random-effect selection in a two-level nested model
Authors: Sun-Joo Cho, Hao Wu, Matthew Naveiras
Journal: British Journal of Mathematical & Statistical Psychology, 77(2), 289-315 (published 2023-12-01)
DOI: 10.1111/bmsp.12327
URL: https://onlinelibrary.wiley.com/doi/10.1111/bmsp.12327
Abstract
Popular statistical software provides the Bayesian information criterion (BIC) for multi-level models or linear mixed models. However, the statistical literature and software documentation together give discrepant formulas for the BIC and leave it uncertain how the BIC should be used to select a multi-level model with respect to level-specific fixed and random effects. These discrepancies and uncertainties result from different specifications of sample size in the BIC's penalty term for multi-level models. In this study, we derive the BIC's penalty term for level-specific fixed- and random-effect selection in a two-level nested design. In this new version of BIC, called $\mathrm{BIC}_{E1}$, the penalty term is decomposed into two parts if the random-effect variance–covariance matrix has full rank: (a) a term with the log of average sample size per cluster and (b) the total number of parameters times the log of the total number of clusters. Furthermore, we derive a second new version of BIC, called $\mathrm{BIC}_{E2}$, for the presence of redundant random effects. We show via numerical demonstration that the derived formulae, $\mathrm{BIC}_{E1}$ and $\mathrm{BIC}_{E2}$, adhere to empirical values, and via a simulation study that $\mathrm{BIC}_{E}$ ($E$ indicating either $E1$ or $E2$) is the best global selection criterion, as it performs at least as well as BIC with the total sample size and BIC with the number of clusters across various multi-level conditions. In addition, the use of $\mathrm{BIC}_{E1}$ is illustrated with a textbook example dataset.
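To make the sample-size ambiguity concrete, the following Python sketch computes the two conventional variants named in the abstract (BIC with the total sample size and BIC with the number of clusters) and a generic two-part penalty of the shape described in (a) and (b). The abstract does not state the multiplier on the log of the average cluster size, so `weight_avg_size` is a hypothetical placeholder and `bic_two_part` is not the authors' $\mathrm{BIC}_{E1}$ formula. All function names are illustrative assumptions.

```python
import math


def bic_total_n(loglik, n_params, cluster_sizes):
    """BIC whose penalty uses the total number of level-1 observations."""
    n_total = sum(cluster_sizes)
    return -2.0 * loglik + n_params * math.log(n_total)


def bic_clusters(loglik, n_params, cluster_sizes):
    """BIC whose penalty uses the number of clusters (level-2 units)."""
    n_clusters = len(cluster_sizes)
    return -2.0 * loglik + n_params * math.log(n_clusters)


def bic_two_part(loglik, n_params, cluster_sizes, weight_avg_size):
    """Generic two-part penalty:
    weight_avg_size * log(average cluster size) + n_params * log(number of clusters).

    weight_avg_size is a hypothetical placeholder; the abstract does not
    give the multiplier used in the derived criterion.
    """
    n_clusters = len(cluster_sizes)
    avg_size = sum(cluster_sizes) / n_clusters
    return (
        -2.0 * loglik
        + weight_avg_size * math.log(avg_size)
        + n_params * math.log(n_clusters)
    )


# Made-up illustration values, not results from the paper.
example_sizes = [10, 12, 8, 15, 5]
example_loglik = -250.0
example_params = 4

print(bic_total_n(example_loglik, example_params, example_sizes))
print(bic_clusters(example_loglik, example_params, example_sizes))
print(bic_two_part(example_loglik, example_params, example_sizes, weight_avg_size=2))
```

Because the total sample size equals the number of clusters times the average cluster size, setting `weight_avg_size` to the number of parameters reproduces `bic_total_n`, and setting it to zero reproduces `bic_clusters`. The two conventional choices are therefore the extremes of this generic form.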
About the journal:
The British Journal of Mathematical and Statistical Psychology publishes articles relating to areas of psychology which have a greater mathematical or statistical aspect of their argument than is usually acceptable to other journals, including:
• mathematical psychology
• statistics
• psychometrics
• decision making
• psychophysics
• classification
• relevant areas of mathematics, computing and computer software
These include articles that address substantive psychological issues or that develop and extend techniques useful to psychologists. New models for psychological processes, new approaches to existing data, critiques of existing models and improved algorithms for estimating the parameters of a model are examples of articles which may be favoured.