The effective sample size in Bayesian information criterion for level-specific fixed and random-effect selection in a two-level nested model

Impact factor 1.8 · Region 3 (Psychology) · JCR Q3 (Mathematics, Interdisciplinary Applications)

British Journal of Mathematical & Statistical Psychology · Pub Date: 2023-12-01 · DOI: 10.1111/bmsp.12327

Sun-Joo Cho, Hao Wu, Matthew Naveiras

Volume 77(2), pages 289-315. Publisher page: https://bpspsychub.onlinelibrary.wiley.com/doi/10.1111/bmsp.12327


Abstract

Popular statistical software provides the Bayesian information criterion (BIC) for multi-level models or linear mixed models. However, it has been observed that the combination of statistical literature and software documentation has led to discrepancies in the formulas of the BIC and uncertainties as to the proper use of the BIC in selecting a multi-level model with respect to level-specific fixed and random effects. These discrepancies and uncertainties result from different specifications of sample size in the BIC's penalty term for multi-level models. In this study, we derive the BIC's penalty term for level-specific fixed- and random-effect selection in a two-level nested design. In this new version of BIC, called $\mathrm{BIC}_{E1}$, this penalty term is decomposed into two parts if the random-effect variance–covariance matrix has full rank: (a) a term with the log of average sample size per cluster and (b) the total number of parameters times the log of the total number of clusters. Furthermore, we derive the new version of BIC, called $\mathrm{BIC}_{E2}$, in the presence of redundant random effects. We show that the derived formulae, $\mathrm{BIC}_{E1}$ and $\mathrm{BIC}_{E2}$, adhere to empirical values via numerical demonstration and that $\mathrm{BIC}_{E}$ ($E$ indicating either $E1$ or $E2$) is the best global selection criterion, as it performs at least as well as BIC with the total sample size and BIC with the number of clusters across various multi-level conditions through a simulation study. In addition, the use of $\mathrm{BIC}_{E1}$ is illustrated with a textbook example dataset.
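To make the sample-size ambiguity concrete, here is a minimal Python sketch. It is not the paper's formula: the abstract does not say what multiplies the log of the average cluster size in part (a), so `weight_a` below is a hypothetical placeholder, and the log-likelihood, parameter count and sample sizes are made-up numbers. The sketch relies only on the identity log N = log(N/J) + log J, which shows that the two conventional choices (total sample size N, number of clusters J) are the two extremes of a two-part penalty of this shape.

```python
import math


def bic(loglik: float, n_params: int, sample_size: float) -> float:
    """Conventional BIC: -2 log L + k * log(sample_size)."""
    return -2.0 * loglik + n_params * math.log(sample_size)


def two_part_bic(loglik: float, n_params: int, weight_a: float,
                 n_total: int, n_clusters: int) -> float:
    """Two-part penalty shaped like the abstract's description.

    (a) weight_a * log(average cluster size)  -- weight_a is HYPOTHETICAL;
        the abstract does not give this coefficient.
    (b) n_params * log(number of clusters)
    """
    avg_cluster_size = n_total / n_clusters
    penalty = (weight_a * math.log(avg_cluster_size)
               + n_params * math.log(n_clusters))
    return -2.0 * loglik + penalty


# Made-up illustration values (not from the paper).
loglik = -1234.5   # maximized log-likelihood
k = 6              # total number of parameters
N = 2000           # total number of level-1 observations
J = 100            # number of clusters

bic_total = bic(loglik, k, N)       # BIC using the total sample size
bic_clusters = bic(loglik, k, J)    # BIC using the number of clusters
bic_mid = two_part_bic(loglik, k, 3, N, J)  # arbitrary intermediate weight

print(bic_total, bic_clusters, bic_mid)
```

Setting `weight_a = k` recovers the total-sample-size BIC and `weight_a = 0` recovers the number-of-clusters BIC; any weight in between gives a penalty between the two, which is one way to read why the choice of sample size matters for level-specific selection.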


Source journal

British Journal of Mathematical & Statistical Psychology (Medicine: Mathematics, Interdisciplinary Applications)

CiteScore: 5.00

Self-citation rate: 3.80%

Review time: >12 weeks

About the journal: The British Journal of Mathematical and Statistical Psychology publishes articles relating to areas of psychology which have a greater mathematical or statistical aspect of their argument than is usually acceptable to other journals, including: mathematical psychology; statistics; psychometrics; decision making; psychophysics; classification; and relevant areas of mathematics, computing and computer software. These include articles that address substantive psychological issues or that develop and extend techniques useful to psychologists. New models for psychological processes, new approaches to existing data, critiques of existing models and improved algorithms for estimating the parameters of a model are examples of articles which may be favoured.