Intuitive joint priors for Bayesian linear multilevel models: The R2D2M2 prior

Impact factor: 1.0 · Zone 4 (Mathematics) · JCR Q3, Statistics & Probability

Electronic Journal of Statistics · Pub Date: 2022-08-15 · DOI: 10.1214/23-ejs2136

Javier Enrique Aguilar, Paul-Christian Bürkner

Citations: 11

Abstract

The training of high-dimensional regression models on comparably sparse data is an important yet complicated topic, especially when there are many more model parameters than observations in the data. From a Bayesian perspective, inference in such cases can be achieved with the help of shrinkage prior distributions, at least for generalized linear models. However, real-world data usually possess multilevel structures, such as repeated measurements or natural groupings of individuals, which existing shrinkage priors are not built to deal with. We generalize and extend one of these priors, the R2D2 prior by Zhang et al. (2020), to linear multilevel models leading to what we call the R2D2M2 prior. The proposed prior enables both local and global shrinkage of the model parameters. It comes with interpretable hyperparameters, which we show to be intrinsically related to vital properties of the prior, such as rates of concentration around the origin, tail behavior, and amount of shrinkage the prior exerts. We offer guidelines on how to select the prior's hyperparameters by deriving shrinkage factors and measuring the effective number of non-zero model coefficients. Hence, the user can readily evaluate and interpret the amount of shrinkage implied by a specific choice of hyperparameters. Finally, we perform extensive experiments on simulated and real data, showing that our inference procedure for the prior is well calibrated, has desirable global and local regularization properties and enables the reliable and interpretable estimation of much more complex Bayesian multilevel models than was previously possible.
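The abstract mentions local and global shrinkage, shrinkage factors, and an effective number of non-zero coefficients. The following is a minimal sketch of those ideas for a plain (non-multilevel) linear regression, not the authors' implementation. It assumes a Beta prior on R² parameterised by mean and precision, a symmetric Dirichlet split of the explained variance, normal conditional priors on the coefficients, and standardized, orthogonal predictors with X'X = nI. The function name `draw_r2d2_style_prior` and the arguments `mean_r2`, `prec_r2`, `alpha`, `sigma`, and `n` are hypothetical names chosen for illustration; the paper's multilevel terms are not reproduced here.

```python
import random


def draw_r2d2_style_prior(p, n, mean_r2=0.5, prec_r2=2.0, alpha=0.5,
                          sigma=1.0, seed=None):
    """Return one prior draw from an R2D2-style construction (illustrative)."""
    rng = random.Random(seed)

    # Global part: Beta prior on R^2 (mean/precision form is an assumption).
    a = mean_r2 * prec_r2
    b = (1.0 - mean_r2) * prec_r2
    r2 = min(rng.betavariate(a, b), 1.0 - 1e-12)  # guard against r2 == 1
    tau2 = r2 / (1.0 - r2)  # explained-to-residual variance ratio

    # Local part: symmetric Dirichlet(alpha) weights via normalised gammas.
    gammas = [rng.gammavariate(alpha, 1.0) for _ in range(p)]
    total = sum(gammas)
    phi = [g / total for g in gammas]

    # Coefficients: beta_j ~ Normal(0, sigma^2 * phi_j * tau2) (assumed form).
    beta = [rng.gauss(0.0, sigma * (f * tau2) ** 0.5) for f in phi]

    # Shrinkage factors under X'X = n*I: the posterior mean of beta_j is
    # (1 - kappa_j) times the least-squares estimate.
    kappa = [1.0 / (1.0 + n * f * tau2) for f in phi]
    m_eff = sum(1.0 - k for k in kappa)  # effective number of non-zero terms

    return {"r2": r2, "tau2": tau2, "phi": phi, "beta": beta,
            "kappa": kappa, "m_eff": m_eff}


if __name__ == "__main__":
    draw = draw_r2d2_style_prior(p=10, n=50, seed=1)
    print("R2:", round(draw["r2"], 3), "m_eff:", round(draw["m_eff"], 2))
```

Averaging `m_eff` over many draws gives a rough sense of how much shrinkage a given hyperparameter choice implies before any data are seen. In this sketch, a smaller `alpha` tends to concentrate the Dirichlet weights on fewer coefficients.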

Source journal

Electronic Journal of Statistics (Statistics & Probability)

CiteScore: 1.80
Self-citation rate: 9.10%
Articles published: 100
Review time: 3 months

About the journal: The Electronic Journal of Statistics (EJS) publishes research articles and short notes on theoretical, computational and applied statistics. The journal is open access. Articles are refereed and are held to the same standard as articles in other IMS journals. Articles become publicly available shortly after they are accepted.