BayesSUR:一个用于线性回归中高维多元贝叶斯变量和协方差选择的R包

IF 5.4 2区 计算机科学 Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS
Zhi Zhao, Marco Banterle, L. Bottolo, S. Richardson, A. Lewin, M. Zucknick
{"title":"BayesSUR:一个用于线性回归中高维多元贝叶斯变量和协方差选择的R包","authors":"Zhi Zhao, Marco Banterle, L. Bottolo, S. Richardson, A. Lewin, M. Zucknick","doi":"10.18637/jss.v100.i11","DOIUrl":null,"url":null,"abstract":"In molecular biology, advances in high-throughput technologies have made it possible to study complex multivariate phenotypes and their simultaneous associations with highdimensional genomic and other omics data, a problem that can be studied with highdimensional multi-response regression, where the response variables are potentially highly correlated. To this purpose, we recently introduced several multivariate Bayesian variable and covariance selection models, e.g., Bayesian estimation methods for sparse seemingly unrelated regression for variable and covariance selection. Several variable selection priors have been implemented in this context, in particular the hotspot detection prior for latent variable inclusion indicators, which results in sparse variable selection for associations between predictors and multiple phenotypes. Here, we also propose an alternative, which uses a Markov random field (MRF) prior for incorporating prior knowledge about the dependence structure of the inclusion indicators. Inference of Bayesian seemingly unrelated regression (SUR) by Markov chain Monte Carlo methods is made computationally feasible by factorisation of the covariance matrix amongst the response variables. In this paper we present BayesSUR, an R package, which allows the user to easily specify and run a range of different Bayesian SUR models, which have been implemented in C++ for computational efficiency. The R package allows the specification of the models in a modular way, where the user chooses the priors for variable selection and for covariance selection separately. We demonstrate the performance of sparse SUR models with the hotspot prior and spike-and-slab MRF prior on synthetic and real data sets representing eQTL or mQTL studies and in vitro anti-cancer drug screening studies as examples for typical applications.","PeriodicalId":17237,"journal":{"name":"Journal of Statistical Software","volume":null,"pages":null},"PeriodicalIF":5.4000,"publicationDate":"2021-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":"{\"title\":\"BayesSUR: An R Package for High-Dimensional Multivariate Bayesian Variable and Covariance Selection in Linear Regression\",\"authors\":\"Zhi Zhao, Marco Banterle, L. Bottolo, S. Richardson, A. Lewin, M. Zucknick\",\"doi\":\"10.18637/jss.v100.i11\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In molecular biology, advances in high-throughput technologies have made it possible to study complex multivariate phenotypes and their simultaneous associations with highdimensional genomic and other omics data, a problem that can be studied with highdimensional multi-response regression, where the response variables are potentially highly correlated. To this purpose, we recently introduced several multivariate Bayesian variable and covariance selection models, e.g., Bayesian estimation methods for sparse seemingly unrelated regression for variable and covariance selection. Several variable selection priors have been implemented in this context, in particular the hotspot detection prior for latent variable inclusion indicators, which results in sparse variable selection for associations between predictors and multiple phenotypes. Here, we also propose an alternative, which uses a Markov random field (MRF) prior for incorporating prior knowledge about the dependence structure of the inclusion indicators. Inference of Bayesian seemingly unrelated regression (SUR) by Markov chain Monte Carlo methods is made computationally feasible by factorisation of the covariance matrix amongst the response variables. In this paper we present BayesSUR, an R package, which allows the user to easily specify and run a range of different Bayesian SUR models, which have been implemented in C++ for computational efficiency. The R package allows the specification of the models in a modular way, where the user chooses the priors for variable selection and for covariance selection separately. We demonstrate the performance of sparse SUR models with the hotspot prior and spike-and-slab MRF prior on synthetic and real data sets representing eQTL or mQTL studies and in vitro anti-cancer drug screening studies as examples for typical applications.\",\"PeriodicalId\":17237,\"journal\":{\"name\":\"Journal of Statistical Software\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":5.4000,\"publicationDate\":\"2021-04-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"8\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Statistical Software\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.18637/jss.v100.i11\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Statistical Software","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.18637/jss.v100.i11","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
引用次数: 8

摘要

在分子生物学中,高通量技术的进步使得研究复杂的多变量表型及其与高维基因组和其他组学数据的同时关联成为可能,这一问题可以通过高维多响应回归来研究,其中响应变量可能高度相关。为此,我们最近介绍了几种多变量贝叶斯变量和协方差选择模型,例如用于变量和协方差选择的稀疏看似不相关回归的贝叶斯估计方法。在此背景下,已经实现了几个变量选择先验,特别是潜在变量包含指标的热点检测先验,这导致预测因子与多种表型之间关联的变量选择稀疏。在这里,我们还提出了一种替代方法,该方法使用马尔可夫随机场(MRF)先验来结合关于包含指标依赖结构的先验知识。通过对响应变量间的协方差矩阵进行因式分解,使马尔可夫链蒙特卡罗方法对贝叶斯似不相关回归(SUR)的推断在计算上可行。在本文中,我们介绍了BayesSUR,一个R包,它允许用户轻松地指定和运行一系列不同的贝叶斯SUR模型,这些模型已在c++中实现,以提高计算效率。R包允许以模块化的方式规范模型,其中用户分别选择变量选择和协方差选择的先验。我们以典型应用为例,在代表eQTL或mQTL研究和体外抗癌药物筛选研究的合成和真实数据集上,展示了具有热点先验和峰板MRF先验的稀疏SUR模型的性能。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
BayesSUR: An R Package for High-Dimensional Multivariate Bayesian Variable and Covariance Selection in Linear Regression
In molecular biology, advances in high-throughput technologies have made it possible to study complex multivariate phenotypes and their simultaneous associations with highdimensional genomic and other omics data, a problem that can be studied with highdimensional multi-response regression, where the response variables are potentially highly correlated. To this purpose, we recently introduced several multivariate Bayesian variable and covariance selection models, e.g., Bayesian estimation methods for sparse seemingly unrelated regression for variable and covariance selection. Several variable selection priors have been implemented in this context, in particular the hotspot detection prior for latent variable inclusion indicators, which results in sparse variable selection for associations between predictors and multiple phenotypes. Here, we also propose an alternative, which uses a Markov random field (MRF) prior for incorporating prior knowledge about the dependence structure of the inclusion indicators. Inference of Bayesian seemingly unrelated regression (SUR) by Markov chain Monte Carlo methods is made computationally feasible by factorisation of the covariance matrix amongst the response variables. In this paper we present BayesSUR, an R package, which allows the user to easily specify and run a range of different Bayesian SUR models, which have been implemented in C++ for computational efficiency. The R package allows the specification of the models in a modular way, where the user chooses the priors for variable selection and for covariance selection separately. We demonstrate the performance of sparse SUR models with the hotspot prior and spike-and-slab MRF prior on synthetic and real data sets representing eQTL or mQTL studies and in vitro anti-cancer drug screening studies as examples for typical applications.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Journal of Statistical Software
Journal of Statistical Software 工程技术-计算机:跨学科应用
CiteScore
10.70
自引率
1.70%
发文量
40
审稿时长
6-12 weeks
期刊介绍: The Journal of Statistical Software (JSS) publishes open-source software and corresponding reproducible articles discussing all aspects of the design, implementation, documentation, application, evaluation, comparison, maintainance and distribution of software dedicated to improvement of state-of-the-art in statistical computing in all areas of empirical research. Open-source code and articles are jointly reviewed and published in this journal and should be accessible to a broad community of practitioners, teachers, and researchers in the field of statistics.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信