{"title":"Biclustering via Semiparametric Bayesian Inference","authors":"Alejandro Murua, F. Quintana","doi":"10.1214/21-ba1284","DOIUrl":"https://doi.org/10.1214/21-ba1284","url":null,"abstract":"Motivated by classes of problems frequently found in the analysis of gene expression data, we propose a semiparametric Bayesian model to detect biclusters, that is, subsets of individuals sharing similar patterns over a set of conditions. Our approach is based on the well-known plaid model by Lazzeroni and Owen (2002). By assuming a truncated stick-breaking prior we also find the number of biclusters present in the data as part of the inference. Evidence from a simulation study shows that the model is capable of correctly detecting biclusters and performs well compared to some competing approaches. The flexibility of the proposed prior is demonstrated with applications to the analysis of gene expression data (continuous responses) and histone modifications data (count responses).","PeriodicalId":55398,"journal":{"name":"Bayesian Analysis","volume":" ","pages":""},"PeriodicalIF":4.4,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48464947","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Error Control of the Numerical Posterior with Bayes Factors in Bayesian Uncertainty Quantification","authors":"Marcos A. Capistrán, J. Christen, M. Daza-Torres, Hugo Flores-Arguedas, J. Montesinos-López","doi":"10.1214/20-ba1255","DOIUrl":"https://doi.org/10.1214/20-ba1255","url":null,"abstract":"In this paper, we address the numerical posterior error control problem for the Bayesian approach to inverse problems, recently known as Bayesian Uncertainty Quantification (UQ). We generalize the results of Capistrán et al. (2016) to (a priori) expected Bayes factors (BF) and to a more general, infinite-dimensional setting. In this inverse problem, the unavoidable numerical approximation of the Forward Map (FM, i.e., the regressor function), arising from the numerical solution of a system of differential equations, demands error estimates of the corresponding approximate numerical posterior distribution. Our approach is to make such comparisons in the setting of Bayesian model selection and BFs. The main result of this paper is a bound on the absolute global error tolerated by the numerical solver of the FM in order to keep the BF of the numerical versus the theoretical posterior near one. For two examples, we provide a detailed analysis of the computation and implementation of the introduced bound. Furthermore, we show that the resulting numerical posterior turns out to be nearly identical to the theoretical posterior, given the control of the BF near one.","PeriodicalId":55398,"journal":{"name":"Bayesian Analysis","volume":" ","pages":""},"PeriodicalIF":4.4,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42140508","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimal Shrinkage Estimation of Predictive Densities Under α-Divergences","authors":"E. George, Gourab Mukherjee, Keisuke Yano","doi":"10.1214/21-BA1264","DOIUrl":"https://doi.org/10.1214/21-BA1264","url":null,"abstract":"We consider the problem of estimating the predictive density in a heteroskedastic Gaussian model under general divergence loss. Based on a conjugate hierarchical set-up, we consider generic classes of shrinkage predictive densities that are governed by location and scale hyper-parameters. For any α-divergence loss, we propose a risk-estimation based methodology for tuning these shrinkage hyper-parameters. Our proposed predictive density estimators enjoy optimal asymptotic risk properties that are in concordance with the optimal shrinkage calibration point estimation results established by Xie, Kou, and Brown (2012) for heteroskedastic hierarchical models. These α-divergence risk optimality properties of our proposed predictors are not shared by empirical Bayes predictive density estimators that are calibrated by traditional methods such as maximum likelihood and method of moments. We conduct several numerical studies to compare the non-asymptotic performance of our proposed predictive density estimators with other competing methods and obtain encouraging results. MSC2020 subject classifications: Primary 62L20; secondary 60F15, 60G42.","PeriodicalId":55398,"journal":{"name":"Bayesian Analysis","volume":" ","pages":""},"PeriodicalIF":4.4,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46644095","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Perfect Sampling of the Posterior in the Hierarchical Pitman-Yor Process.","authors":"S. Bacallado, S. Favaro, Samuel Power, L. Trippa","doi":"10.1214/21-BA1269","DOIUrl":"https://doi.org/10.1214/21-BA1269","url":null,"abstract":"The predictive probabilities of the hierarchical Pitman-Yor process are approximated through Monte Carlo algorithms that exploit the Chinese Restaurant Franchise (CRF) representation. However, in order to simulate the posterior distribution of the hierarchical Pitman-Yor process, a set of auxiliary variables representing the arrangement of customers in tables of the CRF must be sampled through Markov chain Monte Carlo. This paper develops a perfect sampler for these latent variables employing ideas from the Propp-Wilson algorithm and evaluates its average running time by extensive simulations. The simulations reveal a significant dependence of running time on the parameters of the model, which exhibits sharp transitions. The algorithm is compared to simpler Gibbs sampling procedures, as well as a procedure for unbiased Monte Carlo estimation proposed by Glynn and Rhee. We illustrate its use with an example in microbial genomics studies.","PeriodicalId":55398,"journal":{"name":"Bayesian Analysis","volume":"129 1","pages":"685-709"},"PeriodicalIF":4.4,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"66085791","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On a Dirichlet Process Mixture Representation of Phase-Type Distributions","authors":"Daniel Ayala, Leonardo Jofré, Luis Gutiérrez, R. H. Mena","doi":"10.1214/21-BA1272","DOIUrl":"https://doi.org/10.1214/21-BA1272","url":null,"abstract":"An explicit representation of phase-type distributions as an infinite mixture of Erlang distributions is introduced. The representation unveils a novel and useful connection between a class of Bayesian nonparametric mixture models and phase-type distributions. In particular, this sheds some light on two hot topics: estimation techniques for phase-type distributions and the availability of closed-form expressions for some functionals related to Dirichlet process mixture models. The power of this connection is illustrated via a posterior inference algorithm to estimate phase-type distributions, avoiding some difficulties with the simulation of latent Markov jump processes, commonly encountered in phase-type Bayesian inference. On the other hand, closed-form expressions for functionals of Dirichlet process mixture models are illustrated with density and renewal function estimation, related to the optimal salmon weight distribution of an aquaculture study.","PeriodicalId":55398,"journal":{"name":"Bayesian Analysis","volume":"-1 1","pages":""},"PeriodicalIF":4.4,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"66085947","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian Nonstationary and Nonparametric Covariance Estimation for Large Spatial Data","authors":"Brian Kidd, M. Katzfuss","doi":"10.1214/21-ba1273","DOIUrl":"https://doi.org/10.1214/21-ba1273","url":null,"abstract":"In spatial statistics, it is often assumed that the spatial field of interest is stationary and its covariance has a simple parametric form, but these assumptions are not appropriate in many applications. Given replicate observations of a Gaussian spatial field, we propose nonstationary and nonparametric Bayesian inference on the spatial dependence. Instead of estimating the quadratic (in the number of spatial locations) entries of the covariance matrix, the idea is to infer a near-linear number of nonzero entries in a sparse Cholesky factor of the precision matrix. Our prior assumptions are motivated by recent results on the exponential decay of the entries of this Cholesky factor for Matérn-type covariances under a specific ordering scheme. Our methods are highly scalable and parallelizable. We conduct numerical comparisons and apply our methodology to climate-model output, enabling statistical emulation of an expensive physical model.","PeriodicalId":55398,"journal":{"name":"Bayesian Analysis","volume":" ","pages":""},"PeriodicalIF":4.4,"publicationDate":"2020-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46843087","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Dirichlet Process Mixture Model for Non-Ignorable Dropout","authors":"Camille M. Moore, N. Carlson, S. MaWhinney, S. Kreidler","doi":"10.1214/19-ba1181","DOIUrl":"https://doi.org/10.1214/19-ba1181","url":null,"abstract":"Longitudinal cohorts are a valuable resource for studying HIV disease progression; however, dropout is common in these studies. Subjects often fail to return for visits due to disease progression, loss to follow-up, or death. When dropout depends on unobserved outcomes, data are missing not at random, and results from standard longitudinal data analyses can be biased. Several methods have been proposed to adjust for non-ignorable dropout; however, many of these approaches rely on parametric assumptions about the distribution of dropout times and the functional form of the relationship between the outcome and dropout time. More flexible approaches may be needed when the distribution of dropout times does not follow a known distribution or violates proportional hazards assumptions, or when the relationship between the outcome and dropout times does not have a simple polynomial form. We propose a Bayesian semi-parametric Dirichlet process mixture model to flexibly model the relationship between dropout time and the outcome and show that more accurate inference can be obtained by non-parametrically modeling the distribution of subject-specific effects as well as the distribution of dropout times. Results from simulation studies as well as an application to a longitudinal HIV cohort study database illustrate the strengths of our Bayesian semi-parametric approach.","PeriodicalId":55398,"journal":{"name":"Bayesian Analysis","volume":" ","pages":""},"PeriodicalIF":4.4,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49479134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Calibrating Expert Assessments Using Hierarchical Gaussian Process Models","authors":"T. Perälä, J. Vanhatalo, A. Chrysafi","doi":"10.1214/19-ba1180","DOIUrl":"https://doi.org/10.1214/19-ba1180","url":null,"abstract":"Expert assessments are routinely used to inform management and other decision making. However, these assessments often contain considerable biases and uncertainties, for which reason they should be calibrated if possible. Moreover, coherently combining multiple expert assessments into one estimate poses a long-standing problem in statistics since modeling expert knowledge is often difficult. Here, we present a hierarchical Bayesian model for expert calibration in a task of estimating a continuous univariate parameter. The model allows experts’ biases to vary as a function of the true value of the parameter and according to the expert’s background. We follow the fully Bayesian approach (the so-called supra-Bayesian approach) and model experts’ bias functions explicitly using hierarchical Gaussian processes. We show how to use calibration data to infer the experts’ observation models with the use of bias functions and to calculate the bias corrected posterior distributions for an unknown system parameter of interest. We demonstrate and test our model and methods with simulated data and a real case study on data-limited fisheries stock assessment. The case study results show that experts’ biases vary with respect to the true system parameter value and that the calibration of the expert assessments improves the inference compared to using uncalibrated expert assessments or a vague uniform guess. Moreover, the bias functions in the real case study show important differences between the reliability of alternative experts. The model and methods presented here can also be applied straightforwardly to applications other than our case study.","PeriodicalId":55398,"journal":{"name":"Bayesian Analysis","volume":" ","pages":""},"PeriodicalIF":4.4,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46512597","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Nested Adaptation of MCMC Algorithms","authors":"D. Nguyen, P. Valpine, Y. Atchadé, Daniel Turek, Nick Michaud, C. Paciorek","doi":"10.1214/19-ba1190","DOIUrl":"https://doi.org/10.1214/19-ba1190","url":null,"abstract":"Markov chain Monte Carlo (MCMC) methods are ubiquitous tools for simulation-based inference in many fields but designing and identifying good MCMC samplers is still an open question. This paper introduces a novel MCMC algorithm, namely, Nested Adaptation MCMC. For sampling variables or blocks of variables, we use two levels of adaptation where the inner adaptation optimizes the MCMC performance within each sampler, while the outer adaptation explores the space of valid kernels to find the optimal samplers. We provide a theoretical foundation for our approach. To show the generality and usefulness of the approach, we describe a framework using only standard MCMC samplers as candidate samplers and some adaptation schemes for both inner and outer iterations. In several benchmark problems, we show that our proposed approach substantially outperforms other approaches, including an automatic blocking algorithm, in terms of MCMC efficiency and computational time.","PeriodicalId":55398,"journal":{"name":"Bayesian Analysis","volume":" ","pages":""},"PeriodicalIF":4.4,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48723393","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bayesian Analysis. Pub Date: 2020-12-01. Epub Date: 2019-11-04. DOI: 10.1214/19-ba1173
{"title":"Flexible Bayesian Dynamic Modeling of Correlation and Covariance Matrices.","authors":"Shiwei Lan, Andrew Holbrook, Gabriel A Elias, Norbert J Fortin, Hernando Ombao, Babak Shahbaba","doi":"10.1214/19-ba1173","DOIUrl":"10.1214/19-ba1173","url":null,"abstract":"Modeling correlation (and covariance) matrices can be challenging due to the positive-definiteness constraint and potential high-dimensionality. Our approach is to decompose the covariance matrix into the correlation and variance matrices and propose a novel Bayesian framework based on modeling the correlations as products of unit vectors. By specifying a wide range of distributions on a sphere (e.g. the squared-Dirichlet distribution), the proposed approach induces flexible prior distributions for covariance matrices (that go beyond the commonly used inverse-Wishart prior). For modeling real-life spatio-temporal processes with complex dependence structures, we extend our method to dynamic cases and introduce unit-vector Gaussian process priors in order to capture the evolution of correlation among components of a multivariate time series. To handle the intractability of the resulting posterior, we introduce the adaptive Δ-Spherical Hamiltonian Monte Carlo. We demonstrate the validity and flexibility of our proposed framework in a simulation study of periodic processes and an analysis of rat's local field potential activity in a complex sequence memory task.","PeriodicalId":55398,"journal":{"name":"Bayesian Analysis","volume":"15 4","pages":"1199-1228"},"PeriodicalIF":4.4,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8048134/pdf/nihms-1059273.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"38884999","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}