P. Mikkola, Osvaldo A. Martin, Suyog H. Chandramouli, M. Hartmann, O. A. Pla, Owen Thomas, Henri Pesonen, J. Corander, Aki Vehtari, Samuel Kaski, Paul-Christian Burkner, Arto Klami
{"title":"Prior Knowledge Elicitation: The Past, Present, and Future","authors":"P. Mikkola, Osvaldo A. Martin, Suyog H. Chandramouli, M. Hartmann, O. A. Pla, Owen Thomas, Henri Pesonen, J. Corander, Aki Vehtari, Samuel Kaski, Paul-Christian Burkner, Arto Klami","doi":"10.1214/23-ba1381","DOIUrl":"https://doi.org/10.1214/23-ba1381","url":null,"abstract":"Specification of the prior distribution for a Bayesian model is a central part of the Bayesian workflow for data analysis, but it is often difficult even for statistical experts. In principle, prior elicitation transforms domain knowledge of various kinds into well-defined prior distributions, and offers a solution to the prior specification problem. In practice, however, we are still fairly far from having usable prior elicitation tools that could significantly influence the way we build probabilistic models in academia and industry. We lack elicitation methods that integrate well into the Bayesian workflow and perform elicitation efficiently in terms of costs of time and effort. We even lack a comprehensive theoretical framework for understanding different facets of the prior elicitation problem. Why are we not widely using prior elicitation? We analyse the state of the art by identifying a range of key aspects of prior knowledge elicitation, from properties of the modelling task and the nature of the priors to the form of interaction with the expert. The existing prior elicitation literature is reviewed and categorized in these terms. 
This allows recognizing under-studied directions in prior elicitation research, finally leading to a proposal of several new avenues to improve prior elicitation methodology.","PeriodicalId":55398,"journal":{"name":"Bayesian Analysis","volume":null,"pages":null},"PeriodicalIF":4.4,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46984282","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Warped Dynamic Linear Models for Time Series of Counts","authors":"Brian King, Daniel R. Kowal","doi":"10.1214/23-BA1394","DOIUrl":"https://doi.org/10.1214/23-BA1394","url":null,"abstract":"Dynamic Linear Models (DLMs) are commonly employed for time series analysis due to their versatile structure, simple recursive updating, ability to handle missing data, and probabilistic forecasting. However, the options for count time series are limited: Gaussian DLMs require continuous data, while Poisson-based alternatives often lack sufficient modeling flexibility. We introduce a novel semiparametric methodology for count time series by warping a Gaussian DLM. The warping function has two components: a (nonparametric) transformation operator that provides distributional flexibility and a rounding operator that ensures the correct support for the discrete data-generating process. We develop conjugate inference for the warped DLM, which enables analytic and recursive updates for the state space filtering and smoothing distributions. We leverage these results to produce customized and efficient algorithms for inference and forecasting, including Monte Carlo simulation for offline analysis and an optimal particle filter for online inference. This framework unifies and extends a variety of discrete time series models and is valid for natural counts, rounded values, and multivariate observations. Simulation studies illustrate the excellent forecasting capabilities of the warped DLM. 
The proposed approach is applied to a multivariate time series of daily overdose counts and demonstrates both modeling and computational successes.","PeriodicalId":55398,"journal":{"name":"Bayesian Analysis","volume":null,"pages":null},"PeriodicalIF":4.4,"publicationDate":"2021-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44928377","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Bayesian Analysis of Two-Stage Randomized Experiments in the Presence of Interference, Treatment Nonadherence, and Missing Outcomes","authors":"Yuki Ohnishi, Arman Sabbaghi","doi":"10.1214/22-BA1347","DOIUrl":"https://doi.org/10.1214/22-BA1347","url":null,"abstract":"Three critical issues for causal inference that often occur in modern, complicated experiments are interference, treatment nonadherence, and missing outcomes. A great deal of research effort has been dedicated to developing causal inferential methodologies that address these issues separately. However, methodologies that can address these issues simultaneously are lacking. We propose a Bayesian causal inference methodology to address this gap. Our methodology extends existing causal frameworks and methods, specifically, two-stage randomized experiments and the principal stratification framework. In contrast to existing methods that invoke strong structural assumptions to identify principal causal effects, our Bayesian approach uses flexible distributional models that can accommodate the complexities of interference and missing outcomes, and that ensure that principal causal effects are weakly identifiable. We illustrate our methodology via simulation studies and a re-analysis of real-life data from an evaluation of India's National Health Insurance Program. Our methodology enables us to identify new active causal effects that were not identified in past analyses. 
Ultimately, our simulation studies and case study demonstrate how our methodology can yield more informative analyses in modern experiments with interference, treatment nonadherence, missing outcomes, and complicated outcome generation mechanisms.","PeriodicalId":55398,"journal":{"name":"Bayesian Analysis","volume":null,"pages":null},"PeriodicalIF":4.4,"publicationDate":"2021-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48532797","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robustness Against Conflicting Prior Information in Regression","authors":"Philippe Gagnon","doi":"10.1214/22-BA1330","DOIUrl":"https://doi.org/10.1214/22-BA1330","url":null,"abstract":"Including prior information about model parameters is a fundamental step of any Bayesian statistical analysis. It is viewed positively by some as it allows, among other things, the quantitative incorporation of expert opinion about model parameters. It is viewed negatively by others because it sets the stage for subjectivity in statistical analysis. Certainly, it creates problems when the inference is skewed due to a conflict with the data collected. According to the theory of conflict resolution (O'Hagan and Pericchi, 2012), a solution to such problems is to diminish the impact of conflicting prior information, yielding inference consistent with the data. This is typically achieved by using heavy-tailed priors. We study both theoretically and numerically the efficacy of such a solution in a regression framework where the prior information about the coefficients takes the form of a product of density functions with known location and scale parameters. We study functions with regularly varying tails (Student distributions), log-regularly-varying tails (as introduced in Desgagné (2015)), and propose functions with slower tail decays that allow any conflict that can happen under that regression framework to be resolved, contrary to the two previous types of functions. 
The code to reproduce all numerical experiments is available online.","PeriodicalId":55398,"journal":{"name":"Bayesian Analysis","volume":null,"pages":null},"PeriodicalIF":4.4,"publicationDate":"2021-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42726958","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
P. Greengard, J. Hoskins, Charles C. Margossian, A. Gelman, Aki Vehtari
{"title":"Fast Methods for Posterior Inference of Two-Group Normal-Normal Models","authors":"P. Greengard, J. Hoskins, Charles C. Margossian, A. Gelman, Aki Vehtari","doi":"10.1214/22-ba1329","DOIUrl":"https://doi.org/10.1214/22-ba1329","url":null,"abstract":"We describe a class of algorithms for evaluating posterior moments of certain Bayesian linear regression models with a normal likelihood and a normal prior on the regression coefficients. The proposed methods can be used for hierarchical mixed effects models with partial pooling over one group of predictors, as well as random effects models with partial pooling over two groups of predictors. We demonstrate the performance of the methods on two applications, one involving U.S. opinion polls and one involving the modeling of COVID-19 outbreaks in Israel using survey data. The algorithms involve analytical marginalization of regression coefficients followed by numerical integration of the remaining low-dimensional density. The dominant cost of the algorithms is an eigendecomposition computed once for each value of the outside parameter of integration. Our approach drastically reduces run times compared to state-of-the-art Markov chain Monte Carlo (MCMC) algorithms. The latter, in addition to being computationally expensive, can also be difficult to tune when applied to hierarchical models.","PeriodicalId":55398,"journal":{"name":"Bayesian Analysis","volume":null,"pages":null},"PeriodicalIF":4.4,"publicationDate":"2021-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46750531","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scalable Bayesian High-dimensional Local Dependence Learning","authors":"Kyoungjae Lee, Lizhen Lin","doi":"10.1214/21-ba1299","DOIUrl":"https://doi.org/10.1214/21-ba1299","url":null,"abstract":"In this work, we propose a scalable Bayesian procedure for learning the local dependence structure in a high-dimensional model where the variables possess a natural ordering. The ordering of variables can be indexed by time, the vicinities of spatial locations, and so on, with the natural assumption that variables far apart tend to have weak correlations. Applications of such models abound in a variety of fields such as finance, genome associations analysis and spatial modeling. We adopt a flexible framework under which each variable is dependent on its neighbors or predecessors, and the neighborhood size can vary for each variable. It is of great interest to reveal this local dependence structure by estimating the covariance or precision matrix while yielding a consistent estimate of the varying neighborhood size for each variable. The existing literature on banded covariance matrix estimation, which assumes a fixed bandwidth, cannot be adapted for this general setup. We employ the modified Cholesky decomposition for the precision matrix and design a flexible prior for this model through appropriate priors on the neighborhood sizes and Cholesky factors. The posterior contraction rates of the Cholesky factor are derived, which are nearly or exactly minimax optimal, and our procedure leads to consistent estimates of the neighborhood size for all the variables. Another appealing feature of our procedure is its scalability to models with large numbers of variables due to efficient posterior inference without resorting to MCMC algorithms. Numerical comparisons are carried out with competitive methods, and applications are considered for some real datasets. 
Bayesian procedure for high-dimensional local dependence learning, where variables close to each other are more likely to be correlated. The proposed prior, LANCE prior, allows an exact computation of posteriors, which enables scalable inference even in high-dimensional settings. Furthermore, it provides a scalable Bayesian cross-validation to choose the hyperparameters. We establish selection consistency for the local dependence structure and posterior convergence rates for the Cholesky factor. The required conditions for these theoretical results are significantly weakened compared with the existing literature. Simulation studies in various settings show that LANCE prior outperforms other contenders in terms of the ROC curve, cross-validation-based analysis and computation time. Two real data analyses based on the phone call center and gun point data illustrate the satisfactory performance of the proposed method in linear prediction and classification problems, respectively.","PeriodicalId":55398,"journal":{"name":"Bayesian Analysis","volume":null,"pages":null},"PeriodicalIF":4.4,"publicationDate":"2021-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41603090","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Grid-Uniform Copulas and Rectangle Exchanges: Bayesian Model and Inference for a Rich Class of Copula Functions","authors":"Nicolás Kuschinski, A. Jara","doi":"10.1214/23-ba1396","DOIUrl":"https://doi.org/10.1214/23-ba1396","url":null,"abstract":"Copula-based models provide a great deal of flexibility in modelling multivariate distributions, allowing for the specifications of models for the marginal distributions separately from the dependence structure (copula) that links them to form a joint distribution. Choosing a class of copula models is not a trivial task and its misspecification can lead to wrong conclusions. We introduce a novel class of grid-uniform copula functions, which is dense in the space of all continuous copula functions in a Hellinger sense. We propose a Bayesian model based on this class and develop an automatic Markov chain Monte Carlo algorithm for exploring the corresponding posterior distribution. The methodology is illustrated by means of simulated data and compared to the main existing approach.","PeriodicalId":55398,"journal":{"name":"Bayesian Analysis","volume":null,"pages":null},"PeriodicalIF":4.4,"publicationDate":"2021-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47829097","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bayesian Analysis. Pub Date: 2021-09-01. Epub Date: 2020-07-31. DOI: 10.1214/20-ba1229
Thomas A Murray, Peter F Thall, Frederique Schortgen, Pierre Asfar, Sarah Zohar, Sandrine Katsahian
{"title":"Robust Adaptive Incorporation of Historical Control Data in a Randomized Trial of External Cooling to Treat Septic Shock.","authors":"Thomas A Murray, Peter F Thall, Frederique Schortgen, Pierre Asfar, Sarah Zohar, Sandrine Katsahian","doi":"10.1214/20-ba1229","DOIUrl":"https://doi.org/10.1214/20-ba1229","url":null,"abstract":"This paper proposes a randomized controlled clinical trial design to evaluate external cooling as a means to control fever and thereby reduce mortality in patients with septic shock. The trial will include concurrent external cooling and control arms while adaptively incorporating historical control arm data. Bayesian group sequential monitoring will be done using a posterior comparative test based on the 60-day survival distribution in each concurrent arm. Posterior inference will follow from a Bayesian discrete time survival model that facilitates adaptive incorporation of the historical control data through an innovative regression framework with a multivariate spike-and-slab prior distribution on the historical bias parameters. For each interim test, the amount of information borrowed from the historical control data will be determined adaptively in a manner that reflects the degree of agreement between historical and concurrent control arm data. Guidance is provided for selecting Bayesian posterior probability group-sequential monitoring boundaries. Simulation results elucidating how the proposed method borrows strength from the historical control data are reported. 
In the absence of historical control arm bias, the proposed design controls the type I error rate and provides substantially larger power than reasonable comparators, whereas in the presence of bias of varying magnitude, type I error rate inflation is curbed.","PeriodicalId":55398,"journal":{"name":"Bayesian Analysis","volume":null,"pages":null},"PeriodicalIF":4.4,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9585618/pdf/nihms-1804885.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40568331","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bayesian Analysis. Pub Date: 2021-09-01. Epub Date: 2020-07-15. DOI: 10.1214/20-ba1223
Yuxiang Gao, Lauren Kennedy, Daniel Simpson, Andrew Gelman
{"title":"Improving multilevel regression and poststratification with structured priors.","authors":"Yuxiang Gao, Lauren Kennedy, Daniel Simpson, Andrew Gelman","doi":"10.1214/20-ba1223","DOIUrl":"10.1214/20-ba1223","url":null,"abstract":"A central theme in the field of survey statistics is estimating population-level quantities through data coming from potentially non-representative samples of the population. Multilevel regression and poststratification (MRP), a model-based approach, is gaining traction against the traditional weighted approach for survey estimates. MRP estimates are susceptible to bias if there is an underlying structure that the methodology does not capture. This work aims to provide a new framework for specifying structured prior distributions that lead to bias reduction in MRP estimates. We use simulation studies to explore the benefit of these prior distributions and demonstrate their efficacy on non-representative US survey data. We show that structured prior distributions offer absolute bias reduction and variance reduction for posterior MRP estimates in a large variety of data regimes.","PeriodicalId":55398,"journal":{"name":"Bayesian Analysis","volume":null,"pages":null},"PeriodicalIF":4.9,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9203002/pdf/nihms-1811398.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40000730","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Contaminated Gibbs-Type Priors","authors":"F. Camerlenghi, R. Corradin, A. Ongaro","doi":"10.1214/22-ba1358","DOIUrl":"https://doi.org/10.1214/22-ba1358","url":null,"abstract":"Gibbs-type priors are widely used as key components in several Bayesian nonparametric models. By virtue of their flexibility and mathematical tractability, they turn out to be predominant priors in species sampling problems, clustering and mixture modelling. We introduce a new family of processes which extend the Gibbs-type one, by including a contaminant component in the model to account for the presence of anomalies (outliers) or an excess of observations with frequency one. We first investigate the induced random partition and the associated predictive distribution, and we characterize the asymptotic behaviour of the number of clusters. All the results we obtain are in closed form and easily interpretable; as a noteworthy example, we focus on the contaminated version of the Pitman-Yor process. Finally, we pinpoint the advantage of our construction in different applied problems: we show how the contaminant component helps to perform outlier detection for an astronomical clustering problem and to improve predictive inference in a species-related dataset exhibiting a high number of species with frequency one.","PeriodicalId":55398,"journal":{"name":"Bayesian Analysis","volume":null,"pages":null},"PeriodicalIF":4.4,"publicationDate":"2021-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47383584","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}