Francesco Denti, Federico Camerlenghi, Michele Guindani, Antonietta Mira
{"title":"A Common Atoms Model for the Bayesian Nonparametric Analysis of Nested Data.","authors":"Francesco Denti, Federico Camerlenghi, Michele Guindani, Antonietta Mira","doi":"10.1080/01621459.2021.1933499","DOIUrl":null,"url":null,"abstract":"<p><p>The use of large datasets for targeted therapeutic interventions requires new ways to characterize the heterogeneity observed across subgroups of a specific population. In particular, models for partially exchangeable data are needed for inference on nested datasets, where the observations are assumed to be organized in different units and some sharing of information is required to learn distinctive features of the units. In this manuscript, we propose a nested common atoms model (CAM) that is particularly suited for the analysis of nested datasets where the distributions of the units are expected to differ only over a small fraction of the observations sampled from each unit. The proposed CAM allows a two-layered clustering at the distributional and observational level and is amenable to scalable posterior inference through the use of a computationally efficient nested slice sampler algorithm. We further discuss how to extend the proposed modeling framework to handle discrete measurements, and we conduct posterior inference on a real microbiome dataset from a diet swap study to investigate how the alterations in intestinal microbiota composition are associated with different eating habits. We further investigate the performance of our model in capturing true distributional structures in the population by means of a simulation study.</p>","PeriodicalId":17227,"journal":{"name":"Journal of the American Statistical Association","volume":"118 541","pages":"405-416"},"PeriodicalIF":3.0000,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/01621459.2021.1933499","citationCount":"16","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of the American Statistical Association","FirstCategoryId":"100","ListUrlMain":"https://doi.org/10.1080/01621459.2021.1933499","RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2021/7/14 0:00:00","PubModel":"Epub","JCR":"Q1","JCRName":"STATISTICS & PROBABILITY","Score":null,"Total":0}
引用次数: 16
Abstract
The use of large datasets for targeted therapeutic interventions requires new ways to characterize the heterogeneity observed across subgroups of a specific population. In particular, models for partially exchangeable data are needed for inference on nested datasets, where the observations are assumed to be organized in different units and some sharing of information is required to learn distinctive features of the units. In this manuscript, we propose a nested common atoms model (CAM) that is particularly suited for the analysis of nested datasets where the distributions of the units are expected to differ only over a small fraction of the observations sampled from each unit. The proposed CAM allows a two-layered clustering at the distributional and observational level and is amenable to scalable posterior inference through the use of a computationally efficient nested slice sampler algorithm. We further discuss how to extend the proposed modeling framework to handle discrete measurements, and we conduct posterior inference on a real microbiome dataset from a diet swap study to investigate how the alterations in intestinal microbiota composition are associated with different eating habits. We further investigate the performance of our model in capturing true distributional structures in the population by means of a simulation study.
期刊介绍:
Established in 1888 and published quarterly in March, June, September, and December, the Journal of the American Statistical Association ( JASA ) has long been considered the premier journal of statistical science. Articles focus on statistical applications, theory, and methods in economic, social, physical, engineering, and health sciences. Important books contributing to statistical advancement are reviewed in JASA .
JASA is indexed in Current Index to Statistics and MathSci Online and reviewed in Mathematical Reviews. JASA is abstracted by Access Company and is indexed and abstracted in the SRM Database of Social Research Methodology.