A Bayesian semiparametric mixture model for clustering zero-inflated microbiome data.

IF 1.7 4区 数学 Q3 BIOLOGY
Biometrics Pub Date : 2025-07-03 DOI:10.1093/biomtc/ujaf125
Suppapat Korsurat, Matthew D Koslovsky
{"title":"A Bayesian semiparametric mixture model for clustering zero-inflated microbiome data.","authors":"Suppapat Korsurat, Matthew D Koslovsky","doi":"10.1093/biomtc/ujaf125","DOIUrl":null,"url":null,"abstract":"<p><p>Microbiome research has immense potential for unlocking insights into human health and disease. A common goal in human microbiome research is identifying subgroups of individuals with similar microbial composition that may be linked to specific health states or environmental exposures. However, existing clustering methods are often not equipped to accommodate the complex structure of microbiome data and typically make limiting assumptions regarding the number of clusters in the data which can bias inference. Designed for zero-inflated multivariate compositional count data collected in microbiome research, we propose a novel Bayesian semiparametric mixture modeling framework that simultaneously learns the number of clusters in the data while performing cluster allocation. In simulation, we demonstrate the clustering performance of our method compared to distance- and model-based alternatives and the importance of accommodating zero-inflation when present in the data. We then apply the model to identify clusters in microbiome data collected in a study designed to investigate the relation between gut microbial composition and enteric diarrheal disease.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 3","pages":""},"PeriodicalIF":1.7000,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Biometrics","FirstCategoryId":"100","ListUrlMain":"https://doi.org/10.1093/biomtc/ujaf125","RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"BIOLOGY","Score":null,"Total":0}
引用次数: 0

Abstract

Microbiome research has immense potential for unlocking insights into human health and disease. A common goal in human microbiome research is identifying subgroups of individuals with similar microbial composition that may be linked to specific health states or environmental exposures. However, existing clustering methods are often not equipped to accommodate the complex structure of microbiome data and typically make limiting assumptions regarding the number of clusters in the data which can bias inference. Designed for zero-inflated multivariate compositional count data collected in microbiome research, we propose a novel Bayesian semiparametric mixture modeling framework that simultaneously learns the number of clusters in the data while performing cluster allocation. In simulation, we demonstrate the clustering performance of our method compared to distance- and model-based alternatives and the importance of accommodating zero-inflation when present in the data. We then apply the model to identify clusters in microbiome data collected in a study designed to investigate the relation between gut microbial composition and enteric diarrheal disease.

零膨胀微生物组数据聚类的贝叶斯半参数混合模型。
微生物组研究在揭示人类健康和疾病方面具有巨大的潜力。人类微生物组研究的一个共同目标是确定可能与特定健康状况或环境暴露有关的具有相似微生物组成的个体亚群。然而,现有的聚类方法往往不能适应微生物组数据的复杂结构,并且通常对数据中的聚类数量做出有限的假设,这可能会影响推断。针对微生物组研究中收集的零膨胀多元成分计数数据,我们提出了一种新的贝叶斯半参数混合建模框架,该框架在进行聚类分配的同时学习数据中的聚类数量。在模拟中,我们展示了与基于距离和模型的替代方法相比,我们的方法的聚类性能,以及在数据中存在时适应零膨胀的重要性。然后,我们应用该模型在一项研究中收集的微生物组数据中识别集群,该研究旨在调查肠道微生物组成与肠性腹泻病之间的关系。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
Biometrics
Biometrics 生物-生物学
CiteScore
2.70
自引率
5.30%
发文量
178
审稿时长
4-8 weeks
期刊介绍: The International Biometric Society is an international society promoting the development and application of statistical and mathematical theory and methods in the biosciences, including agriculture, biomedical science and public health, ecology, environmental sciences, forestry, and allied disciplines. The Society welcomes as members statisticians, mathematicians, biological scientists, and others devoted to interdisciplinary efforts in advancing the collection and interpretation of information in the biosciences. The Society sponsors the biennial International Biometric Conference, held in sites throughout the world; through its National Groups and Regions, it also Society sponsors regional and local meetings.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信