Tung Dang, Artem Lysenko, Keith A Boroevich, Tatsuhiko Tsunoda
Briefings in Bioinformatics, 26(4). Published 2025-07-02. DOI: 10.1093/bib/bbaf300 (https://doi.org/10.1093/bib/bbaf300). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12231592/pdf/
VBayesMM: variational Bayesian neural network to prioritize important relationships of high-dimensional microbiome multiomics data.
The analysis of high-dimensional microbiome multiomics datasets is crucial for understanding the complex interactions between microbial communities and host physiological states across health and disease conditions. Despite their importance, current methods, such as the microbe-metabolite vectors approach, often face challenges in predicting metabolite abundances from microbial data and identifying keystone species. This arises from the vast dimensionality of metagenomics data, which complicates the inference of significant relationships, particularly the estimation of co-occurrence probabilities between microbes and metabolites. Here we propose the variational Bayesian microbiome multiomics (VBayesMM) approach, which aims to improve the prediction of metabolite abundances from microbial metagenomics data by incorporating a spike-and-slab prior within a Bayesian neural network. This allows VBayesMM to rapidly and precisely identify crucial microbial species, leading to more accurate estimations of co-occurrence probabilities between microbes and metabolites, while also robustly managing the uncertainty inherent in high-dimensional data. Moreover, we have implemented variational inference to address computational bottlenecks, enabling scalable analysis across extensive multiomics datasets. Our large-scale comparative evaluations demonstrate that VBayesMM not only outperforms existing methods in predicting metabolite abundances but also provides a scalable solution for analyzing massive datasets. VBayesMM enhances the interpretability of the Bayesian neural network by identifying a core set of influential microbial species, thus facilitating a deeper understanding of their probabilistic relationships with the host.
About the journal:
Briefings in Bioinformatics is an international journal serving as a platform for researchers and educators in the life sciences. It also appeals to mathematicians, statisticians, and computer scientists applying their expertise to biological challenges. The journal focuses on reviews tailored for users of databases and analytical tools in contemporary genetics, molecular and systems biology. It stands out by offering practical assistance and guidance to non-specialists in computerized methodologies. Covering a wide range from introductory concepts to specific protocols and analyses, the papers address bacterial, plant, fungal, animal, and human data.
The journal's detailed subject areas include genetic studies of phenotypes and genotypes, mapping, DNA sequencing, expression profiling, gene expression studies, microarrays, alignment methods, protein profiles and HMMs, lipids, metabolic and signaling pathways, structure determination and function prediction, phylogenetic studies, and education and training.