DeepBiome: A Phylogenetic Tree Informed Deep Neural Network for Microbiome Data Analysis.

Impact factor: 0.4 · JCR quartile: Q4 · Category: Mathematical & Computational Biology
Statistics in Biosciences Pub Date : 2025-04-01 Epub Date: 2024-06-14 DOI:10.1007/s12561-024-09434-9
Jing Zhai, Youngwon Choi, Xingyi Yang, Yin Chen, Kenneth Knox, Homer L Twigg, Joong-Ho Won, Hua Zhou, Jin J Zhou
Statistics in Biosciences, volume 17, issue 1, pages 191-215. Publication type: Journal Article. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12395559/pdf/
Citation count: 0

Abstract

Evidence linking the microbiome to human health is rapidly growing. The microbiome profile has potential as a novel predictive biomarker for many diseases. However, tables of bacterial counts are typically sparse, and bacteria are classified within a hierarchy of taxonomic levels, ranging from species to phylum. Existing tools focus on identifying microbiome associations at either the community level or a specific, pre-defined taxonomic level. Incorporating the evolutionary relationships between bacteria can enhance data interpretation: it allows microbiome contributions to be aggregated, leading to more accurate and interpretable results. We present DeepBiome, a phylogeny-informed neural network architecture, to predict phenotypes from microbiome counts and uncover the microbiome-phenotype association network. It takes microbiome abundance as input and uses phylogenetic taxonomy to guide the neural network's architecture. By leveraging phylogenetic information, DeepBiome reduces the need for extensive tuning of the deep learning architecture, minimizes overfitting, and, crucially, enables visualization of the path from microbiome counts to disease. It is applicable to both regression and classification problems. Simulation studies and real-life data analysis have shown that DeepBiome is both highly accurate and efficient. It offers deep insights into complex microbiome-phenotype associations, even with small to moderate training sample sizes. In practice, the specific taxonomic level at which microbiome clusters tag the association remains unknown. Therefore, the main advantage of the presented method over other analytical methods is that it offers an ecological and evolutionary understanding of host-microbe interactions, which is important for microbiome-based medicine. DeepBiome is implemented using the Python packages Keras and TensorFlow. It is an open-source tool available at https://github.com/Young-won/DeepBiome.
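
The abstract says that taxonomy guides the network's architecture but does not spell out the mechanism. As a rough illustration only, one common way to let a tree shape a layer is to mask the weight matrix so that each lower-level taxon feeds only its parent. The sketch below uses plain NumPy rather than Keras, and the toy taxonomy, the `taxonomy_mask` and `masked_layer` helpers, the hard 0/1 mask, and the ReLU activation are all assumptions made for illustration; they are not taken from the DeepBiome code base, which may use a different (for example, softer) way of encoding the tree.

```python
import numpy as np

# Hypothetical toy taxonomy: each genus maps to its parent family.
GENUS_TO_FAMILY = {
    "Streptococcus": "Streptococcaceae",
    "Lactococcus": "Streptococcaceae",
    "Prevotella": "Prevotellaceae",
    "Veillonella": "Veillonellaceae",
}


def taxonomy_mask(child_to_parent):
    """Return (children, parents, mask); mask[i, j] is 1 iff child i belongs to parent j."""
    children = list(child_to_parent)
    parents = sorted(set(child_to_parent.values()))
    mask = np.zeros((len(children), len(parents)))
    for i, child in enumerate(children):
        mask[i, parents.index(child_to_parent[child])] = 1.0
    return children, parents, mask


def masked_layer(x, weights, mask):
    """One tree-guided layer: weights off the taxonomy are zeroed before the product."""
    # ReLU is an assumed activation, chosen only to keep the sketch simple.
    return np.maximum(x @ (weights * mask), 0.0)


children, parents, mask = taxonomy_mask(GENUS_TO_FAMILY)
rng = np.random.default_rng(0)
weights = rng.normal(size=mask.shape)

# Two samples of relative abundances over the four genera (made-up numbers).
x = np.array([[0.5, 0.1, 0.3, 0.1],
              [0.0, 0.0, 0.9, 0.1]])
family_activations = masked_layer(x, weights, mask)
print(parents)
print(family_activations.shape)
```

Stacking one such layer per taxonomic level (genus to family, family to order, and so on) would give a network whose connectivity mirrors the tree, which is the kind of structure that makes a path from counts to phenotype easy to draw.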

Source journal: Statistics in Biosciences (Mathematical & Computational Biology)
CiteScore: 2.00
Self-citation rate: 0.00%
Articles published: 28
About the journal: Statistics in Biosciences (SIBS) is published three times a year in print and electronic form. It aims at the development and application of statistical methods and their interface with other quantitative methods, such as computational and mathematical methods, in biological and life science, health science, and biopharmaceutical and biotechnological science. SIBS publishes scientific papers and review articles in four sections, with the first two sections as the primary sections. Original Articles publish novel statistical and quantitative methods in biosciences. The Bioscience Case Studies and Practice Articles publish papers that advance statistical practice in biosciences, such as case studies, innovative applications of existing methods that further understanding of subject-matter science, and evaluations of existing methods and data sources. Review Articles publish papers that review an area of statistical and quantitative methodology, software, and data sources in biosciences. Commentaries provide perspectives on research topics or policy issues that are of current quantitative interest in biosciences, reactions to an article published in the journal, and scholarly essays. Substantive science is essential in motivating and demonstrating the methodological development and use for an article to be acceptable. Articles published in SIBS share the goal of promoting evidence-based real-world practice and policy making through effective and timely interaction and communication of statisticians and quantitative researchers with subject-matter scientists in biosciences.