Bipartite graphs for metagenomic data analysis and visualization

2015 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Pub Date: 2015-11-09, DOI: 10.1109/BIBM.2015.7359839

K. Sedlář, Helena Skutková, P. Videnska, I. Rychlík, I. Provazník


Citations: 1

Abstract

Metagenomics became very popular after the spread of next-generation sequencing techniques, which made extensive studies simple to carry out. With a target gene sequencing approach, identifying the organisms in a metagenome is fairly straightforward, since only a small reference database for the particular gene is needed. Moreover, counting the copies of individual genes also allows quantitative analysis. Unfortunately, current bioinformatics tools focus mainly on the analysis of a single metagenome. Cluster analysis, heatmaps of correlation coefficients, biclustering, and other statistical techniques can only show relations within a metagenome or between the metagenome's composition and other parameters. Tools for comparative analysis of two or more metagenomes, by contrast, are lacking. A bipartite graph has properties well suited to this kind of analysis. Here, we present a novel workflow for finding a suitable representation of metagenomic data for comparative analysis of metagenomes. The resulting graph can take into account both the actual composition of each metagenome and the environment it relates to. Thus, it can give a different view of the data to the naked eye, one that complements other techniques such as principal coordinate analysis.
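The abstract does not spell out the workflow itself, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes one node set for metagenomes (samples) and one for taxa, with an edge wherever a taxon's relative abundance in a sample reaches a threshold. The function names `build_bipartite` and `shared_taxa`, the `min_share` cutoff, and the toy read counts are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: read counts per taxon for two metagenomes.
counts = {
    "sample_A": {"Lactobacillus": 600, "Escherichia": 390, "Clostridium": 10},
    "sample_B": {"Lactobacillus": 200, "Bacteroides": 795, "Clostridium": 5},
}


def build_bipartite(counts, min_share=0.02):
    """Return weighted sample-taxon edges as (sample, taxon, share) tuples.

    An edge is kept only if the taxon's relative abundance in that sample
    is at least min_share (an assumed cutoff, not taken from the paper).
    """
    edges = []
    for sample, taxa in counts.items():
        total = sum(taxa.values())
        if total == 0:
            continue  # skip empty samples to avoid division by zero
        for taxon, reads in taxa.items():
            share = reads / total
            if share >= min_share:
                edges.append((sample, taxon, share))
    return edges


def shared_taxa(edges):
    """Map each taxon linked to more than one sample to its set of samples."""
    samples_by_taxon = defaultdict(set)
    for sample, taxon, _share in edges:
        samples_by_taxon[taxon].add(sample)
    return {t: s for t, s in samples_by_taxon.items() if len(s) > 1}


edges = build_bipartite(counts)
for sample, taxon, share in edges:
    print(f"{sample} -- {taxon} (relative abundance {share:.3f})")
print("Taxa shared between metagenomes:", sorted(shared_taxa(edges)))
```

In this representation, taxa connected to several sample nodes are the ones the metagenomes have in common, while taxa with a single edge are specific to one metagenome. That distinction is what makes a bipartite layout a natural fit for comparing two or more metagenomes.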
