Mining representative approximate frequent coexpression subnetworks

Proceedings of the 11th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics. Pub Date: 2020-09-21. DOI: 10.1145/3388440.3415584

Sangmin Seo, Saeed Salem


Citations: 0

Abstract

Advances in high-throughput microarray and RNA-sequencing technologies have led to a rapid accumulation of gene expression data for various biological conditions across multiple species. Mining frequent gene modules from a set of multiple gene coexpression networks has applications in functional gene annotation and biomarker discovery. Biclustering algorithms have been proposed to allow for missing coexpression links. Existing approaches report a large number of edge sets, which are computationally intensive to analyze, and the reported subnetworks overlap heavily. In this work, we propose an algorithm to mine frequent dense modules from multiple coexpression networks using an online data summarization method. Our algorithm mines a succinct set of representative subgraphs that have little overlap, which reduces the downstream analysis of the reported modules. Experiments on human gene expression data show that the reported modules are biologically significant, as evidenced by the high enrichment of GO molecular functions and KEGG pathways in the reported modules.
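The abstract does not give algorithmic details, so the following is only a rough illustration of two ideas it names: counting how often a coexpression edge recurs across networks, and keeping a small set of low-overlap representative modules as candidates arrive. It is not the authors' algorithm. The function names (`frequent_edges`, `summarize_online`), the Jaccard overlap measure, the thresholds, and the toy data are all assumptions made for this sketch.

```python
from collections import Counter


def frequent_edges(networks, min_support):
    """Return the edges that occur in at least min_support networks."""
    counts = Counter()
    for edges in networks:
        # Normalize undirected edges so (a, b) and (b, a) count as one edge,
        # and count each edge at most once per network.
        counts.update({tuple(sorted(e)) for e in edges})
    return {e for e, c in counts.items() if c >= min_support}


def jaccard(a, b):
    """Jaccard similarity of two gene sets (assumed overlap measure)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def summarize_online(modules, max_overlap):
    """Keep a module only if it overlaps little with every kept representative."""
    representatives = []
    for module in modules:  # candidate modules are processed one at a time
        if all(jaccard(module, r) <= max_overlap for r in representatives):
            representatives.append(frozenset(module))
    return representatives


# Toy data (hypothetical): three small coexpression networks as edge lists.
networks = [
    [("A", "B"), ("B", "C"), ("A", "C")],
    [("B", "A"), ("B", "C"), ("C", "D")],
    [("A", "B"), ("A", "C"), ("C", "D")],
]
freq = frequent_edges(networks, min_support=2)

# Hypothetical candidate modules (gene sets) arriving in this order.
candidates = [{"A", "B", "C"}, {"A", "B", "C", "D"}, {"D", "E", "F"}]
reps = summarize_online(candidates, max_overlap=0.5)
```

In this toy run, the second candidate is dropped because it shares three of four genes with the first representative, while the disjoint third candidate is kept. A real miner would also need a density criterion and a tolerance for missing links, which this sketch omits.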
