MFMS: maximal frequent module set mining from multiple human gene expression data sets

Evolutionary computation, machine learning and data mining in bioinformatics. EvoBIO (Conference), pp. 51-57. Pub Date: 2013-08-11. DOI: 10.1145/2500863.2500869

Saeed Salem, C. Ozcaglar

Citations: 11

Abstract

Advances in genomic technologies have allowed vast amounts of gene expression data to be collected. Protein functional annotation and biological module discovery based on a single gene expression data set suffer from spurious coexpression. Recent work has focused on integrating multiple independent gene expression data sets. In this paper, we propose a two-step approach for mining maximally frequent collections of highly connected modules from coexpression graphs. We first mine maximal frequent edge-sets and then extract highly connected subgraphs from the edge-induced subgraphs. Experimental results on the collection of modules mined from 52 human gene expression data sets show that coexpression links that occur together in a significant number of experiments have a modular topological structure. Moreover, GO enrichment analysis shows that the proposed approach discovers biologically significant frequent collections of modules.
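
The two-step idea in the abstract can be illustrated with a small toy sketch. This is not the authors' algorithm: the function names, the brute-force enumeration, and the density-based test for "highly connected" are all assumptions made for illustration. Each coexpression graph is treated as a set of edges; step 1 finds the maximal edge sets shared by at least `min_support` graphs, and step 2 keeps the dense connected components of each edge-induced subgraph.

```python
from itertools import combinations


def edge(u, v):
    """Canonical undirected edge (hypothetical helper)."""
    return tuple(sorted((u, v)))


def maximal_frequent_edge_sets(graphs, min_support):
    """Step 1 (brute force): maximal edge sets found in >= min_support graphs.

    Every maximal frequent edge set is the intersection of some group of
    exactly min_support graphs, so only those groups are enumerated.
    """
    candidates = set()
    for group in combinations(graphs, min_support):
        shared = frozenset.intersection(*group)
        if shared:
            candidates.add(shared)
    # Keep only edge sets that are not proper subsets of another candidate.
    return [c for c in candidates if not any(c < other for other in candidates)]


def dense_components(edges, min_nodes=3, min_density=0.6):
    """Step 2 (assumed stand-in): connected components above a density cutoff."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    modules, seen = [], set()
    for start in adjacency:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:  # depth-first search over the edge-induced subgraph
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        n = len(component)
        m = sum(len(adjacency[x]) for x in component) // 2
        if n >= min_nodes and m / (n * (n - 1) / 2) >= min_density:
            modules.append(frozenset(component))
    return modules


def mine_module_sets(graphs, min_support):
    """Pair each maximal frequent edge set with the modules it induces."""
    graphs = [frozenset(g) for g in graphs]
    return [(es, dense_components(es))
            for es in maximal_frequent_edge_sets(graphs, min_support)]


# Toy input: three made-up coexpression graphs over made-up gene labels.
triangle_abc = {edge("A", "B"), edge("B", "C"), edge("A", "C")}
triangle_def = {edge("D", "E"), edge("E", "F"), edge("D", "F")}
graphs = [
    triangle_abc | triangle_def,
    triangle_abc | triangle_def | {edge("X", "Y")},
    triangle_abc | {edge("X", "Y")},
]

for edge_set, modules in mine_module_sets(graphs, min_support=2):
    print(sorted(edge_set), [sorted(m) for m in modules])
```

Enumerating every group of `min_support` graphs is exponential in the worst case, so this only suits toy inputs; a real run over 52 data sets would need a dedicated maximal frequent itemset miner and a proper definition of high connectivity.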
