A biclustering method for gene expression module discovery using a closed itemset enumeration algorithm

Yoshifumi Okada, W. Fujibuchi, P. Horton
IPSJ Digital Courier. Published 2007-04-15. DOI: 10.2197/IPSJDC.3.183 (https://doi.org/10.2197/IPSJDC.3.183)
Citations: 28

Abstract

A gene expression module (module for short) is a set of genes with shared expression behavior under certain experimental conditions. Discovering modules enables us to uncover the function of uncharacterized genes or genetic networks. In recent years, several biclustering methods have been suggested to discover modules from gene expression data matrices, where a bicluster is defined as a subset of genes that exhibit a highly correlated expression pattern over a subset of conditions. Biclustering, however, involves combinatorial optimization in selecting the rows and columns that compose modules. Hence most existing algorithms are based on heuristic or stochastic approaches and may produce sub-optimal solutions. In this paper, we propose a novel biclustering method, BiModule, based on a closed itemset enumeration algorithm. By exhaustive enumeration of such biclusters, it is possible to select only the biclusters that satisfy certain criteria, such as a user-specified bicluster size or an enrichment of functional annotation terms. We performed comparative experiments against existing salient biclustering methods to test the validity of the biclusters extracted by BiModule, using synthetic data and real expression data. We show that BiModule performs well compared to the other methods in extracting artificially embedded modules as well as modules strongly related to GO annotations, protein-protein interactions and metabolic pathways.
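To make the idea of exhaustive closed itemset enumeration concrete, the following is a minimal sketch, not the BiModule implementation. It assumes the expression matrix has already been discretized so that each gene maps to the set of conditions in which it counts as "expressed"; the abstract does not describe that step. The function names, the toy data and the naive intersection-based enumeration are all illustrative assumptions, and the size thresholds stand in for the user-specified bicluster size mentioned above.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Bicluster = Tuple[FrozenSet[str], FrozenSet[str]]  # (genes, conditions)


def closed_itemsets(rows: Dict[str, FrozenSet[str]]) -> Set[FrozenSet[str]]:
    """Return every non-empty closed itemset of a binary gene-by-condition table.

    A condition set is closed exactly when it is the intersection of one or
    more rows, so we grow the collection one row at a time (naive approach,
    hypothetical helper; fine for toy inputs only).
    """
    closed: Set[FrozenSet[str]] = set()
    for items in rows.values():
        # Intersect the new row with everything found so far, then add the row.
        closed |= {c & items for c in closed} | {items}
    closed.discard(frozenset())
    return closed


def biclusters(rows: Dict[str, FrozenSet[str]],
               min_genes: int = 2,
               min_conditions: int = 2) -> List[Bicluster]:
    """Pair each closed condition set with all genes that contain it,
    keeping only pairs that meet the (assumed) size thresholds."""
    result: List[Bicluster] = []
    for items in closed_itemsets(rows):
        genes = frozenset(g for g, r in rows.items() if items <= r)
        if len(genes) >= min_genes and len(items) >= min_conditions:
            result.append((genes, items))
    # Sort for a deterministic order.
    return sorted(result, key=lambda b: (sorted(b[0]), sorted(b[1])))


# Toy discretized data: gene -> conditions in which it is "expressed".
toy = {
    "g1": frozenset({"c1", "c2", "c3"}),
    "g2": frozenset({"c1", "c2"}),
    "g3": frozenset({"c2", "c3"}),
    "g4": frozenset({"c4"}),
}

for genes, conditions in biclusters(toy):
    print(sorted(genes), sorted(conditions))
```

On this toy input the sketch reports two biclusters: genes g1 and g2 over conditions c1 and c2, and genes g1 and g3 over conditions c2 and c3. Because every closed itemset is enumerated first, filters such as minimum size (or, in principle, annotation enrichment) can be applied afterwards without missing any candidate.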