Biclustering of Gene Expression Data Using Genetic Algorithm

2005 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology. DOI: 10.1109/CIBCB.2005.1594893

Anupam Chakraborty, Hitashyam Maka

Citations: 68

Abstract

The biclustering problem for gene expression data is to find a subset of genes that exhibit similar expression patterns across a subset of conditions. Most current algorithms take a statistically predefined threshold as an input parameter for biclustering. This threshold defines the maximum allowable dissimilarity between the cells of a bicluster and is very hard to determine beforehand. Hence we propose two genetic algorithms that embed a greedy algorithm as a local search procedure and find the best biclusters independently of this threshold score. We also establish that the HScore of a bicluster under the additive model approximately follows a chi-square distribution. We found that these genetic algorithms outperformed other greedy algorithms on yeast and lymphoma datasets.
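To make the dissimilarity score concrete, here is a minimal sketch that assumes the HScore mentioned above is the mean squared residue commonly used in biclustering: the average of (a_ij - rowmean_i - colmean_j + overallmean)^2 over the cells of the bicluster. The abstract does not spell out the formula, so this definition, the function name `h_score`, and the toy matrix are illustrative assumptions rather than the authors' implementation.

```python
def h_score(matrix, rows, cols):
    """Mean squared residue of the submatrix picked out by rows and cols.

    Assumption: this is the quantity the abstract calls HScore. A bicluster
    that fits the additive model exactly (value = base + row effect +
    column effect) scores 0.
    """
    n_rows, n_cols = len(rows), len(cols)
    sub = [[matrix[i][j] for j in cols] for i in rows]

    # Means over each selected gene (row), each selected condition (column),
    # and the whole submatrix.
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall = sum(row_means) / n_rows

    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue * residue
    return total / (n_rows * n_cols)


# Hypothetical toy data: the first three columns are perfectly additive,
# the fourth column breaks the pattern.
expression = [
    [1.0, 2.0, 3.0, 9.0],
    [2.0, 3.0, 4.0, 0.0],
    [4.0, 5.0, 6.0, 7.0],
]

coherent = h_score(expression, [0, 1, 2], [0, 1, 2])
noisy = h_score(expression, [0, 1, 2], [0, 1, 2, 3])
print(coherent, noisy)
```

Under this definition a lower score means a more coherent bicluster, which is why threshold-based methods need an upper bound on it and why choosing that bound in advance is difficult.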


