Phase-Wise Clustering of Time Series Gene Expression Data

2011 IEEE 10th International Conference on Trust, Security and Privacy in Computing and Communications. Pub Date: 2011-11-16. DOI: 10.1109/TrustCom.2011.231

Poonam Goyal, Navneet Goyal, R. Karwa, M. John

Citations: 1

Abstract

Extensive studies have shown that analyzing microarray time series data is important in bioinformatics research and biomedical applications. A common observation in gene expression analysis is that many genes have similar expression patterns and therefore appear to be co-regulated. Previously, time series gene expression data were analyzed mainly by checking global similarities between gene expression profiles, and local similarities were overlooked. Local similarities can provide useful insight into gene behavior. In this paper, we propose a clustering algorithm that analyzes time series gene expression data to identify gene clusters based on phase-wise local similarities in the cell cycle. Our approach exploits the fact that genes involved in one phase of the cell cycle would have a characteristic profile over the time points belonging to that phase and may not be involved in other phases. Moreover, a gene that is clustered with one set of genes in one phase might be involved with a different set of genes in other phases. In the proposed approach, we first cluster the genes at every time point of a phase and then group genes with similar expression profiles, i.e., we group together those genes that remain in the same cluster at every time point within the phase. Gene functions were obtained from Gene Ontology. Results are presented for different phases of the cell cycle. Candidate genes are identified for these phases and their groups are analyzed. We found that the group of candidate genes included few genes that are known to be involved. Furthermore, some genes are found to be involved in more than one phase, with different sets of genes. The results show that local similarities can provide useful insight into gene behavior. Results are compared with an existing algorithm, STEM. We used a Saccharomyces cerevisiae cell cycle microarray database that is part of the Stanford Microarray Database (SMD).
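The grouping step described in the abstract (cluster the genes at each time point of a phase, then keep together the genes that share a cluster at every one of those time points) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not say which clustering method is applied at each time point, so equal-width binning of expression values is used here purely as a stand-in. The function names, the `n_bins` parameter, the gene names, and the expression values are all hypothetical toy choices, not taken from the paper or from SMD.

```python
from collections import defaultdict


def cluster_at_time_point(values, n_bins=3):
    """Stand-in 1-D clustering (an assumption, not the paper's method):
    assign each value to one of n_bins equal-width bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All genes have the same expression value: one cluster.
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # min(...) keeps the largest value inside the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def phase_wise_groups(expression, phase_points, n_bins=3):
    """Group genes that fall in the same cluster at every time point
    of one phase. `expression` maps gene name -> list of values and
    `phase_points` lists the time-point indices belonging to the phase."""
    genes = sorted(expression)
    signatures = {gene: [] for gene in genes}
    for t in phase_points:
        labels = cluster_at_time_point(
            [expression[gene][t] for gene in genes], n_bins
        )
        for gene, label in zip(genes, labels):
            signatures[gene].append(label)

    # Genes with identical per-time-point labels form one group.
    groups = defaultdict(list)
    for gene in genes:
        groups[tuple(signatures[gene])].append(gene)
    # Singletons are dropped: a group needs at least two genes.
    return [members for members in groups.values() if len(members) > 1]


# Toy data (hypothetical): four genes, four time points, two phases.
expression = {
    "geneA": [0.1, 0.2, 2.0, 2.1],
    "geneB": [0.2, 0.1, -1.0, -1.2],
    "geneC": [1.9, 2.0, 2.1, 2.0],
    "geneD": [2.0, 1.8, -1.1, -1.0],
}
phases = {"phase1": [0, 1], "phase2": [2, 3]}

for name, points in phases.items():
    print(name, phase_wise_groups(expression, points))
```

In this toy data, geneA groups with geneB in the first phase and with geneC in the second, which mirrors the abstract's point that a gene may be associated with different sets of genes in different phases. A global comparison over all four time points would place none of these genes together.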
