An effective tri-clustering algorithm combining expression data with gene regulation information.

Ao Li, David Tuck

Gene Regulation and Systems Biology, vol. 3, pp. 49-64. Published 2009-04-15. DOI: 10.4137/grsb.s1150 (https://doi.org/10.4137/grsb.s1150)
Citations: 35

Abstract

Motivation: Bi-clustering algorithms aim to identify sets of genes that share similar expression patterns across a subset of conditions. However, because they use only gene expression data, direct interpretation or prediction of gene regulatory mechanisms may be difficult. Information about gene regulators may also be available, most commonly about which transcription factors may bind to a gene's promoter region and thus control its expression level. A method that integrates gene expression and gene regulation information is therefore desirable for clustering and analysis.

Methods: By combining gene regulatory information with gene expression data, we define regulated expression values (REV) as indicators of how a gene is regulated by a specific factor. We extend existing bi-clustering methods to a three-dimensional data space by developing a heuristic TRI-Clustering algorithm. An additional approach, the Automatic Boundary Searching (ABS) algorithm, automatically determines the boundary threshold.
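
The abstract does not give the REV formula, so the following is only an illustrative sketch of how a three-dimensional gene × condition × factor data space might be assembled. It assumes, purely for illustration, that each REV entry is the product of a gene's expression value under a condition and a binding weight between that gene and a transcription factor (for example, derived from ChIP-chip data). The function name `build_rev_tensor` and all data values are hypothetical and are not taken from the paper.

```python
from typing import List

Matrix = List[List[float]]
Tensor = List[List[List[float]]]


def build_rev_tensor(expression: Matrix, binding: Matrix) -> Tensor:
    """Combine two gene-indexed matrices into a 3D tensor.

    expression: genes x conditions (expression values)
    binding:    genes x factors (TF-gene binding weights)
    returns:    genes x conditions x factors

    ASSUMPTION: the entry is expression * binding weight. The paper's
    actual REV definition may differ; this is only a stand-in.
    """
    if len(expression) != len(binding):
        raise ValueError("expression and binding must cover the same genes")
    return [
        [
            [expr_value * bind_weight for bind_weight in gene_binding]
            for expr_value in gene_expr
        ]
        for gene_expr, gene_binding in zip(expression, binding)
    ]


# Toy data (invented): 2 genes, 3 conditions, 2 transcription factors.
expression = [[1.0, -2.0, 0.5], [0.0, 3.0, -1.0]]
binding = [[1.0, 0.0], [0.5, 1.0]]

rev = build_rev_tensor(expression, binding)
```

Under this assumed definition, a tri-cluster would be a subset of genes, conditions, and factors whose entries in `rev` behave coherently; a bi-clustering method sees only the `expression` matrix and cannot attribute a pattern to a particular factor.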

Results: Results based on incorporating ChIP-chip data representing transcription factor-gene interactions show that the algorithms are efficient and robust for detecting tri-clusters. Detailed analysis of the tri-cluster extracted from yeast sporulation REV data shows that genes in this cluster exhibit significant differences during the middle and late stages. The implicated regulatory network was then reconstructed for further study of the defined regulatory mechanisms. Topological and statistical analysis of this network provides evidence of significant changes in TF activity during the different stages of yeast sporulation and suggests that this approach might be a general way to study regulatory networks undergoing transformations.

