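Parallel Computing Algorithms for Reverse-Engineering and Analysis of Genome-Wide Gene Regulatory Networks from Gene Expression Profiles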

Infinity, pp. 88-94. Pub Date: 2010-09-30. DOI: 10.1109/PDMC-HIBI.2010.20
V. Belcastro, D. Bernardo, F. Gregoretti, G. Oliva
Citations: 1

Abstract

A gene regulatory network links a pair of genes with an edge if they interact physically or functionally. "Reverse engineering" a gene regulatory network means inferring the edges between genes from the available experimental data. Transcriptional responses (i.e., gene expression profiles obtained through microarray experiments) are often used to reverse-engineer a network of genes. Reverse engineering consists of analyzing transcriptional responses to a set of treatments and adding an edge between two genes if their expression shows coordinated behavior on a subset of the treatments, according to some underlying model of gene regulation. Mammalian cells contain tens of thousands of genes, and hundreds of transcriptional responses must be analyzed to obtain acceptable statistical evidence of interactions between genes. Several ready-to-use software packages can infer gene networks, but few can infer large networks from thousands of transcriptional responses, because the size of the problem leads to high computational costs and memory requirements. We propose to exploit parallel computing techniques to overcome this problem. In this work, we designed and developed a parallel computing algorithm to reverse-engineer large-scale gene regulatory networks from tens of thousands of gene expression profiles. The algorithm is based on computing the pairwise mutual information of every gene pair. We successfully tested it by inferring the Mus musculus (mouse) gene regulatory network in liver from 312 expression profiles collected from a public Internet repository. Each profile measures the expression of 45,101 genes (more specifically, transcripts). We analyzed all possible gene pairs, about 1 billion in total, and identified about 60 million edges. We used a hierarchical clustering algorithm to discover communities within the gene network and found a modular structure that highlights genes involved in the same biological functions.
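
The pairwise mutual-information step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the estimator, the edge threshold, or the parallel framework, so the equal-width binning, the `threshold` and `n_chunks` parameters, and every function name below are assumptions made for illustration. A thread pool stands in for the parallel workers only to keep the sketch portable. Because each chunk of gene pairs is independent, any executor with a `map` method (for example a process pool) could be passed in instead.

```python
import math
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def discretize(profile, n_bins):
    """Map each expression value to an equal-width bin index in [0, n_bins)."""
    lo, hi = min(profile), max(profile)
    if hi == lo:
        return [0] * len(profile)
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in profile]


def mutual_information(x_bins, y_bins):
    """Plug-in mutual information estimate (in nats) of two binned profiles."""
    n = len(x_bins)
    px = Counter(x_bins)
    py = Counter(y_bins)
    pxy = Counter(zip(x_bins, y_bins))
    # sum over observed cells of p(a,b) * log(p(a,b) / (p(a) * p(b)))
    return sum(
        (c / n) * math.log(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def mi_for_chunk(task):
    """Compute the mutual information of every gene pair in one chunk."""
    binned, pairs = task
    return [(i, j, mutual_information(binned[i], binned[j])) for i, j in pairs]


def infer_edges(expression, threshold, n_bins=4, n_chunks=4, executor=None):
    """Return (gene_i, gene_j, mi) for every pair whose estimate >= threshold.

    `expression` holds one list of measurements per gene. All parameters
    are hypothetical choices for this sketch, not values from the paper.
    """
    binned = [discretize(profile, n_bins) for profile in expression]
    pairs = list(combinations(range(len(binned)), 2))
    # Round-robin split of the pair list into independent chunks.
    tasks = [(binned, pairs[k::n_chunks]) for k in range(n_chunks)]
    if executor is None:
        with ThreadPoolExecutor() as pool:
            results = list(pool.map(mi_for_chunk, tasks))
    else:
        results = list(executor.map(mi_for_chunk, tasks))
    return sorted(
        (i, j, mi) for chunk in results for i, j, mi in chunk if mi >= threshold
    )


if __name__ == "__main__":
    toy = [
        [0, 1, 2, 3, 4, 5, 6, 7],       # gene 0
        [0, 2, 4, 6, 8, 10, 12, 14],    # gene 1: rescaled copy of gene 0
        [1, 0, 1, 0, 1, 0, 1, 0],       # gene 2: unrelated to gene 0 after binning
    ]
    print(infer_edges(toy, threshold=0.5))
```

At the scale reported in the abstract, 45,101 transcripts give `math.comb(45101, 2)` = 1,017,027,550 unordered pairs, which matches the "about 1 billion" figure and shows why the pair space has to be split across workers. A real run at that scale would not materialize the full pair list in memory as this sketch does.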