Identification of Granger Causality between Gene Sets

American journal of bioinformatics and computational biology. Pub Date: 2010-08-01. DOI: 10.1142/S0219720010004860

André Fujita, J. Sato, Kaname Kojima, L. Gomes, Masao Nagasaki, M. Sogayar, S. Miyano

Citations: 16

Abstract

Wiener and Granger introduced an intuitive concept of causality (Granger causality) between two variables, based on the idea that an effect never occurs before its cause. Geweke later generalized this concept to multivariate Granger causality, i.e. n variables Granger-causing another variable. Although Granger causality is not "effective causality" in the Aristotelian sense, the concept is useful for inferring directionality and information flow in observational data. Granger causality is usually identified using VAR (Vector Autoregressive) models because of their simplicity. In the last few years, several VAR-based models have been proposed for modeling gene regulatory networks. Here, we generalize the multivariate Granger causality concept to identify Granger causality between sets of gene expressions, i.e. whether a set of n genes Granger-causes another set of m genes, with the aim of identifying the flow of information between gene networks (or pathways). The concept of Granger causality for sets of variables is presented, and a method for identifying it with a bootstrap test is proposed. The method is applied to both simulated and actual biological gene expression data in order to model regulatory networks. This concept may be useful for understanding the complete information flow from one network or pathway to another, mainly in regulatory networks. Linking this concept to graph theory, the notions of sink and source can be generalized to node sets, and hub and centrality for sets of genes can be defined based on total information flow. Another application is in annotation: when the functionality of a set of genes is unknown but the set is Granger-caused by another, well-studied set of genes, this information may help to infer or construct hypotheses about the unknown set.
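The abstract does not spell out the test itself, so the following is only a minimal sketch of how Granger causality from one set of series to another might be checked with a VAR model and a bootstrap. It rests on several assumptions that are not taken from the paper: a VAR of order 1, a Geweke-style log-determinant ratio as the test statistic, and a residual bootstrap that simulates the target set under the null hypothesis. The function names (`set_granger_stat`, `bootstrap_pvalue`, `simulate_pair`) are hypothetical, and the authors' actual procedure may differ.

```python
import numpy as np

def fit_ols(X, Y):
    """Least-squares fit Y ~ X @ B; returns coefficients and residuals."""
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return B, Y - X @ B

def set_granger_stat(x, y):
    """Statistic for 'set x Granger-causes set y' under an assumed VAR(1).

    x is a (T, n) array and y a (T, m) array. The residual cross-product
    matrix of y_t regressed on y_{t-1} alone is compared with the one
    obtained when x_{t-1} is added; larger values mean x's past helps more.
    """
    y_now, y_lag, x_lag = y[1:], y[:-1], x[:-1]
    ones = np.ones((len(y_now), 1))
    _, res_r = fit_ols(np.hstack([ones, y_lag]), y_now)         # restricted
    _, res_f = fit_ols(np.hstack([ones, y_lag, x_lag]), y_now)  # full
    _, logdet_r = np.linalg.slogdet(res_r.T @ res_r)
    _, logdet_f = np.linalg.slogdet(res_f.T @ res_f)
    return logdet_r - logdet_f

def bootstrap_pvalue(x, y, n_boot=200, seed=0):
    """Residual bootstrap under the null that x does not Granger-cause y."""
    rng = np.random.default_rng(seed)
    observed = set_granger_stat(x, y)
    ones = np.ones((len(y) - 1, 1))
    # Null model: y depends only on its own past.
    B, res = fit_ols(np.hstack([ones, y[:-1]]), y[1:])
    res = res - res.mean(axis=0)
    count = 0
    for _ in range(n_boot):
        draws = res[rng.integers(0, len(res), size=len(res))]
        y_star = np.empty_like(y)
        y_star[0] = y[0]
        for t in range(1, len(y)):
            # Regenerate y from the null model with resampled residuals.
            y_star[t] = B[0] + y_star[t - 1] @ B[1:] + draws[t - 1]
        count += set_granger_stat(x, y_star) >= observed
    return observed, (count + 1) / (n_boot + 1)

def simulate_pair(T=200, seed=1):
    """Toy data (an assumption): a 2-series set x drives a 2-series set y."""
    rng = np.random.default_rng(seed)
    x = np.zeros((T, 2))
    y = np.zeros((T, 2))
    for t in range(1, T):
        x[t] = 0.5 * x[t - 1] + rng.normal(size=2)
        y[t] = 0.3 * y[t - 1] + 0.8 * x[t - 1] + rng.normal(size=2)
    return x, y

x, y = simulate_pair()
print("x -> y:", bootstrap_pvalue(x, y))
print("y -> x:", bootstrap_pvalue(y, x))
```

In this toy setting the x -> y direction should yield a small p-value because y was generated from the past of x, while the reverse direction has no built-in effect. On real expression data the lag order, the choice of statistic, and the bootstrap scheme would all need to be chosen with care.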
