Assessing the exceptionality of coloured motifs in networks.

EURASIP Journal on Bioinformatics & Systems Biology. Pub Date: 2009-01-01. Epub Date: 2009-01-26. DOI: 10.1155/2009/616234

Sophie Schbath, Vincent Lacroix, Marie-France Sagot



Abstract

Various methods have recently been employed to characterise the structure of biological networks. In particular, the concept of network motif and the related concept of coloured motif have proven useful for modelling the notion of a functional/evolutionary building block. However, algorithms that enumerate all the motifs of a network may produce a very large output, so methods are needed to decide which motifs should be selected for downstream analysis. A widely used approach is to assess whether a motif is exceptional, that is, over- or under-represented with respect to a null hypothesis. Over the last thirty years, much effort has gone into deriving P-values for the frequencies of topological motifs, that is, fixed subgraphs. These derivations rely either on (compound) Poisson and Gaussian approximations for the motif count distribution in Erdős-Rényi random graphs or on simulations in other models. We focus on a different definition of graph motifs, namely coloured motifs. A coloured motif is a connected subgraph with fixed vertex colours but unspecified topology. Our work is the first analytical attempt to assess the exceptionality of coloured motifs in networks without any simulation. We first establish analytical formulae for the mean and the variance of the count of a coloured motif in an Erdős-Rényi random graph model. Using simulations under this model, we further show that a Pólya-Aeppli distribution approximates the distribution of the motif count better than Gaussian or Poisson distributions do. The Pólya-Aeppli distribution, and more generally the compound Poisson distributions, are indeed well suited to modelling counts of clumping events. Altogether, these results make it possible to derive a P-value for a coloured motif without spending time on simulations.
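The final step described above, turning a mean and a variance into a P-value through a Pólya-Aeppli approximation, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the mean and variance of the motif count are already available (the paper's analytical formulae are not reproduced in the abstract, so they are plain inputs here), and it assumes the standard parametrisation of the Pólya-Aeppli distribution as a Poisson(`lam`) number of clumps whose sizes are geometric on {1, 2, ...} with parameter `a`, fitted by matching the two moments. All function names are hypothetical.

```python
import math


def polya_aeppli_params(mean, variance):
    """Moment-match (lam, a), assuming mean = lam / (1 - a) and
    variance = lam * (1 + a) / (1 - a) ** 2 (standard parametrisation)."""
    if mean <= 0 or variance < mean:
        raise ValueError("need variance >= mean > 0")
    a = (variance - mean) / (variance + mean)
    lam = 2.0 * mean * mean / (variance + mean)
    return lam, a


def polya_aeppli_pmf(n_max, lam, a):
    """Return [P(N=0), ..., P(N=n_max)] using the compound Poisson
    (Panjer) recursion p_n = (lam / n) * sum_j j * f_j * p_{n-j},
    where f_j = (1 - a) * a**(j - 1) is the geometric clump-size law."""
    p = [math.exp(-lam)]  # underflows to 0.0 for very large lam
    for n in range(1, n_max + 1):
        total = 0.0
        for j in range(1, n + 1):
            f_j = (1.0 - a) * a ** (j - 1)
            total += j * f_j * p[n - j]
        p.append(lam / n * total)
    return p


def upper_tail_p_value(observed, mean, variance):
    """Approximate P(N >= observed), i.e. an over-representation P-value."""
    if observed <= 0:
        return 1.0
    lam, a = polya_aeppli_params(mean, variance)
    pmf = polya_aeppli_pmf(observed - 1, lam, a)
    return max(0.0, 1.0 - sum(pmf))
```

When the variance equals the mean, `a` is 0 and the fit reduces to a plain Poisson distribution, which gives a quick sanity check. The quadratic-time recursion and the `1 - sum` tail are chosen for readability; very small P-values or very large counts would call for a more careful numerical treatment.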


