Influence of the null-model on motif detection
W. Schlauch, K. Zweig
2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), 2015-08-25
DOI: 10.1145/2808797.2809400 (https://doi.org/10.1145/2808797.2809400)
Citations: 17
Abstract
This paper examines how suitable three null-models are for motif analysis; each takes a desired degree sequence as input. A graph-theoretic null-model is defined as a set of graphs together with a probability function. We discuss the configuration model, the simplest of the three; a variant of the configuration model in which multi-edges are deleted; and the set of all graphs with a given degree sequence (FDSM), which most scientists would recommend but which has a high time complexity for sampling. Furthermore, we develop equations for the expected number of motifs in the FDSM, based on the degree sequence and the assumption of simple independence. We present motif counts for several real-world graphs and compare them with the average motif counts sampled from the different null-models. Using a two-sample Kolmogorov-Smirnov test, we check whether the samples originate from the same distribution. The results show that the motif counts in the configuration model do not coincide with those of the FDSM. The equations approximate the motif count in graphs generated from a prescribed degree sequence well enough.
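The comparison described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the triangle stands in for "a motif", degree-preserving edge swaps are used as one common way to sample approximately from the set of simple graphs with a given degree sequence, the toy graph is hypothetical, and all function names are invented here. Only the Kolmogorov-Smirnov statistic (the largest gap between the two empirical distribution functions) is computed, not a p-value.

```python
import random

def configuration_model(degrees, rng):
    """Pair degree 'stubs' uniformly at random; self-loops and multi-edges may occur."""
    stubs = [v for v, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(stubs)
    return [(stubs[i], stubs[i + 1]) for i in range(0, len(stubs) - 1, 2)]

def erase_to_simple(edges):
    """Variant: drop self-loops and collapse multi-edges (degrees may shrink)."""
    return {(min(u, v), max(u, v)) for u, v in edges if u != v}

def edge_swap_sample(simple_edges, n_swaps, rng):
    """Randomise a simple graph by degree-preserving edge swaps (assumed sampler)."""
    edges = list(simple_edges)
    present = set(edges)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if rng.random() < 0.5:          # pick one of the two possible rewirings
            c, d = d, c
        e1 = (min(a, d), max(a, d))     # proposed replacement edges
        e2 = (min(c, b), max(c, b))
        if a == d or c == b or e1 in present or e2 in present:
            continue                    # reject: would create a loop or multi-edge
        present.difference_update((edges[i], edges[j]))
        present.update((e1, e2))
        edges[i], edges[j] = e1, e2
    return present

def count_triangles(simple_edges):
    """Count triangles; each one is found once per edge, hence the division by 3."""
    adj = {}
    for u, v in simple_edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return sum(len(adj[u] & adj[v]) for u, v in simple_edges) // 3

def ks_statistic(xs, ys):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between empirical CDFs."""
    def cdf(data, t):
        return sum(x <= t for x in data) / len(data)
    return max(abs(cdf(xs, t) - cdf(ys, t)) for t in sorted(set(xs) | set(ys)))

def compare_models(observed, n_samples, rng):
    """KS statistic between triangle counts of the erased model and the swap sampler."""
    n = 1 + max(max(e) for e in observed)
    degrees = [sum(v in e for e in observed) for v in range(n)]
    erased, swapped = [], []
    for _ in range(n_samples):
        multigraph = configuration_model(degrees, rng)
        erased.append(count_triangles(erase_to_simple(multigraph)))
        sample = edge_swap_sample(observed, 10 * len(observed), rng)
        swapped.append(count_triangles(sample))
    return ks_statistic(erased, swapped)

if __name__ == "__main__":
    # Hypothetical toy graph: a 10-node ring plus a hub at node 0.
    ring = {(min(i, (i + 1) % 10), max(i, (i + 1) % 10)) for i in range(10)}
    toy = ring | {(0, k) for k in range(2, 9)}
    print(compare_models(toy, 200, random.Random(0)))
```

A statistic near 0 would suggest the two samples of motif counts look alike, while a value near 1 suggests they come from clearly different distributions. The number of swaps per sample (ten times the edge count) is an arbitrary choice in this sketch, not a mixing-time guarantee.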