Fast Algorithm for Optimal Compression of Graphs
Yongwook Choi
Workshop on Analytic Algorithmics and Combinatorics, 2010-01-16
DOI: https://doi.org/10.1137/1.9781611973006.5
Citations: 3
Abstract
We consider the problem of finding an optimal description for general unlabeled graphs. Given a probability distribution on labeled graphs, we introduced in [4] a structural entropy as a lower bound for the lossless compression of such graphs. Specifically, we proved that the structural entropy for the Erdős-Rényi random graph, in which edges are added with probability $p$, is $\binom{n}{2} h(p) - n \log n + O(n)$, where $n$ is the number of vertices and $h(p) = -p \log p - (1 - p) \log(1 - p)$ is the entropy rate of a conventional memoryless binary source. In this paper, we prove the asymptotic equipartition property for such graphs. Then, we propose a faster compression algorithm that, with high probability, asymptotically achieves the structural entropy up to the first two leading terms. Our algorithm runs in $O(n + e)$ time on average, where $e$ is the number of edges. To prove its asymptotic optimality, we introduce binary trees that can be classified as in-between tries and digital search trees. We use analytic techniques such as generating functions, the Mellin transform, and poissonization to establish our findings. Our experimental results confirm the theoretical results and show the usefulness of our algorithm for real-world graphs such as the Internet, biological networks, and social networks.
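To make the two leading terms of the structural entropy concrete, the following is a minimal numerical sketch, not the compression algorithm proposed in the paper. It assumes base-2 logarithms (so all quantities are in bits) and ignores the O(n) remainder. The function names `binary_entropy`, `labeled_entropy`, and `structural_entropy_leading_terms` are illustrative and do not come from the paper.

```python
import math


def binary_entropy(p: float) -> float:
    """h(p) = -p log2(p) - (1 - p) log2(1 - p), in bits (assumes base-2 logs)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if p in (0.0, 1.0):
        # By convention 0 * log(0) = 0, so a deterministic source has zero entropy.
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def labeled_entropy(n: int, p: float) -> float:
    """Entropy of a labeled Erdos-Renyi graph: C(n, 2) independent edge bits."""
    return math.comb(n, 2) * binary_entropy(p)


def structural_entropy_leading_terms(n: int, p: float) -> float:
    """First two terms C(n, 2) h(p) - n log2(n); the O(n) remainder is dropped."""
    return labeled_entropy(n, p) - n * math.log2(n)


if __name__ == "__main__":
    n, p = 1024, 0.5
    labeled = labeled_entropy(n, p)
    structural = structural_entropy_leading_terms(n, p)
    print(f"labeled entropy:            {labeled:.0f} bits")
    print(f"structural (leading terms): {structural:.0f} bits")
    print(f"approximate saving:         {labeled - structural:.0f} bits")
```

The difference between the two quantities, about n log n bits, is the cost of the vertex labels, which an unlabeled (structural) description does not need to store.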