保持查询的图压缩

W. Fan, Jianzhong Li, Xin Wang, Yinghui Wu
{"title":"保持查询的图压缩","authors":"W. Fan, Jianzhong Li, Xin Wang, Yinghui Wu","doi":"10.1145/2213836.2213855","DOIUrl":null,"url":null,"abstract":"It is common to find graphs with millions of nodes and billions of edges in, e.g., social networks. Queries on such graphs are often prohibitively expensive. These motivate us to propose query preserving graph compression, to compress graphs relative to a class Λ of queries of users' choice. We compute a small Gr from a graph G such that (a) for any query Q Ε Λ Q, Q(G) = Q'(Gr), where Q' Ε Λ can be efficiently computed from Q; and (b) any algorithm for computing Q(G) can be directly applied to evaluating Q' on Gr as is. That is, while we cannot lower the complexity of evaluating graph queries, we reduce data graphs while preserving the answers to all the queries in Λ. To verify the effectiveness of this approach, (1) we develop compression strategies for two classes of queries: reachability and graph pattern queries via (bounded) simulation. We show that graphs can be efficiently compressed via a reachability equivalence relation and graph bisimulation, respectively, while reserving query answers. (2) We provide techniques for aintaining compressed graph Gr in response to changes ΔG to the original graph G. We show that the incremental maintenance problems are unbounded for the two lasses of queries, i.e., their costs are not a function of the size of ΔG and changes in Gr. Nevertheless, we develop incremental algorithms that depend only on ΔG and Gr, independent of G, i.e., we do not have to decompress Gr to propagate the changes. (3) Using real-life data, we experimentally verify that our compression techniques could reduce graphs in average by 95% for reachability and 57% for graph pattern matching, and that our incremental maintenance algorithms are efficient.","PeriodicalId":212616,"journal":{"name":"Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data","volume":"122 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"201","resultStr":"{\"title\":\"Query preserving graph compression\",\"authors\":\"W. Fan, Jianzhong Li, Xin Wang, Yinghui Wu\",\"doi\":\"10.1145/2213836.2213855\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"It is common to find graphs with millions of nodes and billions of edges in, e.g., social networks. Queries on such graphs are often prohibitively expensive. These motivate us to propose query preserving graph compression, to compress graphs relative to a class Λ of queries of users' choice. We compute a small Gr from a graph G such that (a) for any query Q Ε Λ Q, Q(G) = Q'(Gr), where Q' Ε Λ can be efficiently computed from Q; and (b) any algorithm for computing Q(G) can be directly applied to evaluating Q' on Gr as is. That is, while we cannot lower the complexity of evaluating graph queries, we reduce data graphs while preserving the answers to all the queries in Λ. To verify the effectiveness of this approach, (1) we develop compression strategies for two classes of queries: reachability and graph pattern queries via (bounded) simulation. We show that graphs can be efficiently compressed via a reachability equivalence relation and graph bisimulation, respectively, while reserving query answers. (2) We provide techniques for aintaining compressed graph Gr in response to changes ΔG to the original graph G. We show that the incremental maintenance problems are unbounded for the two lasses of queries, i.e., their costs are not a function of the size of ΔG and changes in Gr. Nevertheless, we develop incremental algorithms that depend only on ΔG and Gr, independent of G, i.e., we do not have to decompress Gr to propagate the changes. (3) Using real-life data, we experimentally verify that our compression techniques could reduce graphs in average by 95% for reachability and 57% for graph pattern matching, and that our incremental maintenance algorithms are efficient.\",\"PeriodicalId\":212616,\"journal\":{\"name\":\"Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data\",\"volume\":\"122 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2012-05-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"201\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2213836.2213855\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2213836.2213855","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 201

摘要

在社交网络中,发现具有数百万个节点和数十亿条边的图是很常见的。对此类图的查询通常非常昂贵。这促使我们提出保留查询的图压缩,将图压缩到用户选择的查询类Λ。我们从图G中计算一个小的Gr,使得(a)对于任何查询Q Ε Λ Q, Q(G) = Q'(Gr),其中Q' Ε Λ可以有效地从Q计算;(b)任何计算Q(G)的算法都可以直接用于求Gr上的Q'。也就是说,虽然我们不能降低评估图查询的复杂性,但我们减少了数据图,同时保留了Λ中所有查询的答案。为了验证这种方法的有效性,(1)我们通过(有界)模拟开发了两类查询的压缩策略:可达性查询和图模式查询。在保留查询答案的前提下,分别通过可达性等价关系和图双模拟对图进行有效压缩。(2)我们提供了维护压缩图Gr以响应原始图G ΔG变化的技术。我们表明,增量维护问题对于这两类查询是无界的,即它们的成本不是ΔG大小和Gr变化的函数。然而,我们开发了仅依赖于ΔG和Gr的增量算法,独立于G,即我们不必解压缩Gr来传播变化。(3)使用实际数据,我们实验验证了我们的压缩技术可以将图的可达性平均降低95%,图的模式匹配平均降低57%,并且我们的增量维护算法是有效的。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Query preserving graph compression
It is common to find graphs with millions of nodes and billions of edges in, e.g., social networks. Queries on such graphs are often prohibitively expensive. These motivate us to propose query preserving graph compression, to compress graphs relative to a class Λ of queries of users' choice. We compute a small Gr from a graph G such that (a) for any query Q Ε Λ Q, Q(G) = Q'(Gr), where Q' Ε Λ can be efficiently computed from Q; and (b) any algorithm for computing Q(G) can be directly applied to evaluating Q' on Gr as is. That is, while we cannot lower the complexity of evaluating graph queries, we reduce data graphs while preserving the answers to all the queries in Λ. To verify the effectiveness of this approach, (1) we develop compression strategies for two classes of queries: reachability and graph pattern queries via (bounded) simulation. We show that graphs can be efficiently compressed via a reachability equivalence relation and graph bisimulation, respectively, while reserving query answers. (2) We provide techniques for aintaining compressed graph Gr in response to changes ΔG to the original graph G. We show that the incremental maintenance problems are unbounded for the two lasses of queries, i.e., their costs are not a function of the size of ΔG and changes in Gr. Nevertheless, we develop incremental algorithms that depend only on ΔG and Gr, independent of G, i.e., we do not have to decompress Gr to propagate the changes. (3) Using real-life data, we experimentally verify that our compression techniques could reduce graphs in average by 95% for reachability and 57% for graph pattern matching, and that our incremental maintenance algorithms are efficient.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信