SBG-sketch: a self-balanced sketch for labeled-graph stream summarization
Authors: Mohamed S. Hassan, Bruno Ribeiro, Walid G. Aref
Published in: Proceedings of the 30th International Conference on Scientific and Statistical Database Management
Publication date: 2017-09-20
DOI: https://doi.org/10.1145/3221269.3223030
Citations: 6
Abstract
Applications in various domains rely on processing graph streams, e.g., communication logs of a cloud-troubleshooting system, road-network traffic updates, and interactions on a social network. A labeled-graph stream refers to a sequence of streamed edges of distinct types that form a labeled graph. Due to the large volume and high velocity of these streams, it is often more practical to incrementally build a lossy-compressed version of the graph, and use this lossy version to approximately evaluate graph queries. Challenges arise when the queries are unknown in advance but are associated with filtering predicates based on edge labels. Surprisingly common, and especially challenging, are labeled-graph streams that have highly skewed and unpredictable label distributions. This paper introduces Self-Balanced Graph Sketch (SBG-Sketch, for short), a graph sketch for summarizing and querying labeled-graph streams that copes with highly imbalanced labels. SBG-Sketch maintains synopses of both the edge attributes and the topology of the streamed graph. SBG-Sketch allows efficient processing of traversal queries, e.g., reachability queries. Experimental results over a variety of real labeled-graph streams show that SBG-Sketch reduces the estimation errors of state-of-the-art methods by up to 99%.
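To make the idea of a lossy summary with label-filtered queries concrete, the following is a minimal sketch in Python. It is not the SBG-Sketch algorithm: it omits the self-balancing mechanism entirely and only illustrates the general approach the abstract describes, namely hashing vertices into a fixed number of buckets, keeping per-label counters, and answering edge-weight and reachability queries approximately. The class name `LabeledGraphSketch`, its parameters, and the example labels are all hypothetical.

```python
import hashlib
from collections import defaultdict


class LabeledGraphSketch:
    """Toy lossy summary of a labeled-graph stream (NOT SBG-Sketch itself).

    Assumption: vertices are hashed into `width` buckets and every edge
    label owns a sparse width x width counter matrix. Collisions make
    answers approximate: weights can be overestimated, never underestimated.
    """

    def __init__(self, width: int = 64, seed: int = 0):
        self.width = width
        self.seed = seed
        # label -> {(src_bucket, dst_bucket): accumulated weight}
        self.cells = defaultdict(lambda: defaultdict(int))

    def _bucket(self, vertex: str) -> int:
        # Deterministic hash of the vertex identifier into [0, width).
        data = f"{self.seed}:{vertex}".encode()
        digest = hashlib.blake2b(data, digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.width

    def add_edge(self, src: str, dst: str, label: str, weight: int = 1) -> None:
        # Incremental update: one counter increment per streamed edge.
        key = (self._bucket(src), self._bucket(dst))
        self.cells[label][key] += weight

    def edge_weight(self, src: str, dst: str, labels) -> int:
        # Label predicate: sum only the counters of the requested labels.
        key = (self._bucket(src), self._bucket(dst))
        return sum(self.cells[l].get(key, 0) for l in labels if l in self.cells)

    def reachable(self, src: str, dst: str, labels) -> bool:
        # Depth-first search over the bucket graph restricted to `labels`.
        # May report false positives (collisions), never false negatives.
        start, goal = self._bucket(src), self._bucket(dst)
        adjacency = defaultdict(set)
        for l in labels:
            for (u, v) in self.cells.get(l, {}):
                adjacency[u].add(v)
        seen, stack = {start}, [start]
        while stack:
            u = stack.pop()
            if u == goal:
                return True
            for v in adjacency[u] - seen:
                seen.add(v)
                stack.append(v)
        return False


if __name__ == "__main__":
    sketch = LabeledGraphSketch(width=64)
    sketch.add_edge("alice", "bob", "call")
    sketch.add_edge("bob", "carol", "call")
    sketch.add_edge("alice", "carol", "sms", weight=5)
    print(sketch.edge_weight("alice", "carol", {"sms"}))
    print(sketch.reachable("alice", "carol", {"call"}))
```

In this simplified layout every label gets its own counters, so a label that carries most of the stream accumulates most of the collisions while rare labels leave their space nearly unused. That imbalance under skewed label distributions is the problem the abstract says SBG-Sketch is designed to cope with.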