Counting Simplices in Hypergraph Streams
Amit Chakrabarti, Themistoklis K. Haris
DOI: 10.4230/LIPIcs.ESA.2022.32 (https://doi.org/10.4230/LIPIcs.ESA.2022.32)
Publication date: 2021-12-21
Citation count: 1
Abstract
We consider the problem of space-efficiently estimating the number of simplices in a hypergraph stream. This is the most natural hypergraph generalization of the highly-studied problem of estimating the number of triangles in a graph stream. Our input is a $k$-uniform hypergraph $H$ with $n$ vertices and $m$ hyperedges, each hyperedge being a $k$-sized subset of vertices. A $k$-simplex in $H$ is a subhypergraph on $k+1$ vertices $X$ such that all $k+1$ possible hyperedges among $X$ exist in $H$. The goal is to process the hyperedges of $H$, which arrive in an arbitrary order as a data stream, and compute a good estimate of $T_k(H)$, the number of $k$-simplices in $H$. We design a suite of algorithms for this problem. As with triangle-counting in graphs (which is the special case $k = 2$), sublinear space is achievable but only under a promise of the form $T_k(H) \ge T$. Under such a promise, our algorithms use at most four passes and together imply a space bound of $O(\cdots)$ for each fixed $k \ge 3$, in order to guarantee an estimate within $(1 \pm \varepsilon)\,T_k(H)$ with probability $\ge 1 - \delta$. We also give a simpler 1-pass algorithm that achieves $O\bigl(\varepsilon^{-2} \log \delta^{-1} \log n \cdot (m/T)\,(\Delta_E + \Delta_V^{1-1/k})\bigr)$ space, where $\Delta_E$ (respectively, $\Delta_V$) denotes the maximum number of $k$-simplices that share a hyperedge (respectively, a vertex), which generalizes a previous result for the $k = 2$ case. We complement these algorithmic results with space lower bounds of the form $\Omega(\varepsilon^{-2})$, $\Omega(m^{1+1/k}/T)$, $\Omega(m/T^{1-1/k})$ and $\Omega(m \Delta_V^{1/k}/T)$ for multi-pass algorithms and $\Omega(m \Delta_E/T)$ for 1-pass algorithms, which show that some of the dependencies on parameters in our upper bounds are nearly tight. Our techniques extend and generalize several different ideas previously developed for triangle counting in graphs, using appropriate innovations to handle the more complicated combinatorics of hypergraphs.

2012 ACM Subject Classification: Theory of computation → Sketching and sampling
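To make the definitions of a $k$-simplex, $T_k(H)$, $\Delta_E$ and $\Delta_V$ concrete, the following Python sketch computes them exactly by brute force on a small hypergraph held entirely in memory. It is not one of the streaming algorithms described in the abstract, and it makes no attempt at sublinear space; it is only a reference implementation of the quantities being estimated. The function names `find_simplices` and `max_sharing` are hypothetical and chosen for this illustration.

```python
from collections import Counter
from itertools import combinations


def find_simplices(hyperedges, k):
    """Return the set of k-simplices of a k-uniform hypergraph (offline, exact).

    A k-simplex is a set X of k+1 vertices such that every k-subset of X
    is a hyperedge. This is a brute-force reference, not a streaming method.
    """
    edges = {frozenset(e) for e in hyperedges}
    if any(len(e) != k for e in edges):
        raise ValueError("every hyperedge must contain exactly k vertices")
    vertices = set().union(*edges) if edges else set()

    simplices = set()
    for e in edges:
        # Any simplex containing e has vertex set e plus one extra vertex.
        for v in vertices - e:
            x = e | {v}  # frozenset of k+1 vertices
            if x in simplices:
                continue
            if all(frozenset(f) in edges for f in combinations(x, k)):
                simplices.add(x)
    return simplices


def max_sharing(simplices, k):
    """Return (Delta_E, Delta_V): the maximum number of simplices that share
    a single hyperedge, and the maximum number that share a single vertex."""
    per_edge = Counter(frozenset(f) for x in simplices for f in combinations(x, k))
    per_vertex = Counter(v for x in simplices for v in x)
    return max(per_edge.values(), default=0), max(per_vertex.values(), default=0)


# Two 3-simplices (tetrahedra) glued along the hyperedge {1, 2, 3}.
example = [
    (1, 2, 3), (1, 2, 4), (1, 3, 4), (2, 3, 4),
    (1, 2, 5), (1, 3, 5), (2, 3, 5),
]
found = find_simplices(example, k=3)
print(len(found), max_sharing(found, k=3))
```

For $k = 2$ the same function counts triangles in an ordinary graph, which is the special case mentioned in the abstract.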