Two-Attribute Skew Free, Isolated CP Theorem, and Massively Parallel Joins
Miao Qiao, Yufei Tao
Proceedings of the 40th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, 2021-06-20
DOI: 10.1145/3452021.3458321 (https://doi.org/10.1145/3452021.3458321)
Citations: 3
Abstract
This paper presents an algorithm to process a multi-way join with load $\tilde{O}(n/p^{2/(\alpha\phi)})$ under the MPC model, where $n$ is the number of tuples in the input relations, $\alpha$ the maximum arity of those relations, $p$ the number of machines, and $\phi$ a newly introduced parameter called the generalized vertex packing number. The algorithm builds on two new findings. The first is a two-attribute skew free technique to partition the join result for parallel computation. The second is an isolated cartesian product theorem, which provides fresh graph-theoretic insights on joins with $\alpha \ge 3$ and generalizes an existing theorem on $\alpha = 2$.
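To get a feel for how the load bound scales, the following minimal sketch evaluates it numerically, reading the bound as $n/p^{2/(\alpha\phi)}$ and dropping the polylogarithmic factors hidden by the tilde. The function name `load_bound` and the sample values of `n`, `p`, `alpha`, and `phi` are illustrative assumptions, not taken from the paper; in particular, the `phi` values are hypothetical and are not computed from any actual join query.

```python
def load_bound(n: int, p: int, alpha: int, phi: float) -> float:
    """Evaluate n / p^(2 / (alpha * phi)).

    This ignores the polylog factors hidden in the tilde-O notation, so it
    only illustrates the scaling in p, not an exact per-machine load.
    """
    if n < 0 or p < 1 or alpha < 1 or phi <= 0:
        raise ValueError("expected n >= 0, p >= 1, alpha >= 1, phi > 0")
    exponent = 2.0 / (alpha * phi)
    return n / (p ** exponent)


if __name__ == "__main__":
    # Hypothetical inputs: one million tuples on 64 machines, binary relations.
    n, p, alpha = 1_000_000, 64, 2
    for phi in (1.0, 1.5, 2.0):  # assumed values, for illustration only
        print(f"phi={phi}: load ~ {load_bound(n, p, alpha, phi):.1f}")
```

Under this reading, a larger product $\alpha\phi$ shrinks the exponent on $p$, so adding machines reduces the load more slowly.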