Exponential Speedup Over Locality in MPC with Optimal Memory

Alkida Balliu, Sebastian Brandt, Manuela Fischer, Rustam Latypov, Yannic Maus, Dennis Olivetti, Jara Uitto
{"title":"Exponential Speedup Over Locality in MPC with Optimal Memory","authors":"A. Balliu, S. Brandt, Manuela Fischer, R. Latypov, Yannic Maus, D. Olivetti, Jara Uitto","doi":"10.48550/arXiv.2208.09453","DOIUrl":null,"url":null,"abstract":"Locally Checkable Labeling ( LCL ) problems are graph problems in which a solution is correct if it satisfies some given constraints in the local neighborhood of each node. Example problems in this class include maximal matching, maximal independent set, and coloring problems. A successful line of research has been studying the complexities of LCL problems on paths/cycles, trees, and general graphs, providing many interesting results for the LOCAL model of distributed computing. In this work, we initiate the study of LCL problems in the low-space Massively Parallel Computation ( MPC ) model. In particular, on forests, we provide a method that, given the complexity of an LCL problem in the LOCAL model, automatically provides an exponentially faster algorithm for the low-space MPC setting that uses optimal global memory, that is, truly linear. While restricting to forests may seem to weaken the result, we emphasize that all known (conditional) lower bounds for the MPC setting are obtained by lifting lower bounds obtained in the distributed setting in tree-like networks (either forests or high girth graphs), and hence the problems that we study are challenging already on forests. Moreover, the most important technical feature of our algorithms is that they use optimal global memory, that is, memory linear in the number of edges of the graph. In contrast, most of the state-of-the-art algorithms use more than linear global memory. Further, they typically start with a dense graph, sparsify it, and then solve the problem on the residual graph, exploiting the relative increase in global memory. On forests, this is not possible, because the given graph is already as sparse as it can be, and using optimal memory requires new solutions. Graph Exponentiation. A reoccurring challenge for all regimes lies in respecting the linear global memory, which roughly means that on average, every node can use only a constant amount of memory. This is particularly unfortunate because almost all recent MPC results – and in particular all that achieve exponential speedups – rely on the memory-intense graph exponentiation technique [45]. Informally, this technique enables a node to gather its 2 k -hop neighborhood in k communication rounds. Doing this in parallel for every node in the graph results in a ∆ 2 k overhead in global memory. For this technique to be useful, k has to be ω (1), yielding a non-constant multiplicative increase in the global memory requirement. In order to use this technique but not violate linear global memory, we develop new solutions that are discussed in the following paragraphs. ▶ Lemma 9. The distance- k O (∆ 2 k ) -coloring problem on general graphs can be solved in the low-space MPC model with a O (log log ∗ n + log k ) -time deterministic algorithm, as long as ∆ k < n δ . The algorithm requires O (∆ k ) words of local and O ( m + n · ∆ k ) words of global memory. If k and ∆ are constants, the runtime reduces to O (log log ∗ n ) and we require O (1) words of local and O ( m + n ) words of global memory.","PeriodicalId":89463,"journal":{"name":"Proceedings of the ... 
International Symposium on High Performance Distributed Computing","volume":"22 1","pages":"9:1-9:21"},"PeriodicalIF":0.0000,"publicationDate":"2022-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the ... International Symposium on High Performance Distributed Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.48550/arXiv.2208.09453","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 5

Abstract

Locally Checkable Labeling (LCL) problems are graph problems in which a solution is correct if it satisfies some given constraints in the local neighborhood of each node. Example problems in this class include maximal matching, maximal independent set, and coloring problems. A successful line of research has been studying the complexities of LCL problems on paths/cycles, trees, and general graphs, providing many interesting results for the LOCAL model of distributed computing. In this work, we initiate the study of LCL problems in the low-space Massively Parallel Computation (MPC) model. In particular, on forests, we provide a method that, given the complexity of an LCL problem in the LOCAL model, automatically provides an exponentially faster algorithm for the low-space MPC setting that uses optimal global memory, that is, truly linear.

While restricting to forests may seem to weaken the result, we emphasize that all known (conditional) lower bounds for the MPC setting are obtained by lifting lower bounds obtained in the distributed setting on tree-like networks (either forests or high-girth graphs); hence the problems that we study are challenging already on forests. Moreover, the most important technical feature of our algorithms is that they use optimal global memory, that is, memory linear in the number of edges of the graph. In contrast, most state-of-the-art algorithms use more than linear global memory. Further, they typically start with a dense graph, sparsify it, and then solve the problem on the residual graph, exploiting the relative increase in global memory. On forests this is not possible, because the given graph is already as sparse as it can be, and using optimal memory requires new solutions.

Graph Exponentiation. A recurring challenge in all regimes lies in respecting the linear global memory bound, which roughly means that, on average, every node can use only a constant amount of memory. This is particularly unfortunate because almost all recent MPC results, and in particular all that achieve exponential speedups, rely on the memory-intensive graph exponentiation technique [45]. Informally, this technique enables a node to gather its 2^k-hop neighborhood in k communication rounds. Doing this in parallel for every node in the graph results in a Δ^(2^k) overhead in global memory. For this technique to be useful, k has to be ω(1), yielding a non-constant multiplicative increase in the global memory requirement. In order to use this technique without violating linear global memory, we develop new solutions that are discussed in the following paragraphs.

▶ Lemma 9. The distance-k O(Δ^(2k))-coloring problem on general graphs can be solved in the low-space MPC model by an O(log log* n + log k)-time deterministic algorithm, as long as Δ^k < n^δ. The algorithm requires O(Δ^k) words of local memory and O(m + n · Δ^k) words of global memory. If k and Δ are constants, the runtime reduces to O(log log* n) and we require O(1) words of local and O(m + n) words of global memory.
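To make the memory overhead concrete, here is a minimal single-machine sketch of the doubling idea behind graph exponentiation. It is a toy simulation under the assumption of a dictionary-of-sets graph representation; the function name `exponentiate` and the example path graph are illustrative only, and this is not the paper's MPC implementation.

```python
from collections import defaultdict

def exponentiate(adj, k):
    """adj: dict mapping node -> set of neighbours (undirected graph)."""
    # Round 0: every node knows its radius-1 ball (itself plus its neighbours).
    known = {v: {v} | adj[v] for v in adj}
    for _ in range(k):
        new_known = {}
        for v in adj:
            # Merge the balls of all nodes v currently knows; the known
            # radius doubles from r to 2r in one round.
            merged = set()
            for u in known[v]:
                merged |= known[u]
            new_known[v] = merged
        known = new_known
    return known  # known[v] = all nodes within distance 2^k of v

# Tiny example: the path 0-1-2-...-8; after k = 2 rounds node 0 knows its
# radius-4 ball {0, 1, 2, 3, 4}.
path = defaultdict(set)
for i in range(8):
    path[i].add(i + 1)
    path[i + 1].add(i)

balls = exponentiate(path, k=2)
print(sorted(balls[0]))  # [0, 1, 2, 3, 4]
```

On a graph of maximum degree Δ, the ball stored for a single node after k rounds can contain up to roughly Δ^(2^k) node identifiers, which is exactly the global-memory overhead the abstract refers to and the reason the technique cannot be applied naively under a linear global memory budget.

As a complement to Lemma 9, the following small checker spells out what a valid distance-k coloring is: any two distinct nodes at distance at most k must receive different colors. The function `is_distance_k_coloring` is an illustrative helper (not from the paper) and the usage line reuses the `path` graph from the sketch above.

```python
from collections import deque

def is_distance_k_coloring(adj, color, k):
    """Check that any two distinct nodes at distance <= k get different colors."""
    for s in adj:
        # BFS from s up to depth k.
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            if dist[u] == k:
                continue
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        # Every node within distance k of s must have a color different from s.
        if any(color[u] == color[s] for u in dist if u != s):
            return False
    return True

# Coloring node i with i mod 5 is a valid distance-2 coloring of the path:
# any two nodes within distance 2 get distinct colors.
print(is_distance_k_coloring(path, {i: i % 5 for i in range(9)}, 2))  # True
```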