Just Join for Parallel Ordered Sets

G. Blelloch, Daniel Ferizovic, Yihan Sun
{"title":"Just Join for Parallel Ordered Sets","authors":"G. Blelloch, Daniel Ferizovic, Yihan Sun","doi":"10.1145/2935764.2935768","DOIUrl":null,"url":null,"abstract":"Ordered sets (and maps when data is associated with each key) are one of the most important and useful data types. The set-set functions union, intersection and difference are particularly useful in certain applications. Brown and Tarjan first described an algorithm for these functions, based on 2-3 trees, that meet the optimal Θ(m log (n/m+1)) time bounds in the comparison model (n and m ≤ n are the input sizes). Later Adams showed very elegant algorithms for the functions, and others, based on weight-balanced trees. They only require a single function that is specific to the balancing scheme---a function that joins two balanced trees---and hence can be applied to other balancing schemes. Furthermore the algorithms are naturally parallel. However, in the twenty-four years since, no one has shown that the algorithms, sequential or parallel are asymptotically work optimal. In this paper we show that Adams' algorithms are both work efficient and highly parallel (polylog span) across four different balancing schemes---AVL trees, red-black trees, weight balanced trees and treaps. To do this we use careful, but simple, algorithms for Join that maintain certain invariants, and our proof is (mostly) generic across the schemes. To understand how the algorithms perform in practice we have also implemented them (all code except Join is generic across the balancing schemes). Interestingly the implementations on all four balancing schemes and three set functions perform similarly in time and speedup (more than 45x on 64 cores). We also compare the performance of our implementation to other existing libraries and algorithms.","PeriodicalId":346939,"journal":{"name":"Proceedings of the 28th ACM Symposium on Parallelism in Algorithms and Architectures","volume":"13 2 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"62","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 28th ACM Symposium on Parallelism in Algorithms and Architectures","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2935764.2935768","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 62

Abstract

Ordered sets (and maps when data is associated with each key) are one of the most important and useful data types. The set-set functions union, intersection and difference are particularly useful in certain applications. Brown and Tarjan first described an algorithm for these functions, based on 2-3 trees, that meet the optimal Θ(m log (n/m+1)) time bounds in the comparison model (n and m ≤ n are the input sizes). Later Adams showed very elegant algorithms for the functions, and others, based on weight-balanced trees. They only require a single function that is specific to the balancing scheme---a function that joins two balanced trees---and hence can be applied to other balancing schemes. Furthermore the algorithms are naturally parallel. However, in the twenty-four years since, no one has shown that the algorithms, sequential or parallel are asymptotically work optimal. In this paper we show that Adams' algorithms are both work efficient and highly parallel (polylog span) across four different balancing schemes---AVL trees, red-black trees, weight balanced trees and treaps. To do this we use careful, but simple, algorithms for Join that maintain certain invariants, and our proof is (mostly) generic across the schemes. To understand how the algorithms perform in practice we have also implemented them (all code except Join is generic across the balancing schemes). Interestingly the implementations on all four balancing schemes and three set functions perform similarly in time and speedup (more than 45x on 64 cores). We also compare the performance of our implementation to other existing libraries and algorithms.
平行有序集的连接
有序集(以及当数据与每个键相关联时的映射)是最重要和最有用的数据类型之一。集集函数并、交、差在某些应用中特别有用。Brown和Tarjan首先描述了这些函数的算法,基于2-3棵树,满足比较模型中最优Θ(m log (n/m+1))时间界限(n和m≤n是输入大小)。后来,亚当斯展示了非常优雅的函数算法,以及其他基于权重平衡树的算法。它们只需要一个特定于平衡方案的函数——一个连接两个平衡树的函数——因此可以应用于其他平衡方案。此外,算法是自然并行的。然而,从那以后的24年里,没有人表明,顺序或并行算法是渐近最优的。在本文中,我们证明了Adams算法在四种不同的平衡方案(AVL树、红黑树、权重平衡树和树堆)上既高效又高度并行(多对数跨度)。为此,我们使用谨慎但简单的Join算法来维护某些不变量,并且我们的证明(大部分)是跨方案的泛型。为了理解算法在实践中的表现,我们也实现了它们(除了Join之外的所有代码都是跨平衡方案的通用代码)。有趣的是,所有四个平衡方案和三个集合函数的实现在时间和加速方面表现相似(在64核上超过45倍)。我们还将实现的性能与其他现有库和算法进行了比较。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信