Comparative performance of parallel join algorithms

J. Wolf, D. Dias, Philip S. Yu, John Turek
Published in: [1991] Proceedings of the First International Conference on Parallel and Distributed Information Systems
Publication date: 1991-12-01
DOI: 10.1109/PDIS.1991.183070 (https://doi.org/10.1109/PDIS.1991.183070)
Citations: 17

Abstract

The authors recently (1990, 1991) described two new join algorithms designed to address the data skew problem. These algorithms were based, respectively, on the traditional sort merge and hash join algorithms, and employed techniques borrowed from mathematical optimization theory. The current paper proposes significant improvements to both algorithms, increasing their effectiveness while simultaneously decreasing their execution times. It then focuses on the comparative performance of the improved algorithms and their more conventional sort merge and hash counterparts. The latter two are perfectly good algorithms except that they fail to deal with data skew. Both I/O- and CPU-bound configurations were examined. The new algorithms outperform their more conventional counterparts in the presence of just about any skew at all, dramatically so in cases of high skew.
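To make the data skew problem concrete, the following minimal Python sketch models how a conventional hash-partitioned parallel join distributes work when join-key frequencies are skewed. It is an illustration only and does not reproduce the algorithms from the paper: the Zipf-like frequency model, the cost model (tuples read plus matching pairs produced), the `key mod P` partitioning rule, and all function names are assumptions chosen for the example.

```python
def zipf_counts(num_keys, total, theta):
    """Deterministic Zipf-like multiplicities (assumed model, not from the paper).

    Key k gets a weight proportional to 1 / (k + 1) ** theta; theta = 0 is uniform.
    """
    weights = [1.0 / (k + 1) ** theta for k in range(num_keys)]
    scale = total / sum(weights)
    return [max(1, round(w * scale)) for w in weights]


def partition_loads(r_counts, s_counts, num_procs):
    """Per-processor work under plain hash partitioning (key mod num_procs).

    Work per key is modeled as tuples read from both relations plus the
    number of matching pairs the join produces for that key.
    """
    loads = [0] * num_procs
    for key, (r, s) in enumerate(zip(r_counts, s_counts)):
        loads[key % num_procs] += r + s + r * s
    return loads


def imbalance(loads):
    """Ratio of the busiest processor's work to the average work."""
    return max(loads) / (sum(loads) / len(loads))


if __name__ == "__main__":
    # Compare a uniform key distribution with increasingly skewed ones.
    for theta in (0.0, 0.5, 1.0):
        r = zipf_counts(1000, 10000, theta)
        s = zipf_counts(1000, 10000, theta)
        ratio = imbalance(partition_loads(r, s, 8))
        print(f"theta={theta}: max/avg load = {ratio:.2f}")
```

With uniform keys every processor receives the same work in this model, while under skew the processor that owns the most frequent key dominates the join's completion time. That imbalance is the situation the abstract says the conventional sort merge and hash join algorithms fail to handle.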