Performing External Join Operator on PostgreSQL with Data Transfer Approach

Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region. Pub Date: 2018-01-28. DOI: 10.1145/3149457.3149480

Ryota Takizawa, H. Kawashima, Ryuya Mitsuhashi, O. Tatebe



Abstract

With the development of sensing devices, the volume of data that people manage has been increasing rapidly. Relational database management systems (RDBMSs) play a key role in managing data at this scale. An RDBMS models real-world data as n-ary relational tables. The join operator is one of the most important relational operators, and its acceleration has been studied widely and deeply. How can an RDBMS provide such an efficient join operator? Improving join performance has been studied in depth for a decade, and many techniques have already been proposed. The problem we face is how to actually use these techniques in real RDBMSs. We propose implementing an efficient join technique with the data transfer approach. The approach places a hook point inside the RDBMS internals, pulls data streams from the RDBMS operator pipeline, applies our own join operator to the data, and finally returns the result to the operator pipeline. In our experiment, the proposed method achieved a 1.42x speedup compared with PostgreSQL. Our code is available on GitHub.
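The data transfer approach described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and does not use any PostgreSQL API: it assumes a simplified iterator-style operator pipeline, and every name in it (`scan`, `external_hash_join`, `join_hook`) is hypothetical. A plain in-memory hash join stands in for the external join operator, whose actual algorithm the abstract does not specify.

```python
from collections import defaultdict
from typing import Callable, Iterable, Iterator, List, Tuple

Row = Tuple  # one row is a tuple of column values


def scan(rows: Iterable[Row]) -> Iterator[Row]:
    """Hypothetical stand-in for a scan operator in the pipeline."""
    for row in rows:
        yield row


def external_hash_join(left: List[Row], right: List[Row],
                       left_key: int, right_key: int) -> List[Row]:
    """Hypothetical stand-in for the external join operator.

    Builds a hash table on the left input and probes it with the right.
    """
    table = defaultdict(list)
    for l_row in left:
        table[l_row[left_key]].append(l_row)
    out: List[Row] = []
    for r_row in right:
        for l_row in table.get(r_row[right_key], []):
            out.append(l_row + r_row)
    return out


def join_hook(left_child: Iterator[Row], right_child: Iterator[Row],
              left_key: int, right_key: int,
              external_join: Callable = external_hash_join) -> Iterator[Row]:
    """Hypothetical hook point replacing the built-in join operator."""
    # 1. Pull the data streams out of the operator pipeline.
    left_rows = list(left_child)
    right_rows = list(right_child)
    # 2. Apply the external join operator to the transferred data.
    joined = external_join(left_rows, right_rows, left_key, right_key)
    # 3. Return the result to the pipeline, one row at a time.
    yield from joined


users = [(1, "ann"), (3, "bob")]
orders = [(1, "pen"), (2, "ink"), (1, "cap")]
result = list(join_hook(scan(users), scan(orders), 0, 0))
print(result)
```

Because the hook consumes and produces rows through the same iterator interface as its neighbours, the operators above and below it in this sketch need no changes. That mirrors the abstract's idea of returning results to the existing operator pipeline.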
