Performing External Join Operator on PostgreSQL with Data Transfer Approach
Ryota Takizawa, H. Kawashima, Ryuya Mitsuhashi, O. Tatebe
Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, 2018-01-28
DOI: 10.1145/3149457.3149480 (https://doi.org/10.1145/3149457.3149480)
Abstract
With the development of sensing devices, the volume of data that people manage has been increasing rapidly. Relational database management systems (RDBMSs) play a key role in managing such large volumes of data. An RDBMS models real-world data as n-ary relational tables. The join operator is one of the most important relational operators, and its acceleration has been studied widely and deeply. How can an RDBMS provide such an efficient join operator? Improving join performance has been studied in depth for a decade, and many techniques have already been proposed. The problem we face is how to actually use these techniques in real RDBMSs. We propose implementing an efficient join technique through a data transfer approach. The approach places a hook point inside the RDBMS internals, pulls data streams from the operator pipeline in the RDBMS, applies our original join operator to the data, and finally returns the result to the operator pipeline. In our experiment, the proposed method achieved a 1.42x speedup compared with PostgreSQL. Our code is available on GitHub.
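The data transfer steps named in the abstract (hook point, pull the input streams, apply an external join, return the result to the pipeline) can be illustrated with a toy iterator-style pipeline. This is a minimal sketch only and is not the authors' PostgreSQL implementation: every name here (`scan`, `join_node`, `builtin_nested_loop_join`, `external_hash_join`, `run_query`) is hypothetical, rows are plain tuples, and a simple hash join stands in for the external operator.

```python
from collections import defaultdict
from typing import Callable, Iterable, Iterator, List, Optional, Tuple

Row = Tuple  # in this toy model a row is just a tuple
JoinFn = Callable[[List[Row], List[Row], int, int], List[Row]]


def scan(rows: Iterable[Row]) -> Iterator[Row]:
    """Leaf operator: yields rows one at a time, like a table scan."""
    for row in rows:
        yield row


def builtin_nested_loop_join(outer: Iterator[Row], inner: Iterator[Row],
                             outer_key: int, inner_key: int) -> Iterator[Row]:
    """Stand-in for the engine's own join operator (assumption)."""
    inner_rows = list(inner)
    for o in outer:
        for i in inner_rows:
            if o[outer_key] == i[inner_key]:
                yield o + i


def external_hash_join(outer_rows: List[Row], inner_rows: List[Row],
                       outer_key: int, inner_key: int) -> List[Row]:
    """Stand-in for an external join that works on transferred row batches."""
    table = defaultdict(list)
    for i in inner_rows:
        table[i[inner_key]].append(i)
    return [o + i for o in outer_rows for i in table.get(o[outer_key], [])]


def join_node(outer: Iterator[Row], inner: Iterator[Row],
              outer_key: int, inner_key: int,
              hook: Optional[JoinFn] = None) -> Iterator[Row]:
    """Join operator with a hypothetical hook point."""
    if hook is None:
        # No hook installed: run the built-in operator inside the pipeline.
        yield from builtin_nested_loop_join(outer, inner, outer_key, inner_key)
        return
    # 1. Pull both input streams out of the operator pipeline.
    outer_rows, inner_rows = list(outer), list(inner)
    # 2. Apply the external join operator to the transferred data.
    result = hook(outer_rows, inner_rows, outer_key, inner_key)
    # 3. Return the result rows to the operator pipeline.
    yield from result


def run_query(hook: Optional[JoinFn] = None) -> List[Row]:
    """Join two small illustrative tables on their first column."""
    orders = [(1, "pen"), (2, "ink"), (1, "cap")]
    users = [(1, "ann"), (2, "bob")]
    plan = join_node(scan(orders), scan(users), 0, 0, hook=hook)
    return sorted(plan)
```

Calling `run_query()` and `run_query(external_hash_join)` should return the same rows, since the hook changes only where the join is computed and not what the rest of the plan sees. The sketch says nothing about where the reported speedup comes from.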