Data Blocks: Hybrid OLTP and OLAP on Compressed Storage using both Vectorization and Compilation

Proceedings of the 2016 International Conference on Management of Data. Pub Date: 2016-06-14. DOI: 10.1145/2882903.2882925

Harald Lang, Tobias Mühlbauer, Florian Funke, P. Boncz, Thomas Neumann, A. Kemper


Citations: 142

Abstract

This work aims to reduce the main-memory footprint of high-performance hybrid OLTP & OLAP databases while retaining high query performance and transactional throughput. For this purpose, an innovative compressed columnar storage format for cold data, called Data Blocks, is introduced. Data Blocks further incorporate a new light-weight index structure called Positional SMA that narrows scan ranges within Data Blocks even if the entire block cannot be ruled out. To achieve the highest OLTP performance, the compression schemes of Data Blocks are very light-weight, such that OLTP transactions can still quickly access individual tuples. This sets our storage scheme apart from those used in specialized analytical databases, where data must usually be bit-unpacked. Up to now, high-performance analytical systems have used either vectorized query execution or just-in-time (JIT) query compilation. The fine-grained adaptivity of Data Blocks necessitates integrating the best features of each approach: an interpreted vectorized scan subsystem feeds into JIT-compiled query pipelines. Experimental evaluation of HyPer, our full-fledged hybrid OLTP & OLAP database system, shows that Data Blocks accelerate performance on a variety of query workloads while retaining high transaction throughput.
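
The two storage ideas in the abstract, byte-aligned light-weight compression with cheap point access and a positional summary that narrows the scan range inside a block, can be illustrated with a toy sketch. This is not HyPer's implementation: the class `ToyBlock`, its methods, the bucket count, and the bucketing function are all hypothetical, chosen only to make the idea concrete. Codes are stored as `value - min` in the narrowest unsigned integer type that fits, and each value bucket remembers the range of positions where its values occur.

```python
from array import array


class ToyBlock:
    """One immutable integer column chunk (hypothetical layout, not HyPer's)."""

    def __init__(self, values, num_buckets=16):
        # Block-level min/max summary; values must be non-empty.
        self.lo, self.hi = min(values), max(values)
        self.num_buckets = num_buckets
        # Byte-aligned codes: store value - min in the narrowest unsigned
        # integer type that fits, so reading a value needs no bit-unpacking.
        max_code = self.hi - self.lo
        typecode = next(t for t in "BHIQ" if max_code < 256 ** array(t).itemsize)
        self.codes = array(typecode, (v - self.lo for v in values))
        # Positional summary (simplified assumption): for each value bucket,
        # the [first, last + 1) positions at which a value of that bucket occurs.
        self.ranges = [None] * num_buckets
        for pos, code in enumerate(self.codes):
            b = self._bucket(code)
            first = self.ranges[b][0] if self.ranges[b] else pos
            self.ranges[b] = (first, pos + 1)

    def _bucket(self, code):
        # Map a code in [0, hi - lo] to a bucket index in [0, num_buckets).
        return code * self.num_buckets // (self.hi - self.lo + 1)

    def get(self, pos):
        """Point access: one array read plus one addition."""
        return self.lo + self.codes[pos]

    def scan_range(self, key):
        """Return the [start, end) positions that may contain key."""
        if key < self.lo or key > self.hi:
            return (0, 0)  # min/max summary rules out the whole block
        return self.ranges[self._bucket(key - self.lo)] or (0, 0)

    def find(self, key):
        """Scan only the narrowed range, comparing compressed codes."""
        start, end = self.scan_range(key)
        code = key - self.lo
        return [p for p in range(start, end) if self.codes[p] == code]


block = ToyBlock([5, 5, 7, 9, 9, 9, 100, 101, 250, 250])
print(block.scan_range(100))  # (6, 8): only two of ten positions are scanned
print(block.get(6))           # 100
```

In this sketch the block cannot be skipped for the key 100 because it lies between the minimum and maximum, yet the positional summary still restricts the scan to two positions. The summary pays off most when similar values are clustered by position; on randomly ordered data each bucket's range tends to cover most of the block.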

