Data Blocks: Hybrid OLTP and OLAP on Compressed Storage using both Vectorization and Compilation

Harald Lang, Tobias Mühlbauer, Florian Funke, P. Boncz, Thomas Neumann, A. Kemper
Published in: Proceedings of the 2016 International Conference on Management of Data, 2016-06-14
DOI: 10.1145/2882903.2882925 (https://doi.org/10.1145/2882903.2882925)
Citations: 142

Abstract

This work aims to reduce the main-memory footprint of high-performance hybrid OLTP & OLAP databases while retaining high query performance and transactional throughput. For this purpose, an innovative compressed columnar storage format for cold data, called Data Blocks, is introduced. Data Blocks further incorporate a new lightweight index structure called Positional SMA that narrows scan ranges within Data Blocks even if the entire block cannot be ruled out. To achieve the highest OLTP performance, the compression schemes of Data Blocks are very lightweight, such that OLTP transactions can still quickly access individual tuples. This sets our storage scheme apart from those used in specialized analytical databases, where data must usually be bit-unpacked. Up to now, high-performance analytical systems have used either vectorized query execution or just-in-time (JIT) query compilation. The fine-grained adaptivity of Data Blocks necessitates integrating the best features of each approach: an interpreted vectorized scan subsystem feeds into JIT-compiled query pipelines. Experimental evaluation of HyPer, our full-fledged hybrid OLTP & OLAP database system, shows that Data Blocks accelerate performance on a variety of query workloads while retaining high transaction throughput.
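
The abstract does not spell out the storage layout, so the following is only a minimal sketch of two ideas it mentions: byte-aligned compression that still allows direct access to a single tuple (no bit-unpacking), and min/max summaries that let a scan rule out whole blocks. The names `Block`, `compress`, and `scan_between` are hypothetical, the frame-of-reference encoding is an assumed stand-in for the paper's compression schemes, and Positional SMA (which narrows ranges inside a block) is not modeled here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Block:
    """Hypothetical compressed block for one integer column (not HyPer's actual layout)."""
    min_val: int   # block-level summary: smallest value in the block
    max_val: int   # block-level summary: largest value in the block
    width: int     # bytes per stored code: 1, 2, 4 or 8
    codes: bytes   # byte-aligned offsets from min_val, one per tuple

    def __len__(self) -> int:
        return len(self.codes) // self.width

    def get(self, pos: int) -> int:
        """Point access: decode one value without decoding its neighbors."""
        start = pos * self.width
        code = int.from_bytes(self.codes[start:start + self.width], "little")
        return self.min_val + code


def compress(values: List[int]) -> Block:
    """Encode a non-empty list as offsets from its minimum, using the narrowest byte width."""
    lo, hi = min(values), max(values)
    span = hi - lo
    # Assumes span fits in 8 bytes; otherwise next() raises StopIteration.
    width = next(w for w in (1, 2, 4, 8) if span < 256 ** w)
    codes = b"".join((v - lo).to_bytes(width, "little") for v in values)
    return Block(lo, hi, width, codes)


def scan_between(blocks: List[Block], low: int, high: int) -> List[Tuple[int, int]]:
    """Return (block_index, position) pairs for values in [low, high]."""
    hits = []
    for b_idx, block in enumerate(blocks):
        if block.max_val < low or block.min_val > high:
            continue  # the whole block is ruled out by its min/max summary
        for pos in range(len(block)):
            if low <= block.get(pos) <= high:
                hits.append((b_idx, pos))
    return hits
```

In this sketch an OLTP-style lookup is a single `block.get(pos)` call, while an OLAP-style range scan calls `scan_between` and never decodes blocks whose value range cannot match. Because codes are byte-aligned, a point access costs one slice and one addition, which is the trade-off the abstract contrasts with bit-packed analytical formats.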