Partitioned Bit-Packed Vectors for In-Memory-Column-Stores

Martin Faust, Pedro Flemming, David Schwalb, H. Plattner
Proceedings of the 3rd VLDB Workshop on In-Memory Data Management and Analytics, published 2015-08-31. DOI: 10.1145/2803140.2803142 (https://doi.org/10.1145/2803140.2803142)

Abstract

In-memory databases have become increasingly popular in recent years. Hardware advances now make it possible to keep ever larger data sets entirely in the main memory of one or a few machines. However, most applications on in-memory databases are bound by memory latency rather than by computation, so combining strong compression techniques with efficient data structures is essential to fully utilize the hardware. A common data structure for compact storage is the bit-packed vector. It uses a fixed encoding length that cannot be changed after initialization, so the whole vector must be re-initialized whenever the encoding length changes. In this paper we propose a new data structure, the partitioned bit-packed vector, in which the encoding length of the stored elements may increase dynamically while single-value access performance remains comparable to that of the plain bit-packed vector. The paper outlines how this data structure is accessed and evaluates its performance characteristics. The results suggest that the partitioned bit-packed vector has the potential to improve the performance of existing in-memory column stores for typical enterprise workloads.
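The abstract does not describe the concrete layout, so the following is only a minimal sketch of the idea, not the authors' implementation. It assumes that a partitioned vector is a sequence of fixed-width bit-packed partitions, and that a wider partition is opened when a value no longer fits, so existing values are never re-encoded. All class and method names are hypothetical, and a Python integer stands in for the packed word array.

```python
import bisect


class BitPackedVector:
    """Fixed-width bit-packed vector: every value occupies exactly `bits` bits."""

    def __init__(self, bits):
        self.bits = bits
        self.size = 0
        self.data = 0  # assumption: a Python int stands in for the packed word array

    def append(self, value):
        # The encoding length is fixed, so a value that is too wide is rejected.
        if value < 0 or value >> self.bits:
            raise OverflowError(f"{value} does not fit in {self.bits} bits")
        self.data |= value << (self.size * self.bits)
        self.size += 1

    def get(self, index):
        if not 0 <= index < self.size:
            raise IndexError(index)
        mask = (1 << self.bits) - 1
        return (self.data >> (index * self.bits)) & mask


class PartitionedBitPackedVector:
    """Hypothetical sketch: fixed-width partitions whose widths only grow."""

    def __init__(self, bits=1):
        self.partitions = [BitPackedVector(bits)]
        self.starts = [0]  # global index of the first element of each partition

    def __len__(self):
        return self.starts[-1] + self.partitions[-1].size

    def append(self, value):
        if value < 0:
            raise ValueError("only non-negative values are supported")
        last = self.partitions[-1]
        needed = max(1, value.bit_length())
        if needed > last.bits:
            wider = BitPackedVector(needed)
            if last.size == 0:
                # Nothing stored yet at the old width: just replace the partition.
                self.partitions[-1] = wider
            else:
                # Open a wider partition; earlier values keep their encoding.
                self.starts.append(len(self))
                self.partitions.append(wider)
            last = wider
        last.append(value)

    def get(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        # Binary search over partition start offsets, then a plain packed read.
        p = bisect.bisect_right(self.starts, index) - 1
        return self.partitions[p].get(index - self.starts[p])


pv = PartitionedBitPackedVector(bits=2)
for v in [1, 3, 2, 7, 4, 255, 0, 1000]:
    pv.append(v)
print([p.bits for p in pv.partitions])  # [2, 3, 8, 10]
```

In this sketch a single-value read costs one partition lookup plus the usual shift-and-mask, which is one plausible way to keep access performance close to that of a plain bit-packed vector. The trade-off is that values stored after a widening use the larger width even when they would fit in fewer bits.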