Martin Faust, Pedro Flemming, David Schwalb, H. Plattner
{"title":"Partitioned Bit-Packed Vectors for In-Memory-Column-Stores","authors":"Martin Faust, Pedro Flemming, David Schwalb, H. Plattner","doi":"10.1145/2803140.2803142","DOIUrl":null,"url":null,"abstract":"In recent database development, in-memory databases have grown more and more in popularity. The hardware development of the past years has made it possible to keep even larger data sets entirely in main memory of one or a few machines. However, most applications on in-memory databases are memory-latency-bound rather than compute-bound. Combining strong compression techniques and efficient data structures is essential to fully utilize the hardware capabilities. A common data structure for efficient storing is the bit-packed vector. The bit-packed vector uses a fixed encoding length, which cannot be changed after initialization. Therefore it requires full re-initialization, when the encoding-length changes. In this paper we propose a new data structure, the partitioned bit-packed vector. Therein the encoding length of the stored elements may increase dynamically, while still providing comparable single-value access performance. This paper outlines the access to this data structure and evaluates its performance characteristics. The results suggest that the partitioned bitvector has the capabilities to improve the performance of existing in-memory column-stores for typical enterprise workloads.","PeriodicalId":175654,"journal":{"name":"Proceedings of the 3rd VLDB Workshop on In-Memory Data Mangement and Analytics","volume":"98 4","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 3rd VLDB Workshop on In-Memory Data Mangement and Analytics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2803140.2803142","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
In recent database development, in-memory databases have grown more and more in popularity. The hardware development of the past years has made it possible to keep even larger data sets entirely in main memory of one or a few machines. However, most applications on in-memory databases are memory-latency-bound rather than compute-bound. Combining strong compression techniques and efficient data structures is essential to fully utilize the hardware capabilities. A common data structure for efficient storing is the bit-packed vector. The bit-packed vector uses a fixed encoding length, which cannot be changed after initialization. Therefore it requires full re-initialization, when the encoding-length changes. In this paper we propose a new data structure, the partitioned bit-packed vector. Therein the encoding length of the stored elements may increase dynamically, while still providing comparable single-value access performance. This paper outlines the access to this data structure and evaluates its performance characteristics. The results suggest that the partitioned bitvector has the capabilities to improve the performance of existing in-memory column-stores for typical enterprise workloads.