Scaling column imprints using advanced vectorization
Lefteris Sidirourgos, H. Mühleisen
Proceedings of the 13th International Workshop on Data Management on New Hardware, 2017-05-14
DOI: 10.1145/3076113.3076120 (https://doi.org/10.1145/3076113.3076120)
Citations: 5
Abstract
Column Imprints is a pre-filtering secondary index for answering range queries. The main feature of imprints is that they are lightweight and based on compressed bit-vectors, one per cacheline, that quickly determine whether the values in that cacheline satisfy the predicates of a query. The main overhead of the imprints implementation is the many sequential value comparisons against the boundaries of a virtual equi-height histogram. Similarly, during query scans, many sequential value comparisons are performed to identify false positives. In this paper, we speed up imprint creation and querying by using advanced vectorization techniques. We also experimentally explore the benefits of stretching imprints to larger bit-vector sizes and larger blocks of data, using 256-bit SIMD registers. Our findings are very promising both for imprints and for future index design research that would employ advanced vectorization techniques together with larger (up to 512-bit) and more numerous (from 16 now to 32) SIMD registers.
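The mechanism the abstract describes can be illustrated with a minimal scalar sketch. It is not the authors' implementation: it leaves out the bit-vector compression and the SIMD vectorization that the paper is about, and all names (`build_imprints`, `range_mask`, `range_query`, `VALUES_PER_BLOCK`) are hypothetical. It assumes the histogram boundaries are already given as a sorted list and that a block of 16 values stands in for one cacheline.

```python
from bisect import bisect_right

# Assumption: 16 four-byte values stand in for one 64-byte cacheline.
VALUES_PER_BLOCK = 16


def build_imprints(column, boundaries, block=VALUES_PER_BLOCK):
    """Return one bit-vector per block; bit i is set if some value falls in bin i.

    Bin i holds the values v for which exactly i boundaries are <= v.
    """
    imprints = []
    for start in range(0, len(column), block):
        bits = 0
        for v in column[start:start + block]:
            # This per-value comparison against the boundaries is the
            # overhead the abstract refers to.
            bits |= 1 << bisect_right(boundaries, v)
        imprints.append(bits)
    return imprints


def range_mask(boundaries, low, high):
    """Bit mask of every bin that may contain a value in [low, high]."""
    first = bisect_right(boundaries, low)
    last = bisect_right(boundaries, high)
    return ((1 << (last + 1)) - 1) & ~((1 << first) - 1)


def range_query(column, imprints, boundaries, low, high, block=VALUES_PER_BLOCK):
    """Return the positions of all values in [low, high], skipping blocks
    whose imprint shares no bit with the query mask."""
    mask = range_mask(boundaries, low, high)
    hits = []
    for i, bits in enumerate(imprints):
        if bits & mask == 0:
            continue  # no value in this block can qualify
        start = i * block
        # The imprint is only a pre-filter: check each value to drop
        # false positives.
        for pos in range(start, min(start + block, len(column))):
            if low <= column[pos] <= high:
                hits.append(pos)
    return hits
```

For a column `list(range(64))` with boundaries `[16, 32, 48]`, each block of 16 values falls into a single bin, so a query for the range 20 to 25 inspects only the second block. The two inner loops, one during creation and one during the scan, are the sequential comparisons that the paper targets with vectorization.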