BitWeaving: fast scans for main memory data processing

Proceedings. ACM-SIGMOD International Conference on Management of Data. Pub Date: 2013-06-22. DOI: 10.1145/2463676.2465322

Yinan Li, J. Patel


Citations: 147

Abstract

This paper focuses on running scans in a main memory data processing system at "bare metal" speed. Essentially, this means that the system must aim to process data at or near the speed of the processor (the fastest component in most system configurations). Scans are common in main memory data processing environments, and with state-of-the-art techniques it still takes many cycles per input tuple to apply simple predicates on a single column of a table. In this paper, we propose a technique called BitWeaving that exploits the parallelism available at the bit level in modern processors. BitWeaving operates on multiple bits of data in a single cycle, processing bits from different columns in each cycle. Thus, bits from a batch of tuples are processed in each cycle, allowing BitWeaving to drop the cycles per column to below one in some cases. BitWeaving comes in two flavors: BitWeaving/V, which looks like a columnar organization but at the bit level, and BitWeaving/H, which packs bits horizontally. In this paper we also develop the arithmetic framework that is needed to evaluate predicates using these BitWeaving organizations. Our experimental results show that both of these methods produce significant performance benefits over existing state-of-the-art methods, and in some cases deliver more than an order-of-magnitude performance improvement.
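To make the bit-level idea concrete, here is a minimal sketch of a vertical (bit-sliced) layout in the spirit of BitWeaving/V, evaluating the predicate `value < constant` for a batch of tuples using only bitwise operations. It is an illustration under stated assumptions, not the authors' implementation: the helper names `to_bit_slices` and `scan_less_than` are hypothetical, Python integers stand in for processor words, and the paper's segment layout and early-pruning details are not modeled.

```python
def to_bit_slices(values, k):
    """Build k bit slices, most significant bit first (hypothetical helper).

    Slice j is an integer whose bit i is bit (k-1-j) of values[i],
    so one slice holds the same bit position for every tuple in the batch.
    """
    slices = []
    for bit in range(k - 1, -1, -1):
        word = 0
        for i, v in enumerate(values):
            word |= ((v >> bit) & 1) << i
        slices.append(word)
    return slices


def scan_less_than(slices, constant, k, n):
    """Return a bitmask whose bit i is set iff tuple i satisfies value < constant.

    Standard bit-serial comparison: walk from the most significant bit down,
    tracking which tuples are already known to be smaller (lt) and which are
    still equal to the constant on all bits seen so far (eq).
    """
    full = (1 << n) - 1          # one bit per tuple in the batch
    lt, eq = 0, full
    for j, x in enumerate(slices):
        # Broadcast the constant's bit at this position to every tuple lane.
        c = full if (constant >> (k - 1 - j)) & 1 else 0
        lt |= eq & ~x & c        # still equal so far, value bit 0, constant bit 1
        eq &= ~(x ^ c) & full    # stay equal only where the bits match
    return lt


values = [5, 1, 7, 3, 0, 6, 2, 4]   # 3-bit codes for one column
slices = to_bit_slices(values, 3)
mask = scan_less_than(slices, 4, 3, len(values))
matches = [i for i in range(len(values)) if (mask >> i) & 1]
print(matches)  # [1, 3, 4, 6]
```

Each loop iteration touches one bit of every tuple in the batch at once, which is the source of the sub-one-cycle-per-value behavior the abstract describes; with real 64-bit or SIMD words the batch size would be fixed by the word width rather than by `n`.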
