Adaptive Data Skipping in Main-Memory Systems

Proceedings of the 2016 International Conference on Management of Data. Pub Date: 2016-06-26. DOI: 10.1145/2882903.2914836

Wilson Qin, Stratos Idreos

Citations: 17

Abstract

As modern main-memory optimized data systems increasingly rely on fast scans, lightweight indexes that allow for data skipping play a crucial role in data filtering to reduce system I/O. Scans benefit from data skipping when the data is sorted, semi-sorted, or composed of clustered values. However, data skipping loses effectiveness over arbitrary data distributions. Applying data skipping techniques over non-sorted data can significantly decrease query performance, since the extra cost of metadata reads results in no corresponding scan performance gains. We introduce adaptive data skipping as a framework for structures and techniques that respond to a vast array of data distributions and query workloads. We present an adaptive zonemaps design and implementation on a main-memory column store prototype to demonstrate that adaptive data skipping has potential for a 1.4x speedup.
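To make the contrast between sorted and arbitrary data concrete, the following is a minimal sketch of a plain, non-adaptive zonemap: one (min, max) pair per fixed-size zone of a column, consulted before each zone is scanned. It is not the adaptive design from the paper; the function names `build_zonemap` and `scan_with_skipping`, the zone size, and the sample data are all assumptions chosen for illustration.

```python
from typing import List, Tuple


def build_zonemap(column: List[int], zone_size: int) -> List[Tuple[int, int]]:
    """Return one (min, max) pair per fixed-size zone of the column."""
    zonemap = []
    for start in range(0, len(column), zone_size):
        zone = column[start:start + zone_size]
        zonemap.append((min(zone), max(zone)))
    return zonemap


def scan_with_skipping(
    column: List[int],
    zonemap: List[Tuple[int, int]],
    zone_size: int,
    lo: int,
    hi: int,
) -> Tuple[List[int], int]:
    """Return the values v with lo <= v <= hi and the number of zones skipped."""
    matches: List[int] = []
    skipped = 0
    for z, (zmin, zmax) in enumerate(zonemap):
        # The metadata read: if the zone's range cannot overlap the
        # predicate, the zone's data is never touched.
        if zmax < lo or zmin > hi:
            skipped += 1
            continue
        start = z * zone_size
        matches.extend(v for v in column[start:start + zone_size] if lo <= v <= hi)
    return matches, skipped


# Hypothetical sample data: the same 16 values, once sorted and once shuffled.
ZONE_SIZE = 4
sorted_col = list(range(16))
shuffled_col = [7, 0, 12, 3, 15, 4, 9, 1, 14, 2, 8, 5, 11, 6, 13, 10]

sorted_result = scan_with_skipping(
    sorted_col, build_zonemap(sorted_col, ZONE_SIZE), ZONE_SIZE, 4, 6
)
shuffled_result = scan_with_skipping(
    shuffled_col, build_zonemap(shuffled_col, ZONE_SIZE), ZONE_SIZE, 4, 6
)
```

On the sorted column, three of the four zones are skipped for the predicate 4 <= v <= 6. On the shuffled column every zone's range overlaps the predicate, so no zone is skipped and the metadata checks are pure overhead, which is the situation the abstract describes for non-sorted data.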



