Proceedings of the 17th International Workshop on Data Management on New Hardware: Latest Publications

KallaxDB: A Table-less Hash-based Key-Value Store on Storage Hardware with Built-in Transparent Compression

Proceedings of the 17th International Workshop on Data Management on New Hardware Pub Date : 2021-06-20 DOI: 10.1145/3465998.3466004

Xubin Chen, Ning Zheng, Shukun Xu, Yifan Qiao, Yang Liu, Jiangpeng Li, Tong Zhang

Abstract: This paper studies the design of a key-value (KV) store that can take full advantage of modern storage hardware with built-in transparent compression capability. Many modern storage appliances/drives implement hardware-based data compression, transparent to OS and applications. Moreover, the growing deployment of hardware-based compression in Cloud infrastructure leads to the imminent arrival of Cloud-based storage hardware with built-in transparent compression. By decoupling the logical storage space utilization efficiency from the true physical storage usage, transparent compression allows data management software to purposely waste logical storage space in return for simpler data structures and algorithms, leading to lower implementation complexity and higher performance. This work proposes a table-less hash-based KV store, where the basic idea is to hash the key space directly onto the logical storage space without using a hash table at all. With a substantially simplified data structure, this approach is subject to significant logical storage space under-utilization, which can be seamlessly mitigated by storage hardware with transparent compression. This paper presents the basic KV store architecture, and develops mathematical formulations to assist its configuration and analysis. We implemented such a KV store KallaxDB and carried out experiments on a commercial SSD with built-in transparent compression. The results show that, while consuming very little memory resource, it compares favorably with the other modern KV stores in terms of throughput, latency, and CPU usage.
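The table-less idea in this abstract (hash each key straight to a location in the logical address space, with no in-memory index) can be illustrated with a small toy. This is only a sketch under stated assumptions, not KallaxDB's actual design: the 4 KB slot size, the record layout, the `TablelessStore` class, and the use of a Python dict as a stand-in for the drive's sparse logical space are all hypothetical, and overflow handling is left out.

```python
import hashlib
import struct

SLOT_SIZE = 4096  # assumed slot size in bytes (hypothetical, not from the paper)


class TablelessStore:
    """Toy KV store: the key's hash alone decides where the record lives."""

    def __init__(self, num_slots):
        self.num_slots = num_slots
        # Stand-in for a sparse logical address space: slot number -> bytes.
        # Slots that were never written read back as all zeros.
        self._slots = {}

    def slot_of(self, key):
        # Hash the key directly onto the logical space; no lookup table.
        digest = hashlib.blake2b(key, digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.num_slots

    def _read(self, slot):
        raw = self._slots.get(slot, bytes(SLOT_SIZE))
        records, pos = {}, 0
        while pos + 4 <= SLOT_SIZE:
            klen, vlen = struct.unpack_from(">HH", raw, pos)
            if klen == 0:  # zero padding marks the end of the records
                break
            pos += 4
            records[raw[pos:pos + klen]] = raw[pos + klen:pos + klen + vlen]
            pos += klen + vlen
        return records

    def _write(self, slot, records):
        body = b"".join(
            struct.pack(">HH", len(k), len(v)) + k + v for k, v in records.items()
        )
        if len(body) > SLOT_SIZE:
            raise OverflowError("slot full; overflow handling is out of scope here")
        # The zero padding is the 'wasted' logical space that a drive with
        # transparent compression would be expected to shrink physically.
        self._slots[slot] = body.ljust(SLOT_SIZE, b"\x00")

    def put(self, key, value):
        if not key:
            raise ValueError("empty keys are not supported by this toy layout")
        slot = self.slot_of(key)
        records = self._read(slot)
        records[key] = value
        self._write(slot, records)

    def get(self, key):
        return self._read(self.slot_of(key)).get(key)


store = TablelessStore(num_slots=1024)
store.put(b"user:42", b"alice")
print(store.get(b"user:42"))
```

Each slot is mostly zero padding, which is the logical under-utilization the abstract refers to; the sketch does not model compression itself.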

Citations: 10

Filter Representation in Vectorized Query Execution

Proceedings of the 17th International Workshop on Data Management on New Hardware Pub Date : 2021-06-20 DOI: 10.1145/3465998.3466009

Amadou Latyr Ngom, Prashanth Menon, Matthew Butrovich, Lin Ma, Wan Shen Lim, T. Mowry, Andrew Pavlo

Abstract: Advances in memory technology have made it feasible for database management systems (DBMS) to store their working data set in main memory. This trend shifts the bottleneck for query execution from disk accesses to CPU efficiency. One technique to improve CPU efficiency is batch-oriented processing, or vectorization, as it reduces interpretation overhead. For each vector (batch) of tuples, the DBMS must track the set of valid (visible) tuples that survive all previous processing steps. To that end, existing systems employ one of two data structures, or filter representations: selection vectors or bitmaps. In this work, we analyze each approach's strengths and weaknesses and offer recommendations on how to implement vectorized operations. Through a wide range of micro-benchmarks, we determine that the optimal strategy is a function of many factors: the cost of iterating through tuples, the cost of the operation itself, and how amenable it is to SIMD vectorization. Our analysis shows that bitmaps perform better for operations that can be vectorized using SIMD instructions and that selection vectors perform better on all other operations due to cheaper iteration logic.
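The two filter representations named in this abstract can be contrasted with a minimal sketch. It only shows the shape of the data structures; plain Python lists say nothing about SIMD or performance, and the function names and sample column are hypothetical.

```python
def filter_with_selection_vector(column, sel, pred):
    """Visit only the still-valid positions and keep those that pass."""
    return [i for i in sel if pred(column[i])]


def filter_with_bitmap(column, bitmap, pred):
    """Evaluate the predicate on every tuple, then AND with the old bitmap."""
    return [pred(v) and valid for v, valid in zip(column, bitmap)]


def bitmap_to_selection_vector(bitmap):
    """Convert a bitmap into the equivalent list of valid positions."""
    return [i for i, valid in enumerate(bitmap) if valid]


column = [5, 12, 7, 30, 1, 18]


def over_six(v):
    return v > 6


def under_twenty(v):
    return v < 20


# Selection vector: the list of surviving positions shrinks after each filter.
sel = filter_with_selection_vector(column, range(len(column)), over_six)
sel = filter_with_selection_vector(column, sel, under_twenty)

# Bitmap: one flag per tuple, fixed length, updated in place of a position list.
bm = filter_with_bitmap(column, [True] * len(column), over_six)
bm = filter_with_bitmap(column, bm, under_twenty)

print(sel)                             # [1, 2, 5]
print(bitmap_to_selection_vector(bm))  # [1, 2, 5]
```

The bitmap variant does the same work for every tuple regardless of validity, which is the regular access pattern that suits SIMD; the selection-vector variant skips filtered-out tuples at the cost of indirect accesses.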

Citations: 2

A Parametric I/O Model for Modern Storage Devices

Proceedings of the 17th International Workshop on Data Management on New Hardware Pub Date : 2021-06-20 DOI: 10.1145/3465998.3466003

Tarikul Islam Papon, Manos Athanassoulis

Abstract: Storage devices have evolved to offer increasingly faster read/write access, through flash-based and other solid-state storage technologies. When compared to classical rotating hard disk drives (HDDs), modern solid-state drives (SSDs) have two key differences: (i) the absence of mechanical parts, and (ii) an inherent difference between the process of reading and writing. The former removes a key performance bottleneck, enabling internal device parallelism, whereas the latter manifests as a read/write performance asymmetry. In other words, SSDs can serve multiple concurrent I/Os, and their writes are generally slower than reads; none of which is true for HDDs. Yet, the performance of storage-resident applications is typically modeled by the number of disk accesses performed, inherently assuming symmetric read and write performance and the ability to perform only one I/O at a time, failing to accurately capture the performance of modern storage devices. To address this mismatch, we propose a simple yet expressive storage model, termed Parametric I/O Model (PIO) that captures contemporary devices by parameterizing read/write asymmetry (α) and access concurrency (k). PIO enables device-specific decisions at algorithm design time, rather than as an optimization during deployment and testing, thus ensuring optimal algorithm design by taking into account the properties of each device. We present a benchmarking of several storage devices that shows that α and k vary significantly across devices. Further, we show that using carefully quantified values of α and k for each storage device, we can fully exploit the performance it offers, and we lay the groundwork for asymmetry/concurrency-aware storage-intensive algorithms. We also highlight that the degree of the performance benefit due to concurrent reads or writes depends on the asymmetry of the underlying device. Finally, we summarize our findings as a set of guidelines for designing storage-intensive algorithms and discuss specific examples for better algorithm and system designs as well as runtime tuning.
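To make the roles of α and k concrete, here is a hedged sketch of how a cost estimate might change once both parameters are taken into account. The formula is an illustrative assumption and is not taken from the paper: it charges one time unit per batch of up to k concurrent reads and α units per batch of up to k concurrent writes. The function names are hypothetical.

```python
import math


def classic_io_cost(reads, writes):
    """Traditional model: every access costs one unit, one I/O at a time."""
    return reads + writes


def pio_style_cost(reads, writes, alpha, k):
    """Illustrative cost with write asymmetry `alpha` and concurrency `k`.

    Assumption (not from the paper): up to k I/Os of the same kind complete
    in the time of one, and a write takes alpha times as long as a read.
    """
    if alpha <= 0 or k < 1:
        raise ValueError("alpha must be positive and k must be at least 1")
    return math.ceil(reads / k) + alpha * math.ceil(writes / k)


# With alpha = 1 and k = 1 the estimate collapses to the classic count.
print(classic_io_cost(100, 100))           # 200
print(pio_style_cost(100, 100, 1.0, 1))    # 200.0
# A device with slower writes and internal parallelism changes the picture.
print(pio_style_cost(100, 100, 2.0, 8))    # 39.0
```

Under this assumed formula, an algorithm that trades writes for reads looks more attractive as α grows, which is the kind of design-time decision the abstract describes.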

Citations: 12

Workload-Driven Placement of Column-Store Data Structures on DRAM and NVM

Proceedings of the 17th International Workshop on Data Management on New Hardware Pub Date : 2021-06-20 DOI: 10.1145/3465998.3466008

Robert Lasch, Robert Schulze, T. Legler, K. Sattler

Citations: 4

Playing Fetch with CAT: Composing Cache Partitioning and Prefetching for Task-based Query Processing

Proceedings of the 17th International Workshop on Data Management on New Hardware Pub Date : 2021-06-20 DOI: 10.1145/3465998.3466016

Qitian Zeng, Kyle C. Hale, Boris Glavic

Citations: 1

Instant Graph Query Recovery on Persistent Memory

Proceedings of the 17th International Workshop on Data Management on New Hardware Pub Date : 2021-06-20 DOI: 10.1145/3465998.3466011

Alexander Baumstark, Philipp Götze, M. Jibril, K. Sattler

Citations: 3

Resource-Efficient Database Query Processing on FPGAs

Proceedings of the 17th International Workshop on Data Management on New Hardware Pub Date : 2021-06-20 DOI: 10.1145/3465998.3466006

Mehdi Moghaddamfar, Christian Färber, Wolfgang Lehner, Norman May, Akash Kumar

Citations: 6

Drop It In Like It's Hot: An Analysis of Persistent Memory as a Drop-in Replacement for NVMe SSDs

Proceedings of the 17th International Workshop on Data Management on New Hardware Pub Date : 2021-06-20 DOI: 10.1145/3465998.3466010

Maximilian Böther, Otto Kißig, Lawrence Benson, T. Rabl, H. Plattner

Abstract: Solid-state drives (SSDs) have improved database system performance significantly due to the higher bandwidth that they provide over traditional hard disk drives. Persistent memory (PMem) is a new storage technology that offers DRAM-like speed at SSD-like capacity. Due to its byte-addressability, research has mainly treated PMem as a replacement of, or an addition to DRAM, e.g., by proposing highly-optimized, DRAM-PMem-hybrid data structures and system designs. However, PMem can also be used via a regular file system interface and standard Linux I/O operations. In this paper, we analyze PMem as a drop-in replacement for Non-Volatile Memory Express (NVMe) SSDs and evaluate possible performance gains while requiring no or only minor changes to existing applications. This drop-in approach speeds-up database systems like Postgres, without requiring any code changes. We systematically evaluate PMem and NVMe SSDs in three database microbenchmarks and the widely used TPC-H benchmark on Postgres. Our experiments show that PMem outperforms a RAID of four NVMe SSDs in read-intensive OLAP workloads by up to 4x without any modifications while achieving similar performance in write-intensive workloads. Finally, we give four practical insights to aid decision-making on when to use PMem as an SSD drop-in replacement and how to optimize for it.

Citations: 9

Proceedings of the 17th International Workshop on Data Management on New Hardware

Proceedings of the 17th International Workshop on Data Management on New Hardware Pub Date : 2021-06-20 DOI: 10.1145/3465998

Citations: 0

An Energy-Efficient Stream Join for the Internet of Things

Proceedings of the 17th International Workshop on Data Management on New Hardware Pub Date : 2021-06-20 DOI: 10.1145/3465998.3466005

Adrian Michalke, P. M. Grulich, Clemens Lutz, Steffen Zeuch, V. Markl

Citations: 7