{"title":"Sampling-Based AQP in Modern Analytical Engines","authors":"Viktor Sanca, A. Ailamaki","doi":"10.1145/3533737.3535095","DOIUrl":"https://doi.org/10.1145/3533737.3535095","url":null,"abstract":"As the data volume grows, reducing the query execution times remains an elusive goal. While approximate query processing (AQP) techniques present a principled method to trade off accuracy for faster queries in analytics, the sample creation is often considered a second-class citizen. Modern analytical engines optimized for high-bandwidth media and multi-core architectures only exacerbate existing inefficiencies, resulting in prohibitive query-time online sampling and longer preprocessing times in offline AQP systems. We demonstrate that the sampling operators can be practical in modern scale-up analytical systems. First, we evaluate three common sampling methods, identify algorithmic bottlenecks, and propose hardware-conscious optimizations. Second, we reduce the performance penalties of the added processing and sample materialization through system-aware operator design and compare the sample creation time to the matching relational operators of an in-memory JIT-compiled engine. The cost of data reduction with materialization is up to 2.5x of the equivalent group-by in the case of stratified sampling and virtually free (∼ 1x) for reasonable sample sizes of other strategies. As query processing starts to dominate the execution time, the gap between online and offline AQP methods diminishes.","PeriodicalId":381503,"journal":{"name":"Proceedings of the 18th International Workshop on Data Management on New Hardware","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132627332","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Title: Result-Set Management for NDP Operations on Smart Storage
Authors: Tobias Vinçon, Christian Knoedler, Arthur Bernhardt, Leonardo Solis-Vasquez, Lukas Weber, Andreas Koch, Ilia Petrov
DOI: https://doi.org/10.1145/3533737.3535097
Published: 2022-06-12, Proceedings of the 18th International Workshop on Data Management on New Hardware
Abstract: Current data-intensive systems suffer from poor scalability as they transfer massive amounts of data to the host DBMS to process it there. Novel near-data processing (NDP) DBMS architectures and smart storage can provably reduce the impact of raw data movement. However, transferring the result set of an NDP operation may itself increase data movement and thus the performance overhead. In this paper, we introduce a set of in-situ NDP result-set management techniques, such as spilling, materialization, and reuse. Our evaluation indicates a performance improvement of 1.13x to 400x.

Title: HPCache: Memory-Efficient OLAP Through Proportional Caching
Authors: Hamish Nicholson, Periklis Chrysogelos, A. Ailamaki
DOI: https://doi.org/10.1145/3533737.3535100
Published: 2022-06-12, Proceedings of the 18th International Workshop on Data Management on New Hardware
Abstract: Analytical engines rely on in-memory caching to avoid disk accesses and provide timely responses by keeping the most frequently accessed data in memory. Purely frequency- and time-based caching decisions, however, are a good proxy for the expected query execution speedup only when disk accesses are significantly slower than in-memory query processing. Fast storage, on the other hand, offers loading times that approach or even outperform fully in-memory query execution response times, rendering purely frequency-based statistics incapable of capturing the impact of a caching decision on query execution. For example, caching the input of a frequent query that spends most of its time processing joins is less beneficial than caching a page for a slightly less frequent but scan-heavy query. As a result, existing caching policies waste valuable memory space caching input data that offers little-to-no acceleration for analytics. This paper proposes HPCache, a buffer management policy that enables fast analytics on high-bandwidth storage by efficiently using the available in-memory space. HPCache caches data based on their speedup potential instead of relying on frequency-based statistics. We show that, with fast storage, the benefit of in-memory caching varies significantly across queries; we therefore quantify the efficiency of caching decisions and formulate an optimization problem. We implement HPCache in Proteus and show that i) estimating speedup potential improves memory space utilization, and ii) simple runtime statistics suffice to infer speedup expectations. HPCache achieves up to 12% faster query execution over state-of-the-art caching policies, or a 75% smaller in-memory cache footprint without deteriorating query performance. Overall, HPCache enables efficient use of the in-memory space for input caching in the presence of fast storage, without requiring workload predictions.

Title: Enabling CXL Memory Expansion for In-Memory Database Management Systems
Authors: Minseon Ahn, Andrew Chang, Donghun Lee, J. Gim, Jungmin Kim, Jaemin Jung, Oliver Rebholz, Vincent Pham, Krishna T. Malladi, Y. Ki
DOI: https://doi.org/10.1145/3533737.3535090
Published: 2022-06-12, Proceedings of the 18th International Workshop on Data Management on New Hardware
Abstract: Limited memory capacity is a persistent performance bottleneck for in-memory database management systems (IMDBMSs) as data sizes keep increasing. To overcome the physical memory limitation, heterogeneous and disaggregated computing platforms have been proposed, such as Gen-Z, CCIX, OpenCAPI, and CXL. In this work, we introduce flexible CXL memory expansion using a CXL Type 3 prototype and evaluate its performance in an IMDBMS. Our evaluation shows that CXL memory devices attached over PCIe Gen5 are well suited for memory expansion, with nearly no throughput degradation in OLTP workloads and less than 8% throughput degradation in OLAP workloads. CXL memory is therefore a good candidate for memory expansion with lower TCO in IMDBMSs.
{"title":"iGPU-Accelerated Pattern Matching on Event Streams","authors":"Marius Kuhrt, Michael Körber, B. Seeger","doi":"10.1145/3533737.3535099","DOIUrl":"https://doi.org/10.1145/3533737.3535099","url":null,"abstract":"Pattern matching, also known as Match-Recognize in SQL, is an expensive operator of particular relevance in many event stream applications. However, because of its sequential nature and challenging latency requirements, current stream processing engines do not provide any parallel processing support for pattern matching. In addition, hardware accelerators based on dedicated GPUs also offer limited support due to the overhead of transferring data between their local and main memory. In contrast, however, integrated GPUs (iGPUs), with their ability to access main memory directly, offer great potential to accelerate pattern matching. This paper presents the first full-fledged implementation of pattern matching cooperatively using iGPUs and CPUs. Our results obtained from a preliminary experimental performance comparison confirm the potential of our iGPU-based approaches for accelerating pattern matching.","PeriodicalId":381503,"journal":{"name":"Proceedings of the 18th International Workshop on Data Management on New Hardware","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122270852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Title: Benchmarking the Second Generation of Intel SGX Hardware
Authors: Muhammad El-Hindi, Tobias Ziegler, Matthias Heinrich, Adrian Lutsch, Zheguang Zhao, Carsten Binnig
DOI: https://doi.org/10.1145/3533737.3535098
Published: 2022-06-12, Proceedings of the 18th International Workshop on Data Management on New Hardware
Abstract: In recent years, trusted execution environments (TEEs) such as Intel Software Guard Extensions (SGX) have gained a lot of attention in the database community, because TEEs provide an interesting platform for building trusted databases in the cloud. However, until recently SGX was only available on low-end single-socket servers built on the Intel Xeon E3 processor generation and came with many restrictions for building DBMSs. With the availability of the new Ice Lake processors, Intel provides a new implementation of the SGX technology that supports high-end multi-socket servers. With this new implementation, which we refer to as SGXv2 in this paper, Intel promises to address several limitations of SGX enclaves. This raises the question of whether previous efforts to overcome the limitations of SGX for DBMSs are still applicable, and whether the new generation of SGX can truly deliver on the promise to secure data without compromising on performance. To answer this question, we conduct a first systematic performance study of Intel SGXv2 and compare it to the previous generation of SGX.

Title: Improving In-Memory Database Operations with Acceleration DIMM (AxDIMM)
Authors: Donghun Lee, J. So, Minseon Ahn, Jong-Geon Lee, Jungmin Kim, Jeonghyeon Cho, Oliver Rebholz, Vishnu Charan Thummala, JV RaviShankar, S. S. Upadhya, Mohammed Ibrahim Khan, J. Kim
DOI: https://doi.org/10.1145/3533737.3535093
Published: 2022-06-12, Proceedings of the 18th International Workshop on Data Management on New Hardware
Abstract: The significant overhead of transferring data between CPUs and memory devices is a pressing issue in many areas of computing, including database management systems. Near-memory computing on the memory devices themselves is one promising approach. In this work, we introduce a new near-memory acceleration scheme for in-memory database operations, called Acceleration DIMM (AxDIMM). It behaves like a normal DIMM through the standard DIMM-compatible interface, but has embedded computing units for data-intensive operations. By minimizing data transfer overhead, it reduces CPU resource consumption, relieves the memory bandwidth bottleneck, and improves energy efficiency. We implement scan, one of the most data-intensive database operations, within AxDIMM and compare its performance with a SIMD (Single Instruction Multiple Data) implementation on the CPU. Our investigation shows that the accelerator achieves 6.8x higher throughput than the SIMD implementation.

Title: Cache management in MASCARA-FPGA: from coalescing heuristic to replacement policy
Authors: V. Huu, Laurent d'Orazio, E. Casseau, Julien Lallet
DOI: https://doi.org/10.1145/3533737.3535096
Published: 2022-06-12, Proceedings of the 18th International Workshop on Data Management on New Hardware
Abstract: We previously presented the ModulAr Semantic CAching fRAmework (MASCARA), which deploys Semantic Caching (SC) to accelerate query processing using Field Programmable Gate Array (FPGA) accelerators. Beyond the accelerators themselves, cache management plays an important role: the coalescing strategy and replacement policy must be addressed to maximize the performance of FPGA caching. In this paper, we therefore present a coalescing heuristic together with a new replacement function that leverages the advantages of traditional strategies while overcoming their drawbacks. The proposed heuristic reduces response time, improves data availability, and saves cache space with respect to the semantic locality of the query workload.

Title: PipeJSON: Parsing JSON at Line Speed on FPGAs
Authors: Jonas Dann, Royden Wagner, Daniel Ritter, Christian Faerber, H. Froening
DOI: https://doi.org/10.1145/3533737.3535094
Published: 2022-06-12, Proceedings of the 18th International Workshop on Data Management on New Hardware
Abstract: JavaScript Object Notation (JSON) has gained popularity as a data exchange and storage format. While recent advances on modern CPUs show improved JSON parsing by using data parallelism with vector instructions, the rigid instruction set and limited pipelining of CPUs prevent parsing performance from reaching the practical limit of memory bandwidth. We present PipeJSON, the first standard-compliant JSON parser to process tens of gigabytes of data per second. It utilizes FPGA hardware to make extensive use of pipelining and can parse multiple characters per clock cycle. To ensure usability in software projects, PipeJSON is written in Data Parallel C++ and achieves a 7.95x speedup over state-of-the-art JSON parsers on CPU, despite data transfer to the FPGA.

Title: EFA: A Viable Alternative to RDMA over InfiniBand for DBMSs?
Authors: Tobias Ziegler, Dwarakanandan Bindiganavile Mohan, Viktor Leis, Carsten Binnig
DOI: https://doi.org/10.1145/3533737.3538506
Published: 2022-06-12, Proceedings of the 18th International Workshop on Data Management on New Hardware
Abstract: RDMA over InfiniBand offers high bandwidth and low latency, which provides many benefits for distributed DBMSs. However, RDMA is still not widely available in the cloud. Instead, cloud providers often invest in their own high-speed networking technology and have started to expose their own native networking interfaces. For example, the largest cloud provider, Amazon Web Services (AWS), introduced instances with the Elastic Fabric Adapter (EFA) in 2018. In this paper, we analyze EFA as an alternative to RDMA in the cloud by performing an in-depth and systematic evaluation.