{"title":"Sampling-Based AQP in Modern Analytical Engines","authors":"Viktor Sanca, A. Ailamaki","doi":"10.1145/3533737.3535095","DOIUrl":"https://doi.org/10.1145/3533737.3535095","url":null,"abstract":"As the data volume grows, reducing the query execution times remains an elusive goal. While approximate query processing (AQP) techniques present a principled method to trade off accuracy for faster queries in analytics, the sample creation is often considered a second-class citizen. Modern analytical engines optimized for high-bandwidth media and multi-core architectures only exacerbate existing inefficiencies, resulting in prohibitive query-time online sampling and longer preprocessing times in offline AQP systems. We demonstrate that the sampling operators can be practical in modern scale-up analytical systems. First, we evaluate three common sampling methods, identify algorithmic bottlenecks, and propose hardware-conscious optimizations. Second, we reduce the performance penalties of the added processing and sample materialization through system-aware operator design and compare the sample creation time to the matching relational operators of an in-memory JIT-compiled engine. The cost of data reduction with materialization is up to 2.5x of the equivalent group-by in the case of stratified sampling and virtually free (∼ 1x) for reasonable sample sizes of other strategies. As query processing starts to dominate the execution time, the gap between online and offline AQP methods diminishes.","PeriodicalId":381503,"journal":{"name":"Proceedings of the 18th International Workshop on Data Management on New Hardware","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132627332","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Title: Result-Set Management for NDP Operations on Smart Storage
Authors: Tobias Vinçon, Christian Knoedler, Arthur Bernhardt, Leonardo Solis-Vasquez, Lukas Weber, Andreas Koch, Ilia Petrov
DOI: https://doi.org/10.1145/3533737.3535097
Published: 2022-06-12, Proceedings of the 18th International Workshop on Data Management on New Hardware
Abstract: Current data-intensive systems suffer from poor scalability as they transfer massive amounts of data to the host DBMS to process it there. Novel near-data processing (NDP) DBMS architectures and smart storage can provably reduce the impact of raw data movement. However, transferring the result set of an NDP operation may itself increase data movement and thus the performance overhead. In this paper, we introduce a set of in-situ NDP result-set management techniques, such as spilling, materialization, and reuse. Our evaluation indicates a performance improvement of 1.13x to 400x.

Title: HPCache: Memory-Efficient OLAP Through Proportional Caching
Authors: Hamish Nicholson, Periklis Chrysogelos, A. Ailamaki
DOI: https://doi.org/10.1145/3533737.3535100
Published: 2022-06-12, Proceedings of the 18th International Workshop on Data Management on New Hardware
Abstract: Analytical engines rely on in-memory caching to avoid disk accesses and provide timely responses by keeping the most frequently accessed data in memory. Purely frequency- and time-based caching decisions, however, are a good proxy for the expected query execution speedup only when disk accesses are significantly slower than in-memory query processing. Fast storage, on the other hand, offers loading times that approach or even outperform fully in-memory query execution response times, rendering purely frequency-based statistics incapable of capturing the impact of a caching decision on query execution. For example, caching the input of a frequent query that spends most of its time processing joins is less beneficial than caching a page for a slightly less frequent but scan-heavy query. As a result, existing caching policies waste valuable memory space caching input data that offers little-to-no acceleration for analytics. This paper proposes HPCache, a buffer management policy that enables fast analytics on high-bandwidth storage by efficiently using the available in-memory space. HPCache caches data based on their speedup potential instead of relying on frequency-based statistics. We show that, with fast storage, the benefit of in-memory caching varies significantly across queries; we therefore quantify the efficiency of caching decisions and formulate an optimization problem. We implement HPCache in Proteus and show that i) estimating speedup potential improves memory space utilization, and ii) simple runtime statistics suffice to infer speedup expectations. HPCache achieves up to 12% faster query execution over state-of-the-art caching policies, or a 75% smaller in-memory cache footprint without deteriorating query performance. Overall, HPCache enables efficient use of the in-memory space for input caching in the presence of fast storage, without requiring workload predictions.

Title: Enabling CXL Memory Expansion for In-Memory Database Management Systems
Authors: Minseon Ahn, Andrew Chang, Donghun Lee, J. Gim, Jungmin Kim, Jaemin Jung, Oliver Rebholz, Vincent Pham, Krishna T. Malladi, Y. Ki
DOI: https://doi.org/10.1145/3533737.3535090
Published: 2022-06-12, Proceedings of the 18th International Workshop on Data Management on New Hardware
Abstract: Limited memory capacity is a persistent performance bottleneck for in-memory database management systems (IMDBMSs) as data sizes keep increasing. To overcome the physical memory limitation, heterogeneous and disaggregated computing platforms have been proposed, such as Gen-Z, CCIX, OpenCAPI, and CXL. In this work, we introduce flexible CXL memory expansion using a CXL Type 3 prototype and evaluate its performance in an IMDBMS. Our evaluation shows that CXL memory devices attached over PCIe Gen5 are well suited for memory expansion, with nearly no throughput degradation in OLTP workloads and less than 8% throughput degradation in OLAP workloads. CXL memory is therefore a good candidate for memory expansion with lower TCO in IMDBMSs.
{"title":"iGPU-Accelerated Pattern Matching on Event Streams","authors":"Marius Kuhrt, Michael Körber, B. Seeger","doi":"10.1145/3533737.3535099","DOIUrl":"https://doi.org/10.1145/3533737.3535099","url":null,"abstract":"Pattern matching, also known as Match-Recognize in SQL, is an expensive operator of particular relevance in many event stream applications. However, because of its sequential nature and challenging latency requirements, current stream processing engines do not provide any parallel processing support for pattern matching. In addition, hardware accelerators based on dedicated GPUs also offer limited support due to the overhead of transferring data between their local and main memory. In contrast, however, integrated GPUs (iGPUs), with their ability to access main memory directly, offer great potential to accelerate pattern matching. This paper presents the first full-fledged implementation of pattern matching cooperatively using iGPUs and CPUs. Our results obtained from a preliminary experimental performance comparison confirm the potential of our iGPU-based approaches for accelerating pattern matching.","PeriodicalId":381503,"journal":{"name":"Proceedings of the 18th International Workshop on Data Management on New Hardware","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122270852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Title: Benchmarking the Second Generation of Intel SGX Hardware
Authors: Muhammad El-Hindi, Tobias Ziegler, Matthias Heinrich, Adrian Lutsch, Zheguang Zhao, Carsten Binnig
DOI: https://doi.org/10.1145/3533737.3535098
Published: 2022-06-12, Proceedings of the 18th International Workshop on Data Management on New Hardware
Abstract: In recent years, trusted execution environments (TEEs) such as Intel Software Guard Extensions (SGX) have gained a lot of attention in the database community, because TEEs provide an interesting platform for building trusted databases in the cloud. However, until recently SGX was only available on low-end single-socket servers built on the Intel Xeon E3 processor generation and came with many restrictions for building DBMSs. With the availability of the new Ice Lake processors, Intel provides a new implementation of the SGX technology that supports high-end multi-socket servers. With this new implementation, which we refer to as SGXv2 in this paper, Intel promises to address several limitations of SGX enclaves. This raises the question of whether previous efforts to overcome the limitations of SGX for DBMSs are still applicable, and whether the new generation of SGX can truly deliver on the promise to secure data without compromising on performance. To answer this question, we conduct a first systematic performance study of Intel SGXv2 and compare it to the previous generation of SGX.

Title: Improving In-Memory Database Operations with Acceleration DIMM (AxDIMM)
Authors: Donghun Lee, J. So, Minseon Ahn, Jong-Geon Lee, Jungmin Kim, Jeonghyeon Cho, Oliver Rebholz, Vishnu Charan Thummala, JV RaviShankar, S. S. Upadhya, Mohammed Ibrahim Khan, J. Kim
DOI: https://doi.org/10.1145/3533737.3535093
Published: 2022-06-12, Proceedings of the 18th International Workshop on Data Management on New Hardware
Abstract: The significant overhead of transferring data between CPUs and memory devices is a pressing issue in many areas of computing, including database management systems. Near-memory computing on the memory devices themselves is one promising approach. In this work, we introduce a new near-memory acceleration scheme for in-memory database operations, called Acceleration DIMM (AxDIMM). It behaves like a normal DIMM through the standard DIMM-compatible interface, but has embedded computing units for data-intensive operations. By minimizing data transfer overhead, it reduces CPU resource consumption, relieves the memory bandwidth bottleneck, and improves energy efficiency. We implement scan, one of the most data-intensive database operations, within AxDIMM and compare its performance with a SIMD (Single Instruction Multiple Data) implementation on the CPU. Our investigation shows that the accelerator achieves 6.8x higher throughput than the SIMD implementation.

Title: Cache management in MASCARA-FPGA: from coalescing heuristic to replacement policy
Authors: V. Huu, Laurent d'Orazio, E. Casseau, Julien Lallet
DOI: https://doi.org/10.1145/3533737.3535096
Published: 2022-06-12, Proceedings of the 18th International Workshop on Data Management on New Hardware
Abstract: We previously presented the ModulAr Semantic CAching fRAmework (MASCARA), which deploys Semantic Caching (SC) to accelerate query processing using Field Programmable Gate Array (FPGA) accelerators. Beyond the accelerators themselves, cache management plays an important role: the coalescing strategy and replacement policy must be addressed to maximize the performance of FPGA caching. In this paper, we therefore present a coalescing heuristic together with a new replacement function that leverages the advantages of traditional strategies while overcoming their drawbacks. The proposed heuristic reduces response time, improves data availability, and saves cache space with respect to the semantic locality of the query workload.

Title: PipeJSON: Parsing JSON at Line Speed on FPGAs
Authors: Jonas Dann, Royden Wagner, Daniel Ritter, Christian Faerber, H. Froening
DOI: https://doi.org/10.1145/3533737.3535094
Published: 2022-06-12, Proceedings of the 18th International Workshop on Data Management on New Hardware
Abstract: JavaScript Object Notation (JSON) has gained popularity as a data exchange and storage format. While recent advances on modern CPUs show improved JSON parsing by using data parallelism with vector instructions, the rigid instruction set and limited pipelining of CPUs prevent parsing performance from reaching the practical limit of memory bandwidth. We present PipeJSON, the first standard-compliant JSON parser to process tens of gigabytes of data per second. It utilizes FPGA hardware to make extensive use of pipelining and can parse multiple characters per clock cycle. To ensure usability in software projects, PipeJSON is written in Data Parallel C++ and achieves a 7.95x speedup over state-of-the-art JSON parsers on CPU, despite data transfer to the FPGA.

Title: EFA: A Viable Alternative to RDMA over InfiniBand for DBMSs?
Authors: Tobias Ziegler, Dwarakanandan Bindiganavile Mohan, Viktor Leis, Carsten Binnig
DOI: https://doi.org/10.1145/3533737.3538506
Published: 2022-06-12, Proceedings of the 18th International Workshop on Data Management on New Hardware
Abstract: RDMA over InfiniBand offers high bandwidth and low latency, which provides many benefits for distributed DBMSs. However, RDMA is still not widely available in the cloud. Instead, cloud providers often invest in their own high-speed networking technology and have started to expose their own native networking interfaces. For example, the largest cloud provider, Amazon Web Services (AWS), introduced instances with the Elastic Fabric Adapter (EFA) in 2018. In this paper, we analyze EFA as an alternative to RDMA in the cloud by performing an in-depth and systematic evaluation.