Cache-conscious buffering for database operators with state

J. Cieslewicz, William Mee, K. A. Ross
{"title":"具有状态的数据库操作符的缓存敏感缓冲","authors":"J. Cieslewicz, William Mee, K. A. Ross","doi":"10.1145/1565694.1565704","DOIUrl":null,"url":null,"abstract":"Database processes must be cache-efficient to effectively utilize modern hardware. In this paper, we analyze the importance of temporal locality and the resultant cache behavior in scheduling database operators for in-memory, block oriented query processing. We demonstrate how the overall performance of a workload of multiple database operators is strongly dependent on how they are interleaved with each other. Longer time slices combined with temporal locality within an operator amortize the effects of the initial compulsory cache misses needed to load the operator's state, such as a hash table, into the cache. Though running an operator to completion over all of its input results in the greatest amortization of cache misses, this is typically infeasible because of the large intermediate storage requirement to materialize all input tuples to an operator. We show experimentally that good cache performance can be obtained with smaller buffers whose size is determined at runtime. We demonstrate a low-overhead method of runtime cache miss sampling using hardware performance counters. Our evaluation considers two common database operators with state: aggregation and hash join. Sampling reveals operator temporal locality and cache miss behavior, and we use those characteristics to choose an appropriate input buffer/block size. The calculated buffer size balances cache miss amortization with buffer memory requirements.","PeriodicalId":298901,"journal":{"name":"International Workshop on Data Management on New Hardware","volume":"36 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2009-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"20","resultStr":"{\"title\":\"Cache-conscious buffering for database operators with state\",\"authors\":\"J. Cieslewicz, William Mee, K. A. Ross\",\"doi\":\"10.1145/1565694.1565704\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Database processes must be cache-efficient to effectively utilize modern hardware. In this paper, we analyze the importance of temporal locality and the resultant cache behavior in scheduling database operators for in-memory, block oriented query processing. We demonstrate how the overall performance of a workload of multiple database operators is strongly dependent on how they are interleaved with each other. Longer time slices combined with temporal locality within an operator amortize the effects of the initial compulsory cache misses needed to load the operator's state, such as a hash table, into the cache. Though running an operator to completion over all of its input results in the greatest amortization of cache misses, this is typically infeasible because of the large intermediate storage requirement to materialize all input tuples to an operator. We show experimentally that good cache performance can be obtained with smaller buffers whose size is determined at runtime. We demonstrate a low-overhead method of runtime cache miss sampling using hardware performance counters. Our evaluation considers two common database operators with state: aggregation and hash join. Sampling reveals operator temporal locality and cache miss behavior, and we use those characteristics to choose an appropriate input buffer/block size. 
The calculated buffer size balances cache miss amortization with buffer memory requirements.\",\"PeriodicalId\":298901,\"journal\":{\"name\":\"International Workshop on Data Management on New Hardware\",\"volume\":\"36 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2009-06-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"20\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Workshop on Data Management on New Hardware\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/1565694.1565704\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Workshop on Data Management on New Hardware","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1565694.1565704","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 20

Abstract

Database processes must be cache-efficient to effectively utilize modern hardware. In this paper, we analyze the importance of temporal locality and the resultant cache behavior in scheduling database operators for in-memory, block oriented query processing. We demonstrate how the overall performance of a workload of multiple database operators is strongly dependent on how they are interleaved with each other. Longer time slices combined with temporal locality within an operator amortize the effects of the initial compulsory cache misses needed to load the operator's state, such as a hash table, into the cache. Though running an operator to completion over all of its input results in the greatest amortization of cache misses, this is typically infeasible because of the large intermediate storage requirement to materialize all input tuples to an operator. We show experimentally that good cache performance can be obtained with smaller buffers whose size is determined at runtime. We demonstrate a low-overhead method of runtime cache miss sampling using hardware performance counters. Our evaluation considers two common database operators with state: aggregation and hash join. Sampling reveals operator temporal locality and cache miss behavior, and we use those characteristics to choose an appropriate input buffer/block size. The calculated buffer size balances cache miss amortization with buffer memory requirements.
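The abstract describes a feedback loop: sample cache miss behavior at runtime and grow the operator's input buffer only while a larger buffer still pays for the compulsory misses of reloading operator state (such as a hash table). The C++ sketch below illustrates that trade-off under stated assumptions; the function sample_misses_per_tuple, its constants, and the epsilon stopping rule are hypothetical stand-ins for the paper's hardware-counter sampling and buffer-size calculation, not the authors' actual method.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical hook: returns observed cache misses per input tuple when the
// operator processes one buffer of `buffer_tuples` tuples. In the paper this
// would come from hardware performance counters sampled at runtime; here it is
// modeled as compulsory misses to re-warm operator state, amortized over the
// buffer, plus a fixed steady-state per-tuple cost. All constants are made up.
double sample_misses_per_tuple(std::size_t buffer_tuples) {
    const double state_reload_misses = 4096.0; // misses to reload operator state
    const double steady_state_misses = 1.5;    // per-tuple misses once state is cached
    return steady_state_misses +
           state_reload_misses / static_cast<double>(buffer_tuples);
}

// Pick the smallest buffer size (in tuples) at which doubling the buffer no
// longer reduces misses per tuple by more than `epsilon`: beyond that point a
// larger buffer costs more intermediate storage for little extra amortization.
std::size_t choose_buffer_size(std::size_t min_tuples, std::size_t max_tuples,
                               double epsilon) {
    std::size_t size = min_tuples;
    double current = sample_misses_per_tuple(size);
    while (size * 2 <= max_tuples) {
        double next = sample_misses_per_tuple(size * 2);
        if (current - next < epsilon) break; // diminishing returns: stop growing
        size *= 2;
        current = next;
    }
    return size;
}

int main() {
    std::size_t chosen = choose_buffer_size(1024, 1u << 20, 0.05);
    std::printf("chosen buffer size: %zu tuples (%.3f misses/tuple)\n",
                chosen, sample_misses_per_tuple(chosen));
    return 0;
}
```

With the illustrative constants above, the loop stops once doubling the buffer saves fewer than 0.05 misses per tuple, matching the abstract's point that buffer size should balance cache miss amortization against buffer memory requirements.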