International Workshop on Data Management on New Hardware: Latest Publications

Frequent itemset mining on graphics processors
International Workshop on Data Management on New Hardware · Pub Date: 2009-06-28 · DOI: 10.1145/1565694.1565702
Wenbin Fang, Mian Lu, Xiangye Xiao, Bingsheng He, Qiong Luo
Abstract: We present two efficient Apriori implementations of Frequent Itemset Mining (FIM) that utilize new-generation graphics processing units (GPUs). Our implementations take advantage of the GPU's massively multi-threaded SIMD (Single Instruction, Multiple Data) architecture. Both implementations employ a bitmap data structure to exploit the GPU's SIMD parallelism and to accelerate the frequency counting operation. One implementation runs entirely on the GPU and eliminates intermediate data transfer between the GPU memory and the CPU memory. The other implementation employs both the GPU and the CPU for processing; it represents itemsets in a trie, and uses the CPU for trie traversal and incremental maintenance. Our preliminary results show that both implementations achieve a speedup of up to two orders of magnitude over optimized CPU Apriori implementations on a PC with an NVIDIA GTX 280 GPU and a quad-core CPU.
Citations: 133
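
The bitmap structure the abstract describes maps each item to a bit vector over transactions, so counting the support of a candidate itemset reduces to a bitwise AND followed by a population count, which is what maps well onto SIMD lanes. A minimal single-threaded sketch of that idea (names and structure are illustrative, not taken from the paper):

```python
def build_bitmaps(transactions, items):
    """Map each item to a Python int used as a bit vector over transactions:
    bit t is set iff transaction t contains the item."""
    bitmaps = {item: 0 for item in items}
    for tid, txn in enumerate(transactions):
        for item in txn:
            if item in bitmaps:
                bitmaps[item] |= 1 << tid
    return bitmaps

def support(itemset, bitmaps):
    """Support of an itemset = popcount of the AND of its items' bit vectors."""
    it = iter(itemset)
    acc = bitmaps[next(it)]
    for item in it:
        acc &= bitmaps[item]       # narrow to transactions containing all items
    return bin(acc).count("1")     # population count

transactions = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}]
bitmaps = build_bitmaps(transactions, {"a", "b", "c"})
```

On a GPU, each thread would AND and popcount a different word of the bit vectors in parallel; here the arbitrary-precision int stands in for the whole vector.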
Join processing for flash SSDs: remembering past lessons
International Workshop on Data Management on New Hardware · Pub Date: 2009-06-28 · DOI: 10.1145/1565694.1565696
Jaeyoung Do, J. Patel
Abstract: Flash solid state drives (SSDs) provide an attractive alternative to traditional magnetic hard disk drives (HDDs) for DBMS applications. Naturally there is substantial interest in redesigning critical database internals, such as join algorithms, for flash SSDs. However, we must carefully consider the lessons that we have learnt from over three decades of designing and tuning algorithms for magnetic HDD-based systems, so that we can continue to reuse techniques that worked for magnetic HDDs and that also work well with flash SSDs. The focus of this paper is on recalling some of these lessons in the context of ad hoc join algorithms. Based on an actual implementation of four common ad hoc join algorithms on both a magnetic HDD and a flash SSD, we show that many of the "surprising" results from magnetic HDD-based join methods also hold for flash SSDs. These results include the superiority of block nested loops join over sort-merge join and Grace hash join in many cases, and the benefits of blocked I/Os. In addition, we find that simply looking at the I/O costs when designing new flash SSD join algorithms can be problematic, as the CPU cost is often a bigger component of the total join cost with SSDs. We hope that these results provide insights and better starting points for researchers designing new join algorithms for flash SSDs.
Citations: 38
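
Block nested loops join, which the abstract singles out, reads the outer relation in memory-sized blocks and scans the inner relation once per block, so the inner is re-read only len(outer)/block_size times rather than once per outer tuple. A minimal in-memory sketch (not the paper's implementation; function and parameter names are illustrative):

```python
def block_nested_loops_join(outer, inner, key_outer, key_inner, block_size):
    """Join two lists of tuples. For each block of the outer relation,
    build an in-memory hash table on the join key, then scan the inner
    relation once, emitting matching pairs."""
    results = []
    for start in range(0, len(outer), block_size):
        block = outer[start:start + block_size]
        table = {}
        for row in block:                      # build phase for this block
            table.setdefault(key_outer(row), []).append(row)
        for irow in inner:                     # one inner scan per block
            for orow in table.get(key_inner(irow), []):
                results.append((orow, irow))
    return results
```

With block_size equal to the whole outer relation this degenerates into a classic hash join; smaller blocks trade inner-relation rescans for a smaller memory footprint.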
Cache-conscious buffering for database operators with state
International Workshop on Data Management on New Hardware · Pub Date: 2009-06-28 · DOI: 10.1145/1565694.1565704
J. Cieslewicz, William Mee, K. A. Ross
Abstract: Database processes must be cache-efficient to effectively utilize modern hardware. In this paper, we analyze the importance of temporal locality and the resultant cache behavior in scheduling database operators for in-memory, block-oriented query processing. We demonstrate how the overall performance of a workload of multiple database operators is strongly dependent on how they are interleaved with each other. Longer time slices combined with temporal locality within an operator amortize the effects of the initial compulsory cache misses needed to load the operator's state, such as a hash table, into the cache. Though running an operator to completion over all of its input results in the greatest amortization of cache misses, this is typically infeasible because of the large intermediate storage requirement to materialize all input tuples to an operator. We show experimentally that good cache performance can be obtained with smaller buffers whose size is determined at runtime. We demonstrate a low-overhead method of runtime cache miss sampling using hardware performance counters. Our evaluation considers two common database operators with state: aggregation and hash join. Sampling reveals operator temporal locality and cache miss behavior, and we use those characteristics to choose an appropriate input buffer/block size. The calculated buffer size balances cache miss amortization with buffer memory requirements.
Citations: 20
k-ary search on modern processors
International Workshop on Data Management on New Hardware · Pub Date: 2009-06-28 · DOI: 10.1145/1565694.1565705
B. Schlegel, Rainer Gemulla, Wolfgang Lehner
Abstract: This paper presents novel tree-based search algorithms that exploit the SIMD instructions found in virtually all modern processors. The algorithms are a natural extension of binary search: while binary search performs one comparison at each iteration, thereby cutting the search space in two halves, our algorithms perform k comparisons at a time and thus cut the search space into k pieces. On traditional processors, this so-called k-ary search procedure is not beneficial because the cost increase per iteration offsets the cost reduction due to the reduced number of iterations. On modern processors, however, multiple scalar operations can be executed simultaneously, which makes k-ary search attractive. In this paper, we provide two different search algorithms that differ in terms of efficiency and memory access patterns. Both algorithms are first described in a platform-independent way and then evaluated on various state-of-the-art processors. Our experiments suggest that k-ary search provides significant performance improvements (a factor of two and more) on most platforms.
Citations: 66
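
The control flow of k-ary search can be sketched without SIMD: each iteration picks k-1 separator elements that split the current range into k pieces, compares the target against them, and keeps the one piece that can contain the target. On SIMD hardware the k-1 comparisons execute in parallel; in this sequential Python sketch (illustrative only, not either of the paper's two algorithms) they run one after another, so only the narrowing logic is demonstrated:

```python
def k_ary_search(arr, target, k=4):
    """Find target's index in sorted arr, or -1. Binary search is the k=2
    case; here k-1 separators cut the range into k segments per iteration."""
    lo, hi = 0, len(arr)
    while lo < hi:
        step = max(1, (hi - lo) // k)
        seps = list(range(lo + step, hi, step))[: k - 1]  # separator positions
        next_lo, next_hi = lo, hi
        for s in seps:                     # in SIMD these compare in parallel
            if arr[s] == target:
                return s
            if arr[s] < target:
                next_lo = s + 1            # target lies right of this separator
            else:
                next_hi = s                # first separator above target: stop
                break
        lo, hi = next_lo, next_hi
        if hi - lo <= 1:                   # range too small to split further
            return lo if lo < len(arr) and arr[lo] == target else -1
    return -1
```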
Spinning relations: high-speed networks for distributed join processing
International Workshop on Data Management on New Hardware · Pub Date: 2009-06-28 · DOI: 10.1145/1565694.1565701
P. Frey, R. Goncalves, M. Kersten, J. Teubner
Abstract: By leveraging modern networking hardware (RDMA-enabled network cards), we can shift priorities in distributed database processing significantly. Complex and sophisticated mechanisms to avoid network traffic can be replaced by a scheme that takes advantage of the bandwidth and low latency offered by such interconnects. We illustrate this phenomenon with cyclo-join, an efficient join algorithm based on continuously pumping data through a ring-structured network. Our approach is capable of exploiting the resources of all CPUs and all distributed main memory available in the network for processing queries of arbitrary shape and datasets of arbitrary size.
Citations: 36
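
The ring-pumping idea can be simulated in a single process: fragments of one relation stay pinned to their nodes while fragments of the other rotate around the ring, so after n hops every pair of fragments has met. A toy sketch under that reading of the abstract (the real system streams fragments over RDMA; names here are illustrative):

```python
def cyclo_join(r_frags, s_frags, key_r, key_s):
    """Simulate cyclo-join on a ring of n nodes: R fragments stay put,
    S fragments rotate one node per hop; each node joins its local R
    fragment with whichever S fragment is currently passing through."""
    n = len(r_frags)
    out = []
    for hop in range(n):
        for node in range(n):
            s_frag = s_frags[(node + hop) % n]   # S fragment at this node now
            for r in r_frags[node]:
                for s in s_frag:
                    if key_r(r) == key_s(s):
                        out.append((r, s))
    return out
```

Because every S fragment visits every node exactly once, the result is complete without any node ever holding more than one remote fragment at a time.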
Evaluating and repairing write performance on flash devices
International Workshop on Data Management on New Hardware · Pub Date: 2009-06-28 · DOI: 10.1145/1565694.1565697
R. Stoica, Manos Athanassoulis, Ryan Johnson, A. Ailamaki
Abstract: In the last few years NAND flash storage has become more and more popular as price per GB and capacity both improve at exponential rates. Flash memory offers significant benefits compared to magnetic hard disk drives (HDDs), and DBMSs are highly likely to use flash as a general storage backend, either alone or in heterogeneous storage solutions with HDDs. Flash devices, however, respond quite differently than HDDs for common access patterns, and recent research shows a strong asymmetry between read and write performance. Moreover, flash storage devices behave unpredictably, showing a high dependence on previous I/O history and usage patterns. In this paper we investigate how a DBMS can overcome these issues to take full advantage of flash memory as persistent storage. We propose a new flash-aware data layout, append and pack, which stabilizes device performance by eliminating random writes. We assess the impact of append and pack on OLTP workload performance using both an analytical model and micro-benchmarks, and our results suggest that significant improvements can be achieved for real workloads.
Citations: 59
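
The core of any append-style layout is that a logical page is never overwritten in place: each write goes to the tail of a log and a mapping table is updated, so the device only ever sees sequential writes. A minimal log-structured sketch of that principle (the paper's layout also packs pages and reclaims space, which this toy omits):

```python
class AppendOnlyStore:
    """Toy append-only page store: random logical writes become
    sequential physical appends, indirected through a mapping table."""
    def __init__(self):
        self.log = []        # physical storage: strictly append-only
        self.mapping = {}    # logical page id -> offset of newest copy

    def write(self, page_id, data):
        """Append the new page version; stale copies remain in the log
        until a (not modeled here) reclamation pass removes them."""
        self.mapping[page_id] = len(self.log)
        self.log.append(data)

    def read(self, page_id):
        return self.log[self.mapping[page_id]]
```

The trade-off is visible even in the toy: writes are always sequential, but overwritten pages leave garbage in the log that must eventually be reclaimed.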
A new look at the roles of spinning and blocking
International Workshop on Data Management on New Hardware · Pub Date: 2009-06-28 · DOI: 10.1145/1565694.1565700
Ryan Johnson, Manos Athanassoulis, R. Stoica, A. Ailamaki
Abstract: Database engines face growing scalability challenges as core counts increase exponentially with each processor generation, and the efficiency of the synchronization primitives used to protect internal data structures is a crucial factor in overall database performance. The trade-offs between different implementation approaches for these primitives shift significantly with increasing degrees of available hardware parallelism. Blocking synchronization, which has long been the favored approach in database systems, becomes increasingly unattractive as growing core counts expose its bottlenecks. Spinning implementations improve peak system throughput by a factor of 2x or more for 64 hardware contexts, but suffer from poor performance under load. In this paper we analyze the shifting trade-off between spinning and blocking synchronization, and observe that the trade-off can be simplified by isolating the load control aspects of contention management and treating the two problems separately: spinning-based contention management and blocking-based load control. We then present a proof-of-concept implementation that, for high concurrency, matches or exceeds the performance of both user-level spin-locks and the pthread mutex under a wide range of load factors.
Citations: 46
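
The spin-then-yield hybrid that motivates this line of work can be sketched compactly: spin briefly on the assumption that the holder releases soon (cheap contention management), then fall back to giving up the CPU (crude load control). This is a generic illustration of the trade-off, not the paper's proof-of-concept implementation; the parameters and structure are assumptions:

```python
import threading
import time

class SpinThenBlockLock:
    """Toy hybrid lock: spin a bounded number of times, then back off by
    sleeping so other threads can run. Real implementations spin on a raw
    atomic flag; here a mutex guards the flag for correctness."""
    def __init__(self, spin_iters=1000):
        self._spin_iters = spin_iters
        self._flag = False                 # True while the lock is held
        self._mutex = threading.Lock()     # protects _flag transitions

    def _try_set(self):
        with self._mutex:
            if not self._flag:
                self._flag = True
                return True
            return False

    def acquire(self):
        # Fast path: spin, hoping the holder releases within a few cycles.
        for _ in range(self._spin_iters):
            if not self._flag and self._try_set():
                return
        # Slow path: stop burning CPU and back off between attempts.
        while not self._try_set():
            time.sleep(0.0001)

    def release(self):
        with self._mutex:
            self._flag = False
```

The paper's point is precisely that these two paths serve different purposes (contention management vs. load control) and are better tuned independently than fused into one heuristic like this.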
CFDC: a flash-aware replacement policy for database buffer management
International Workshop on Data Management on New Hardware · Pub Date: 2009-06-28 · DOI: 10.1145/1565694.1565698
Y. Ou, T. Härder, Peiquan Jin
Abstract: Flash disks are becoming an important alternative to conventional magnetic disks. Although accessed through the same interface by applications, flash disks have some distinguishing characteristics that make it necessary to reconsider the design of the software to leverage their performance potential. This paper addresses this problem at the buffer management layer of database systems and proposes a flash-aware replacement policy that significantly improves upon and outperforms one of the previous proposals in this area.
Citations: 73
Data partitioning on chip multiprocessors
International Workshop on Data Management on New Hardware · Pub Date: 2008-06-13 · DOI: 10.1145/1457150.1457156
J. Cieslewicz, K. A. Ross
Abstract: Partitioning is a key database task. In this paper we explore partitioning performance on a chip multiprocessor (CMP) that provides a relatively high degree of on-chip thread-level parallelism. It is therefore important to implement the partitioning algorithm to take advantage of the CMP's parallel execution resources. We identify the coordination of writing partition output as the main challenge in a parallel partitioning implementation and evaluate four techniques for enabling parallel partitioning. We confirm previous work in single-threaded partitioning that finds L2 cache misses and translation lookaside buffer misses to be important performance issues, but we now add the management of concurrent threads to this analysis.
Citations: 42
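
One standard answer to the output-coordination challenge the abstract names is to give every thread private per-partition buffers and concatenate them at the end, trading a merge pass for the absence of synchronization on shared outputs. A sketch of that one technique (the paper evaluates four; this code is illustrative, not theirs):

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_partition(rows, num_partitions, num_threads, part_fn):
    """Partition rows across num_partitions using num_threads workers.
    Each worker fills private per-partition buffers (no locking), which
    are concatenated per partition afterwards."""
    chunks = [rows[i::num_threads] for i in range(num_threads)]

    def work(chunk):
        local = [[] for _ in range(num_partitions)]
        for row in chunk:
            local[part_fn(row)].append(row)   # private buffer: no contention
        return local

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        per_thread = list(pool.map(work, chunks))

    # Merge phase: concatenate every thread's buffer for each partition.
    return [sum((loc[p] for loc in per_thread), [])
            for p in range(num_partitions)]
```

The cost of this scheme is memory proportional to num_threads × num_partitions buffers, which is one reason the paper's analysis of cache and TLB behavior matters when choosing among the techniques.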
Critical sections: re-emerging scalability concerns for database storage engines
International Workshop on Data Management on New Hardware · Pub Date: 2008-06-13 · DOI: 10.1145/1457150.1457157
Ryan Johnson, I. Pandis, A. Ailamaki
Abstract: Critical sections in database storage engines impact performance and scalability more as the number of hardware contexts per chip continues to grow exponentially. With enough threads in the system, some critical section will eventually become a bottleneck. While algorithmic changes are the only long-term solution, they tend to be complex and costly to develop. Meanwhile, changes in the enforcement of critical sections require much less effort. We observe that, in practice, many critical sections are so short that enforcing them contributes a significant or even dominating fraction of their total cost, and tuning them directly improves database system performance. The contribution of this paper is two-fold: we (a) make a thorough performance comparison of the various synchronization primitives in the database system developer's toolbox and highlight the best ones for practical use, and (b) show that properly enforcing critical sections can delay the need to make algorithmic changes for a target number of processors.
Citations: 24