Proceedings of the 3rd VLDB Workshop on In-Memory Data Mangement and Analytics: Latest Papers

Cost-based Memory Partitioning and Management in Memcached
Proceedings of the 3rd VLDB Workshop on In-Memory Data Mangement and Analytics Pub Date : 2015-08-31 DOI: 10.1145/2803140.2803146
D. Carra, P. Michiardi
{"title":"Cost-based Memory Partitioning and Management in Memcached","authors":"D. Carra, P. Michiardi","doi":"10.1145/2803140.2803146","DOIUrl":"https://doi.org/10.1145/2803140.2803146","url":null,"abstract":"In this work we present a cost-based memory partitioning and management mechanism for Memcached, an in-memory key-value store used as Web cache, that is able to dynamically adapt to user requests and manage the memory according to both object sizes and costs. We then present a comparative analysis of the vanilla memory management scheme of Memcached and our approach, using real traces from a major content delivery network operator. Our results indicate that our scheme achieves near-optimal performance, striking a good balance between the performance perceived by end-users and the pressure imposed on back-end servers.","PeriodicalId":175654,"journal":{"name":"Proceedings of the 3rd VLDB Workshop on In-Memory Data Mangement and Analytics","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123383404","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Gaussian Mixture Models Use-Case: In-Memory Analysis with Myria
Proceedings of the 3rd VLDB Workshop on In-Memory Data Mangement and Analytics Pub Date : 2015-08-31 DOI: 10.1145/2803140.2803143
R. Maas, Jeremy Hyrkas, O. Telford, M. Balazinska, A. Connolly, Bill Howe
{"title":"Gaussian Mixture Models Use-Case: In-Memory Analysis with Myria","authors":"R. Maas, Jeremy Hyrkas, O. Telford, M. Balazinska, A. Connolly, Bill Howe","doi":"10.1145/2803140.2803143","DOIUrl":"https://doi.org/10.1145/2803140.2803143","url":null,"abstract":"In our work with scientists, we find that Gaussian Mixture Modeling is a common type of analysis applied to increasingly large datasets. We implement this algorithm in the Myria shared-nothing relational data management system, which performs the computation in memory. We study resulting memory utilization challenges and implement several optimizations that yield an efficient and scalable solution. Empirical evaluations on large astronomy and oceanography datasets confirm that our Myria approach scales well and performs up to an order of magnitude faster than Hadoop.","PeriodicalId":175654,"journal":{"name":"Proceedings of the 3rd VLDB Workshop on In-Memory Data Mangement and Analytics","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121558297","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
Query Optimization Time: The New Bottleneck in Real-time Analytics
Proceedings of the 3rd VLDB Workshop on In-Memory Data Mangement and Analytics Pub Date : 2015-08-31 DOI: 10.1145/2803140.2803148
Rajkumar Sen, Jack Chen, Nika Jimsheleishvilli
{"title":"Query Optimization Time: The New Bottleneck in Real-time Analytics","authors":"Rajkumar Sen, Jack Chen, Nika Jimsheleishvilli","doi":"10.1145/2803140.2803148","DOIUrl":"https://doi.org/10.1145/2803140.2803148","url":null,"abstract":"In the recent past, in-memory distributed database management systems have become increasingly popular to manage and query huge amounts of data. For an in-memory distributed database like MemSQL, it is imperative that the analytical queries run fast. A huge proportion of MemSQL's customer workloads have ad-hoc analytical queries that need to finish execution within a second or a few seconds. This leaves us with very little time to perform query optimization for complex queries involving several joins, aggregations, sub-queries etc. Even for queries that are not ad-hoc, a change in data statistics can trigger query re-optimization. Query Optimization, if not done intelligently, could very well be the bottleneck for such complex analytical queries that require real-time response. In this paper, we outline some of the early steps that we have taken to reduce the query optimization time without sacrificing plan quality. We optimized the Enumerator (the optimizer component that determines operator order), which takes up bulk of the optimization time. Generating bushy plans inside the Enumerator can be a bottleneck and so we used heuristics to generate bushy plans via query rewrite. We also implemented new distribution aware greedy heuristics to generate a good starting candidate plan that significantly prunes out states during search space analysis inside the Enumerator. We demonstrate the effectiveness of these techniques over several queries in TPC-H and TPC-DS benchmarks.","PeriodicalId":175654,"journal":{"name":"Proceedings of the 3rd VLDB Workshop on In-Memory Data Mangement and Analytics","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131379802","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
NVC-Hashmap: A Persistent and Concurrent Hashmap For Non-Volatile Memories
Proceedings of the 3rd VLDB Workshop on In-Memory Data Mangement and Analytics Pub Date : 2015-08-31 DOI: 10.1145/2803140.2803144
David Schwalb, Markus Dreseler, M. Uflacker, H. Plattner
{"title":"NVC-Hashmap: A Persistent and Concurrent Hashmap For Non-Volatile Memories","authors":"David Schwalb, Markus Dreseler, M. Uflacker, H. Plattner","doi":"10.1145/2803140.2803144","DOIUrl":"https://doi.org/10.1145/2803140.2803144","url":null,"abstract":"Non-volatile RAM (NVRAM) will fundamentally change in-memory databases as data structures do not have to be explicitly backed up to hard drives or SSDs, but can be inherently persistent in main memory. To guarantee consistency even in the case of power failures, programmers need to ensure that data is flushed from volatile CPU caches where it would be susceptible to power outages to NVRAM. In this paper, we present the NVC-Hashmap, a lock-free hashmap that is used for unordered dictionaries and delta indices in in-memory databases. The NVC-Hashmap is then evaluated in both stand-alone and integrated database benchmarks and compared to a B+-Tree based persistent data structure.","PeriodicalId":175654,"journal":{"name":"Proceedings of the 3rd VLDB Workshop on In-Memory Data Mangement and Analytics","volume":"304 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134431983","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 51
Write Amplification: An Analysis of In-Memory Database Durability Techniques
Proceedings of the 3rd VLDB Workshop on In-Memory Data Mangement and Analytics Pub Date : 2015-08-31 DOI: 10.1145/2803140.2803141
Jaemyung Kim, K. Salem, Khuzaima S. Daudjee
{"title":"Write Amplification: An Analysis of In-Memory Database Durability Techniques","authors":"Jaemyung Kim, K. Salem, Khuzaima S. Daudjee","doi":"10.1145/2803140.2803141","DOIUrl":"https://doi.org/10.1145/2803140.2803141","url":null,"abstract":"Modern in-memory database systems perform transactions an order of magnitude faster than conventional database systems. While in-memory database systems can read the database without I/O, database updates can generate a substantial amount of I/O, since updates must normally be written to persistent secondary storage to ensure that they are durable. In this paper we present a study of storage managers for in-memory database systems, with the goal of characterizing their I/O efficiency. We model the storage efficiency of two classes of storage managers: those that perform in-place updates in secondary storage, and those that use copy-on-write. Our models allow us to make meaningful, quantitative comparisons of storage managers' I/O efficiencies under a variety of conditions.","PeriodicalId":175654,"journal":{"name":"Proceedings of the 3rd VLDB Workshop on In-Memory Data Mangement and Analytics","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132194716","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Partitioned Bit-Packed Vectors for In-Memory-Column-Stores
Proceedings of the 3rd VLDB Workshop on In-Memory Data Mangement and Analytics Pub Date : 2015-08-31 DOI: 10.1145/2803140.2803142
Martin Faust, Pedro Flemming, David Schwalb, H. Plattner
{"title":"Partitioned Bit-Packed Vectors for In-Memory-Column-Stores","authors":"Martin Faust, Pedro Flemming, David Schwalb, H. Plattner","doi":"10.1145/2803140.2803142","DOIUrl":"https://doi.org/10.1145/2803140.2803142","url":null,"abstract":"In recent database development, in-memory databases have grown more and more in popularity. The hardware development of the past years has made it possible to keep even larger data sets entirely in main memory of one or a few machines. However, most applications on in-memory databases are memory-latency-bound rather than compute-bound. Combining strong compression techniques and efficient data structures is essential to fully utilize the hardware capabilities. A common data structure for efficient storing is the bit-packed vector. The bit-packed vector uses a fixed encoding length, which cannot be changed after initialization. Therefore it requires full re-initialization, when the encoding-length changes. In this paper we propose a new data structure, the partitioned bit-packed vector. Therein the encoding length of the stored elements may increase dynamically, while still providing comparable single-value access performance. This paper outlines the access to this data structure and evaluates its performance characteristics. The results suggest that the partitioned bitvector has the capabilities to improve the performance of existing in-memory column-stores for typical enterprise workloads.","PeriodicalId":175654,"journal":{"name":"Proceedings of the 3rd VLDB Workshop on In-Memory Data Mangement and Analytics","volume":"98 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114094396","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Hyrise-R: Scale-out and Hot-Standby through Lazy Master Replication for Enterprise Applications
Proceedings of the 3rd VLDB Workshop on In-Memory Data Mangement and Analytics Pub Date : 2015-08-31 DOI: 10.1145/2803140.2803147
David Schwalb, Jan Kossmann, Martin Faust, Stefan Klauck, M. Uflacker, H. Plattner
{"title":"Hyrise-R: Scale-out and Hot-Standby through Lazy Master Replication for Enterprise Applications","authors":"David Schwalb, Jan Kossmann, Martin Faust, Stefan Klauck, M. Uflacker, H. Plattner","doi":"10.1145/2803140.2803147","DOIUrl":"https://doi.org/10.1145/2803140.2803147","url":null,"abstract":"In-memory database systems are well-suited for enterprise workloads, consisting of transactional and analytical queries. A growing number of users and an increasing demand for enterprise applications can saturate or even overload single-node database systems at peak times. Better performance can be achieved by improving a single machine's hardware but it is often cheaper and more practicable to follow a scale-out approach and replicate data by using additional machines. In this paper we present Hyrise-R, a lazy master replication system for the in-memory database Hyrise. By setting up a snapshot-based Hyrise cluster, we increase both performance by distributing queries over multiple instances and availability by utilizing the redundancy of the cluster structure. This paper describes the architecture of Hyrise-R and details of the implemented replication mechanisms. We set up Hyrise-R on instances of Amazon's Elastic Compute Cloud and present a detailed performance evaluation of our system, including a linear query throughput increase for enterprise workloads.","PeriodicalId":175654,"journal":{"name":"Proceedings of the 3rd VLDB Workshop on In-Memory Data Mangement and Analytics","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121084635","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
Selection on Modern CPUs
Proceedings of the 3rd VLDB Workshop on In-Memory Data Mangement and Analytics Pub Date : 2015-08-31 DOI: 10.1145/2803140.2803145
Steffen Zeuch, J. Freytag
{"title":"Selection on Modern CPUs","authors":"Steffen Zeuch, J. Freytag","doi":"10.1145/2803140.2803145","DOIUrl":"https://doi.org/10.1145/2803140.2803145","url":null,"abstract":"Modern processors employ sophisticated techniques such as speculative or out-of-order execution to hide memory latencies and keep their pipelines fully utilized. However, these techniques introduce high complexity and variance to query processing. In particular, these techniques are transparent to DBMS operations since they are managed by processors internally. To fully utilize the sophisticated capabilities of modern CPUs, it is necessary to understand their characteristics and adjust operators as well as cost models accordingly. In this paper, we extensively examine the execution of a relational selection operator on modern hardware in an in-depth performance analysis. We show, that branching behavior and memory exploitation are two main contributors to run-time. Based on these insights, we show how two common cost models would predict execution costs and why they fall short in determining run-time behavior for parallel execution. We reveal, that cost models which exploit only one performance parameter to determine execution costs are not able to predict the non-linear performance characteristics of modern CPUs.","PeriodicalId":175654,"journal":{"name":"Proceedings of the 3rd VLDB Workshop on In-Memory Data Mangement and Analytics","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126809823","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
Proceedings of the 3rd VLDB Workshop on In-Memory Data Mangement and Analytics
{"title":"Proceedings of the 3rd VLDB Workshop on In-Memory Data Mangement and Analytics","authors":"","doi":"10.1145/2803140","DOIUrl":"https://doi.org/10.1145/2803140","url":null,"abstract":"","PeriodicalId":175654,"journal":{"name":"Proceedings of the 3rd VLDB Workshop on In-Memory Data Mangement and Analytics","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114863487","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0