在混合工作负载的行存储和列存储之间架起桥梁

Proceedings of the 2016 International Conference on Management of Data Pub Date : 2016-06-14 DOI:10.1145/2882903.2915231

Joy Arulraj, Andrew Pavlo, Prashanth Menon

{"title":"在混合工作负载的行存储和列存储之间架起桥梁","authors":"Joy Arulraj, Andrew Pavlo, Prashanth Menon","doi":"10.1145/2882903.2915231","DOIUrl":null,"url":null,"abstract":"Data-intensive applications seek to obtain trill insights in real-time by analyzing a combination of historical data sets alongside recently collected data. This means that to support such hybrid workloads, database management systems (DBMSs) need to handle both fast ACID transactions and complex analytical queries on the same database. But the current trend is to use specialized systems that are optimized for only one of these workloads, and thus require an organization to maintain separate copies of the database. This adds additional cost to deploying a database application in terms of both storage and administration overhead. To overcome this barrier, we present a hybrid DBMS architecture that efficiently supports varied workloads on the same database. Our approach differs from previous methods in that we use a single execution engine that is oblivious to the storage layout of data without sacrificing the performance benefits of the specialized systems. This obviates the need to maintain separate copies of the database in multiple independent systems. We also present a technique to continuously evolve the database's physical storage layout by analyzing the queries' access patterns and choosing the optimal layout for different segments of data within the same table. To evaluate this work, we implemented our architecture in an in-memory DBMS. Our results show that our approach delivers up to 3x higher throughput compared to static storage layouts across different workloads. We also demonstrate that our continuous adaptation mechanism allows the DBMS to achieve a near-optimal layout for an arbitrary workload without requiring any manual tuning.","PeriodicalId":20483,"journal":{"name":"Proceedings of the 2016 International Conference on Management of Data","volume":"21 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2016-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"130","resultStr":"{\"title\":\"Bridging the Archipelago between Row-Stores and Column-Stores for Hybrid Workloads\",\"authors\":\"Joy Arulraj, Andrew Pavlo, Prashanth Menon\",\"doi\":\"10.1145/2882903.2915231\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Data-intensive applications seek to obtain trill insights in real-time by analyzing a combination of historical data sets alongside recently collected data. This means that to support such hybrid workloads, database management systems (DBMSs) need to handle both fast ACID transactions and complex analytical queries on the same database. But the current trend is to use specialized systems that are optimized for only one of these workloads, and thus require an organization to maintain separate copies of the database. This adds additional cost to deploying a database application in terms of both storage and administration overhead. To overcome this barrier, we present a hybrid DBMS architecture that efficiently supports varied workloads on the same database. Our approach differs from previous methods in that we use a single execution engine that is oblivious to the storage layout of data without sacrificing the performance benefits of the specialized systems. This obviates the need to maintain separate copies of the database in multiple independent systems. We also present a technique to continuously evolve the database's physical storage layout by analyzing the queries' access patterns and choosing the optimal layout for different segments of data within the same table. To evaluate this work, we implemented our architecture in an in-memory DBMS. Our results show that our approach delivers up to 3x higher throughput compared to static storage layouts across different workloads. We also demonstrate that our continuous adaptation mechanism allows the DBMS to achieve a near-optimal layout for an arbitrary workload without requiring any manual tuning.\",\"PeriodicalId\":20483,\"journal\":{\"name\":\"Proceedings of the 2016 International Conference on Management of Data\",\"volume\":\"21 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-06-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"130\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2016 International Conference on Management of Data\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2882903.2915231\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2016 International Conference on Management of Data","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2882903.2915231","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 130

摘要

数据密集型应用程序通过分析历史数据集和最近收集的数据，寻求实时获得令人兴奋的见解。这意味着为了支持这种混合工作负载，数据库管理系统(dbms)需要在同一数据库上处理快速ACID事务和复杂的分析查询。但是目前的趋势是使用专门的系统，这些系统只针对这些工作负载中的一种进行了优化，因此需要组织维护数据库的单独副本。这在存储和管理开销方面增加了部署数据库应用程序的额外成本。为了克服这一障碍，我们提出了一种混合DBMS体系结构，它可以有效地支持同一数据库上的各种工作负载。我们的方法与以前的方法不同，因为我们使用一个单一的执行引擎，它忽略了数据的存储布局，而不会牺牲专用系统的性能优势。这避免了在多个独立系统中维护数据库的单独副本的需要。我们还提出了一种技术，通过分析查询的访问模式，并为同一表中的不同数据段选择最佳布局，从而不断发展数据库的物理存储布局。为了评估这项工作，我们在内存DBMS中实现了我们的体系结构。我们的结果表明，与跨不同工作负载的静态存储布局相比，我们的方法提供了高达3倍的吞吐量。我们还演示了我们的连续适应机制允许DBMS在不需要任何手动调优的情况下为任意工作负载实现近乎最佳的布局。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Bridging the Archipelago between Row-Stores and Column-Stores for Hybrid Workloads

Data-intensive applications seek to obtain trill insights in real-time by analyzing a combination of historical data sets alongside recently collected data. This means that to support such hybrid workloads, database management systems (DBMSs) need to handle both fast ACID transactions and complex analytical queries on the same database. But the current trend is to use specialized systems that are optimized for only one of these workloads, and thus require an organization to maintain separate copies of the database. This adds additional cost to deploying a database application in terms of both storage and administration overhead. To overcome this barrier, we present a hybrid DBMS architecture that efficiently supports varied workloads on the same database. Our approach differs from previous methods in that we use a single execution engine that is oblivious to the storage layout of data without sacrificing the performance benefits of the specialized systems. This obviates the need to maintain separate copies of the database in multiple independent systems. We also present a technique to continuously evolve the database's physical storage layout by analyzing the queries' access patterns and choosing the optimal layout for different segments of data within the same table. To evaluate this work, we implemented our architecture in an in-memory DBMS. Our results show that our approach delivers up to 3x higher throughput compared to static storage layouts across different workloads. We also demonstrate that our continuous adaptation mechanism allows the DBMS to achieve a near-optimal layout for an arbitrary workload without requiring any manual tuning.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings of the 2016 International Conference on Management of Data

自引率

0.00%

发文量