Generalized scale independence through incremental precomputation

Proceedings. ACM-SIGMOD International Conference on Management of Data. Pub Date: 2013-06-22. DOI: 10.1145/2463676.2465333

Michael Armbrust, Eric Liang, Tim Kraska, A. Fox, M. Franklin, D. Patterson

Pages: 625-636

Citations: 41

Abstract

Developers of rapidly growing applications must be able to anticipate potential scalability problems before they cause performance issues in production environments. A new type of data independence, called scale independence, seeks to address this challenge by guaranteeing a bounded amount of work is required to execute all queries in an application, independent of the size of the underlying data. While optimization strategies have been developed to provide these guarantees for the class of queries that are scale-independent when executed using simple indexes, there are important queries for which such techniques are insufficient. Executing these more complex queries scale-independently requires precomputation using incrementally-maintained materialized views. However, since this precomputation effectively shifts some of the query processing burden from execution time to insertion time, a scale-independent system must be careful to ensure that storage and maintenance costs do not threaten scalability. In this paper, we describe a scale-independent view selection and maintenance system, which uses novel static analysis techniques that ensure that created views do not themselves become scaling bottlenecks. Finally, we present an empirical analysis that includes all the queries from the TPC-W benchmark and validates our implementation's ability to maintain nearly constant high-quantile query and update latency even as an application scales to hundreds of machines.
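The trade-off described in the abstract, moving query work from execution time to insertion time, can be illustrated with a toy sketch. This is not the paper's system: the `OrderStore` class, its method names, and the per-customer total query are hypothetical, chosen only to contrast a scan whose cost grows with the data against a lookup in an incrementally maintained view.

```python
from collections import defaultdict


class OrderStore:
    """Toy in-memory store with one incrementally maintained view.

    Hypothetical illustration only; it does not reflect the paper's
    view selection or static analysis.
    """

    def __init__(self):
        self.orders = []                            # base table: (customer_id, amount)
        self.total_by_customer = defaultdict(int)   # materialized view

    def insert(self, customer_id, amount):
        # Each insert does a constant amount of extra work to keep the view current.
        self.orders.append((customer_id, amount))
        self.total_by_customer[customer_id] += amount

    def total_scan(self, customer_id):
        # Without the view: work grows with the size of the base table.
        return sum(a for c, a in self.orders if c == customer_id)

    def total_view(self, customer_id):
        # With the view: one lookup, independent of the size of the base table.
        return self.total_by_customer.get(customer_id, 0)


store = OrderStore()
for customer, amount in [("alice", 10), ("bob", 5), ("alice", 7)]:
    store.insert(customer, amount)

# Both paths return the same answer; only the amount of work differs.
print(store.total_scan("alice"), store.total_view("alice"))
```

In this sketch the view stores one entry per customer and each insert touches exactly one entry, so both storage and maintenance cost stay bounded per update. A view whose maintenance touched an unbounded number of entries per insert would be the kind of scaling bottleneck the abstract warns about.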

