Tutorial: Adaptive Replication and Partitioning in Data Systems

International Middleware Conference. Publication date: 2018-12-10. DOI: 10.1145/3279945.3279946

Brad Glasbergen, Michael Abebe, Khuzaima S. Daudjee


Citation count: 2

Abstract

To meet growing application demands, distributed data systems replicate and partition data across multiple machines. Replication increases the resource and request processing capabilities of a system by spreading copies of the data across multiple machines, while partitioning splits data across machines to achieve the same objectives. Replication and partitioning present different trade-offs in the form of replication maintenance and multi-machine coordination costs, which system administrators must carefully evaluate. Traditionally, administrators made replication and partitioning decisions based on their understanding of the application workload, which results in suboptimal performance if the system is misconfigured or if the workload changes. However, systems that adaptively employ replication and partitioning can adjust these decisions based on workload observations and predictions, which improves performance and reduces complexity for administrators. In this tutorial, we present an overview of techniques used by systems to adaptively partition and replicate data and services. We focus on the decision-making strategies employed by these systems, and how these decisions are executed in an online environment. Finally, we identify opportunities for research in the area.
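The contrast between the two techniques, and the idea of choosing between them from workload observations, can be illustrated with a small sketch. This is not a method taken from the tutorial: the `ItemStats` record, the `choose_strategy` and `place` functions, and the 0.9 read-ratio threshold are all hypothetical names and values chosen for illustration. The sketch assumes that read-heavy items benefit from copies on every machine, while write-heavy items are cheaper to keep on a single machine so that no replica maintenance is needed.

```python
import zlib
from dataclasses import dataclass


@dataclass
class ItemStats:
    """Hypothetical per-item workload counters for one observation window."""
    reads: int
    writes: int


def choose_strategy(stats: ItemStats, read_ratio_threshold: float = 0.9) -> str:
    """Pick 'replicate' for read-heavy items, 'partition' otherwise.

    The threshold is an assumed tuning knob, not a value from the tutorial.
    """
    total = stats.reads + stats.writes
    if total == 0:
        # No observations yet: default to a single copy.
        return "partition"
    read_ratio = stats.reads / total
    return "replicate" if read_ratio >= read_ratio_threshold else "partition"


def place(key: str, strategy: str, machines: list[str]) -> list[str]:
    """Return the machines that should hold the item under the given strategy."""
    if strategy == "replicate":
        # Replication: every machine stores a copy and can serve reads.
        return list(machines)
    # Partitioning: exactly one machine owns the item, chosen by a stable hash.
    index = zlib.crc32(key.encode("utf-8")) % len(machines)
    return [machines[index]]


if __name__ == "__main__":
    machines = ["m0", "m1", "m2"]
    observed = {
        "catalog": ItemStats(reads=95, writes=5),
        "cart": ItemStats(reads=50, writes=50),
    }
    for key, stats in observed.items():
        strategy = choose_strategy(stats)
        print(key, strategy, place(key, strategy, machines))
```

Re-running `choose_strategy` on fresh counters in each observation window is what would make such a scheme adaptive. A real system would also have to account for the cost of moving data when a decision changes, which this sketch ignores.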
