CAM: a topology aware minimum cost flow based resource manager for MapReduce applications in the cloud

IEEE International Symposium on High-Performance Parallel Distributed Computing. Pub Date: 2012-06-18. DOI: 10.1145/2287076.2287110

Min Li, Dinesh Subhraveti, A. Butt, Aleksandr Khasymski, P. Sarkar


Citations: 45

Abstract

MapReduce has emerged as a prevailing distributed computation paradigm for enterprise and large-scale data-intensive computing. The model is also increasingly used in the massively parallel cloud environment, where MapReduce jobs run on a set of virtual machines (VMs) on a pay-as-needed basis. However, MapReduce jobs suffer from performance degradation when running in the cloud due to inefficient resource allocation. In particular, the MapReduce model is designed for native clusters and leverages information from them to operate efficiently, whereas the cloud presents a virtual cluster topology that overlays or hides the actual network information. This results in two placement anomalies: loss of data locality and loss of job locality, where jobs are placed physically away from their data or from other associated jobs, adversely affecting their performance. In this paper, we propose CAM, a cloud platform that provides an innovative resource scheduler designed specifically for hosting MapReduce applications in the cloud. CAM reconciles both data and VM resource allocation with a variety of competing constraints, such as storage utilization, changing CPU load, and network link capacities. CAM uses a flow-network-based algorithm that optimizes MapReduce performance under the specified constraints, not only through initial placement but also by readjusting through VM and data migration. Additionally, our platform exposes otherwise hidden lower-level topology information to the MapReduce job scheduler so that it can make optimal task assignments. Evaluation of CAM using both micro-benchmarks and simulations on a 23-VM cluster shows that, compared to a state-of-the-art resource allocator, our system reduces network traffic and average MapReduce job execution time by factors of 3 and 8.6, respectively.
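To make the idea of flow-network-based placement concrete, the following is a minimal sketch of a generic minimum-cost flow formulation, not CAM's actual algorithm or cost model, which the abstract does not specify. It assumes each VM must be placed on exactly one host, each host has a fixed number of free slots, and `cost[i][j]` is a hypothetical penalty (for example, network distance to the job's data) for running VM `i` on host `j`. The names `MinCostFlow` and `place_vms` are illustrative only.

```python
from collections import deque


class MinCostFlow:
    """Successive-shortest-path min-cost flow (queue-based Bellman-Ford)."""

    def __init__(self, n):
        self.n = n
        # Each edge is [to, residual_capacity, cost, index_of_reverse_edge].
        self.graph = [[] for _ in range(n)]

    def add_edge(self, u, v, cap, cost):
        self.graph[u].append([v, cap, cost, len(self.graph[v])])
        self.graph[v].append([u, 0, -cost, len(self.graph[u]) - 1])

    def solve(self, s, t, max_flow):
        flow = total_cost = 0
        while flow < max_flow:
            dist = [float("inf")] * self.n
            dist[s] = 0
            prev = [None] * self.n  # (previous node, edge index) on the path
            in_queue = [False] * self.n
            queue = deque([s])
            while queue:
                u = queue.popleft()
                in_queue[u] = False
                for i, (v, cap, c, _) in enumerate(self.graph[u]):
                    if cap > 0 and dist[u] + c < dist[v]:
                        dist[v] = dist[u] + c
                        prev[v] = (u, i)
                        if not in_queue[v]:
                            in_queue[v] = True
                            queue.append(v)
            if prev[t] is None:
                break  # no augmenting path left
            # Find the bottleneck capacity along the cheapest path.
            push = max_flow - flow
            v = t
            while v != s:
                u, i = prev[v]
                push = min(push, self.graph[u][i][1])
                v = u
            # Push flow along the path and update residual capacities.
            v = t
            while v != s:
                u, i = prev[v]
                edge = self.graph[u][i]
                edge[1] -= push
                self.graph[v][edge[3]][1] += push
                v = u
            flow += push
            total_cost += push * dist[t]
        return flow, total_cost


def place_vms(cost, slots):
    """Place VMs on hosts; cost[i][j] and slots[j] are assumed inputs."""
    n_vm, n_host = len(cost), len(slots)
    src, sink = 0, 1 + n_vm + n_host
    net = MinCostFlow(sink + 1)
    for i in range(n_vm):
        net.add_edge(src, 1 + i, 1, 0)  # each VM is placed once
        for j in range(n_host):
            net.add_edge(1 + i, 1 + n_vm + j, 1, cost[i][j])
    for j in range(n_host):
        net.add_edge(1 + n_vm + j, sink, slots[j], 0)  # host capacity
    flow, total = net.solve(src, sink, n_vm)
    assignment = {}
    for i in range(n_vm):
        for v, cap, _, _ in net.graph[1 + i]:
            if v > n_vm and cap == 0:  # saturated VM -> host edge
                assignment[i] = v - 1 - n_vm
    return flow, total, assignment


# Three VMs, two hosts: host 0 has 1 free slot, host 1 has 2.
flow, total, assignment = place_vms([[0, 4], [0, 1], [3, 0]], [1, 2])
print(flow, total, assignment)  # 3 1 {0: 0, 1: 1, 2: 1}
```

In this toy instance, VMs 0 and 1 both prefer host 0, but only one slot is free there. The flow solution moves VM 1 to host 1 because that costs less than moving VM 0. This is the kind of global trade-off a flow formulation can capture and a greedy per-VM choice can miss.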

