Performance Management of Accelerated MapReduce Workloads in Heterogeneous Clusters

2010 39th International Conference on Parallel Processing. Pub Date: 2010-09-13. DOI: 10.1109/ICPP.2010.73

Jordà Polo, David Carrera, Y. Becerra, Vicenç Beltran, J. Torres, E. Ayguadé


Citations: 64

Abstract

Next-generation data centers will be composed of thousands of hybrid systems in an attempt to increase overall cluster performance and to minimize energy consumption. New programming models such as MapReduce, specifically designed to make the most of very large infrastructures, will be leveraged to develop massively distributed services. At the same time, data centers will bring an unprecedented degree of workload consolidation, hosting distributed services from many different users in the same infrastructure. In this paper we present our advancements in leveraging the Adaptive MapReduce Scheduler to meet user-defined high-level performance goals while transparently and efficiently exploiting the capabilities of hybrid systems. While the Adaptive Scheduler was already able to dynamically allocate resources to co-located MapReduce jobs based on their completion time goals, it was completely unaware of specific hardware capabilities. We describe the changes introduced in the Adaptive Scheduler to make it hardware-aware and able to co-schedule accelerable and non-accelerable jobs on the same heterogeneous MapReduce cluster, making the most of the underlying hybrid systems. The developed prototype is tested on a cluster of Cell/BE blades and relies on accelerated and non-accelerated versions of the MapReduce tasks of the deployed applications to dynamically select the best version to run on each node. Decisions are made based on workload composition and the jobs' completion time goals. Results show that the augmented Adaptive Scheduler provides dynamic resource allocation across jobs and hardware affinity when possible, and is even able to spread a job's tasks across accelerated and non-accelerated nodes in order to meet performance goals in extreme conditions. To our knowledge, this is the first MapReduce scheduler and prototype able to manage high-level performance goals even in the presence of hybrid systems and accelerable jobs.
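The abstract does not give the scheduler's actual algorithm, so the following is only a minimal sketch of the general idea it describes: size each job's allocation from its completion time goal, prefer accelerated slots for jobs that have an accelerated task version, and spill onto general-purpose slots when the accelerated ones are not enough. The `Job` fields, the `slots_needed` and `allocate` helpers, the earliest-goal-first ordering, and the linear estimate (tasks × mean task time ÷ time left) are all assumptions made for illustration, not details taken from the paper.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    # All fields are hypothetical; the paper's job model may differ.
    name: str
    pending_tasks: int
    seconds_left: float            # time until the completion goal (assumed > 0)
    cpu_task_seconds: float        # mean task time on a general-purpose slot
    accel_task_seconds: Optional[float] = None  # None: no accelerated version


def slots_needed(tasks: int, task_seconds: float, seconds_left: float) -> int:
    """Slots required to finish `tasks` tasks within `seconds_left` seconds."""
    if tasks <= 0:
        return 0
    return math.ceil(tasks * task_seconds / seconds_left)


def allocate(jobs, accel_slots: int, cpu_slots: int):
    """Return {job name: (accelerated slots, general-purpose slots)}."""
    plan = {}
    # Assumed policy: serve the job with the nearest goal first.
    for job in sorted(jobs, key=lambda j: j.seconds_left):
        use_accel = 0
        remaining = job.pending_tasks
        if job.accel_task_seconds is not None and accel_slots > 0:
            # Hardware affinity: try to satisfy the goal on accelerated slots.
            want = slots_needed(remaining, job.accel_task_seconds, job.seconds_left)
            use_accel = min(want, accel_slots)
            accel_slots -= use_accel
            # Tasks those accelerated slots can complete before the goal.
            done = int(use_accel * job.seconds_left // job.accel_task_seconds)
            remaining = max(0, remaining - done)
        # Whatever is left runs as the non-accelerated version.
        want_cpu = slots_needed(remaining, job.cpu_task_seconds, job.seconds_left)
        use_cpu = min(want_cpu, cpu_slots)
        cpu_slots -= use_cpu
        plan[job.name] = (use_accel, use_cpu)
    return plan
```

In this sketch, an accelerable job whose goal fits within the available accelerated slots never touches general-purpose slots, while a job with a tight goal is spread across both kinds of slot. A real scheduler would also re-estimate task times as the job runs and re-plan periodically, which is omitted here.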

