Job Scheduling Optimization for Multi-user MapReduce Clusters

2011 Fourth International Symposium on Parallel Architectures, Algorithms and Programming. Pub Date: 2011-12-09. DOI: 10.1109/PAAP.2011.33

Yongcai Tao, Qing Zhang, Lei Shi, Pinhua Chen

Citations: 27

Abstract

A shared MapReduce cluster is well suited to building a data warehouse that serves multiple users. The FAIR scheduler gives each user the illusion of owning a private cluster and can dynamically redistribute capacity left unused by some users to others. However, when reassigning slots, FAIR kills the most recently launched tasks without considering job characteristics or data locality, which increases network traffic when the killed Map/Reduce tasks are rescheduled. Building on FAIR scheduling, this paper proposes an improved FAIR scheduling algorithm that takes job characteristics and data locality into account when killing tasks to free slots for new users. Performance evaluation results show that the improved FAIR reduces data movement and speeds up job execution, thereby improving system performance.
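The difference between the two victim-selection policies can be illustrated with a small sketch. This is not the paper's algorithm, which the abstract does not spell out. It assumes, purely for illustration, that a locality-aware policy would prefer to kill tasks that are not data-local (since rescheduling them is less likely to add network traffic) and would fall back to launch time as a tie-breaker. `RunningTask`, `pick_most_recent`, and `pick_locality_aware` are hypothetical names, and job characteristics are left out because the abstract does not define them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunningTask:
    """Hypothetical record for a task occupying a slot."""
    task_id: str
    launch_time: float  # seconds since an arbitrary epoch
    data_local: bool    # True if the task reads input stored on its own node


def pick_most_recent(tasks, slots_needed):
    """Baseline from the abstract: kill the most recently launched tasks."""
    ordered = sorted(tasks, key=lambda t: t.launch_time, reverse=True)
    return ordered[:slots_needed]


def pick_locality_aware(tasks, slots_needed):
    """Assumed variant: kill non-local tasks first, newest first within each group."""
    # False sorts before True, so non-local tasks come first;
    # the negated launch time puts the newest task first inside each group.
    ordered = sorted(tasks, key=lambda t: (t.data_local, -t.launch_time))
    return ordered[:slots_needed]


tasks = [
    RunningTask("a", launch_time=10.0, data_local=True),
    RunningTask("b", launch_time=20.0, data_local=False),
    RunningTask("c", launch_time=30.0, data_local=True),
    RunningTask("d", launch_time=25.0, data_local=False),
]

print([t.task_id for t in pick_most_recent(tasks, 2)])
print([t.task_id for t in pick_locality_aware(tasks, 2)])
```

With two slots to free, the baseline selects the newest tasks regardless of locality, while the assumed variant spares the data-local tasks as long as enough non-local ones are available.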


