Job Aware Scheduling Algorithm for MapReduce Framework

2011 IEEE Third International Conference on Cloud Computing Technology and Science. Pub Date: 2011-11-29. DOI: 10.1109/CloudCom.2011.112

R. Nanduri, N. Maheshwari, A. Reddyraja, Vasudeva Varma


Citations: 63

Abstract

The MapReduce framework has received wide acclaim over the past few years for large-scale computing and has become a standard paradigm for batch-oriented workloads. As adoption of this paradigm has increased rapidly, scheduling of MapReduce jobs has become a problem of great interest in the research community. We propose an approach that tries to maintain harmony among the jobs running on the cluster and, in turn, decrease their runtime. In our model, the scheduler is made aware of the different types of jobs running on the cluster. The scheduler tries to allocate a task to a node if the incoming task does not adversely affect the tasks already running on that node. From the list of pending tasks, our algorithm selects the one that is most compatible with the tasks already running on that node. We present heuristic-based and machine-learning-based solutions and try to maintain a resource balance across the cluster by not overloading any node, thereby reducing the overall runtime of the jobs. The results show a runtime saving of around 21% for the heuristic-based approach and around 27% for the machine-learning-based approach, compared to Yahoo's Capacity scheduler.
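The selection step described in the abstract (pick, from the pending tasks, the one most compatible with what is already running on a node, without overloading that node) can be illustrated with a minimal sketch. The abstract does not specify the compatibility measure, so everything here is an assumption made for illustration: the `Task` class, the four resource names, the per-resource usage fractions, and the rule "lowest peak utilization after placement wins" are hypothetical and are not the authors' heuristic or their machine-learning model.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical resource dimensions; the abstract does not name them.
RESOURCES = ("cpu", "memory", "disk", "network")


@dataclass
class Task:
    name: str
    # Assumed profile: fraction of one node's capacity used per resource.
    usage: Dict[str, float]


def node_load(running: List[Task]) -> Dict[str, float]:
    """Sum the assumed usage of all tasks currently running on the node."""
    return {r: sum(t.usage.get(r, 0.0) for t in running) for r in RESOURCES}


def peak_after_adding(task: Task, load: Dict[str, float]) -> float:
    """Highest per-resource utilization if `task` were placed on the node."""
    return max(load[r] + task.usage.get(r, 0.0) for r in RESOURCES)


def select_task(pending: List[Task], running: List[Task],
                limit: float = 1.0) -> Optional[Task]:
    """Return the pending task that leaves the node least loaded.

    Tasks that would push any resource above `limit` are skipped, so the
    node is never overloaded. Returns None when no pending task fits.
    """
    load = node_load(running)
    best, best_peak = None, None
    for task in pending:
        peak = peak_after_adding(task, load)
        if peak > limit:
            continue  # would overload the node; not compatible
        if best_peak is None or peak < best_peak:
            best, best_peak = task, peak
    return best


# Illustrative job names and numbers only.
running = [Task("sort", {"cpu": 0.2, "disk": 0.7})]
pending = [
    Task("grep", {"cpu": 0.1, "disk": 0.5}),   # disk-heavy, clashes with sort
    Task("pi", {"cpu": 0.6, "memory": 0.1}),   # CPU-heavy, complements sort
]
chosen = select_task(pending, running)
print(chosen.name if chosen else "no compatible task")
```

In this toy setup the disk-heavy task is skipped because it would overload the disk already used by the running task, while the CPU-heavy task is accepted. A learned model, as mentioned in the abstract, would replace the hand-written `peak_after_adding` score with a predicted compatibility value.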


