YARN Schedulers for Hadoop MapReduce Jobs: Design Goals, Issues and Taxonomy

Q3 Computer Science
Gnanendra Kotikam, S. Lokesh
Journal: Recent Advances in Computer Science and Communications
Publication type: Journal Article
Publication date: 2022-08-31
DOI: 10.2174/2666255816666220831125012 (https://doi.org/10.2174/2666255816666220831125012)
Citations: 0

Abstract

Big Data processing is a demanding task, and several big data processing frameworks have emerged in recent decades. The performance of these frameworks depends greatly on their resource management models. YARN is one such model: it acts as a resource management layer and provides computational resources to execution engines (Spark, MapReduce, Storm, etc.) through its schedulers. The most important aspect of resource management is job scheduling. In this paper, we first present the design goals of YARN's real-life schedulers (FIFO, Capacity, and Fair) for the MapReduce engine. We then discuss the scheduling issues of Hadoop MapReduce clusters. Many efforts have been made in the literature to address data locality, heterogeneity, stragglers, skew mitigation, and fairness in Hadoop MapReduce scheduling. Lastly, we present a taxonomy of the scheduling algorithms available in the literature, based on factors such as environment, scope, approach, objective, and addressed issues.
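To make the contrast between two of the schedulers named above more concrete, here is a minimal toy sketch of FIFO versus fair-share allocation. It is not YARN's implementation: the function names, the (job, demand) tuples, and the single integer "container" capacity are all assumptions made for illustration, and the real Fair Scheduler additionally handles weights, queues, and multiple resource types.

```python
def fifo_allocate(jobs, capacity):
    """Toy FIFO policy: grant containers strictly in submission order.

    jobs: list of (name, demand) tuples in submission order (hypothetical format).
    capacity: total number of containers in the cluster (hypothetical unit).
    """
    allocation = {}
    remaining = capacity
    for name, demand in jobs:
        granted = min(demand, remaining)
        allocation[name] = granted
        remaining -= granted
    return allocation


def fair_allocate(jobs, capacity):
    """Toy fair policy: split capacity evenly among unsatisfied jobs and
    hand back any share a job cannot use (a simple max-min style loop)."""
    demand = dict(jobs)
    allocation = {name: 0 for name, _ in jobs}
    remaining = capacity
    active = [name for name, d in jobs if d > 0]
    while remaining > 0 and active:
        # Equal share per active job, at least one container per round.
        share = max(remaining // len(active), 1)
        for name in list(active):
            if remaining == 0:
                break
            granted = min(share, demand[name] - allocation[name], remaining)
            allocation[name] += granted
            remaining -= granted
            if allocation[name] == demand[name]:
                active.remove(name)  # job fully satisfied
    return allocation


if __name__ == "__main__":
    jobs = [("A", 8), ("B", 3), ("C", 5)]
    print("FIFO:", fifo_allocate(jobs, 10))
    print("Fair:", fair_allocate(jobs, 10))
```

With these made-up demands and a capacity of 10, the FIFO sketch lets the first job take most of the cluster and leaves the last job with nothing, while the fair sketch gives every job a comparable share. That starvation-versus-sharing trade-off is the kind of design goal the abstract refers to.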
Source journal: Recent Advances in Computer Science and Communications (subject area: Computer Science, all)
CiteScore: 2.50
Self-citation rate: 0.00%
Articles published: 142