Performance Improvement through Novel Adaptive Node and Container Aware Scheduler with Resource Availability Control in Hadoop YARN

Impact factor 2.2 · Zone 4 (Computer Science) · JCR Q2 (Computer Science)
J. S. Manjaly, T. Subbulakshmi
DOI: 10.32604/csse.2023.036320 (https://doi.org/10.32604/csse.2023.036320)
Journal: Computer Systems Science and Engineering
Volume: 48 1
Published: 2023-01-01
Citations: 0

Abstract

Performance Improvement through Novel Adaptive Node and Container Aware Scheduler with Resource Availability Control in Hadoop YARN
The default scheduler of Apache Hadoop is operationally inefficient when applications connect to external sources and run transformation jobs. This paper proposes a novel scheduler that improves on the Hadoop Yet Another Resource Negotiator (YARN) scheduler, called the Adaptive Node and Container Aware Scheduler (ANACRAC), which aligns cluster resources with the demands of real-world applications. The approach uses user-provided configurations to apportion nodes, or containers within the nodes, according to application thresholds. It also lets applications select which nodes' resources they want to manage, and it adds limits that prevent threshold breaches as additional jobs are added. Node awareness and container awareness can be used individually or in combination to increase efficiency. In addition, resource availability within the nodes and containers can be examined. The paper also addresses container elasticity and self-adaptiveness depending on the job type. The results show a 15%–20% performance improvement compared with the node and container awareness feature of ANACRAC. The ANACRAC scheduler achieves a 70%–90% performance improvement over the default Fair scheduler. Experimental results also show a performance improvement of 60% to 200% when applications were connected to external interfaces under high workloads.
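The abstract does not give ANACRAC's algorithm or configuration keys, so the following is only a minimal sketch of the general idea it describes: per-application settings restrict which nodes may be used (node awareness), cap containers per node (container awareness), and check free memory before placement (resource availability). Every name here (`Node`, `AppPolicy`, `place_container`, and all fields) is hypothetical and is not part of YARN or of the authors' implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    total_mb: int                      # memory the node offers to containers
    used_mb: int = 0
    containers: dict = field(default_factory=dict)  # app name -> container count

    @property
    def free_mb(self) -> int:
        return self.total_mb - self.used_mb


@dataclass
class AppPolicy:
    # Hypothetical user-provided settings, not ANACRAC's real configuration keys.
    allowed_nodes: set                 # node awareness: nodes the app may use
    max_containers_per_node: int       # container awareness: per-node threshold
    container_mb: int                  # memory requested per container


def place_container(app: str, policy: AppPolicy, nodes: list):
    """Return the node chosen for one new container of `app`, or None."""
    candidates = [
        n for n in nodes
        if n.name in policy.allowed_nodes
        and n.containers.get(app, 0) < policy.max_containers_per_node
        and n.free_mb >= policy.container_mb      # resource availability check
    ]
    if not candidates:
        return None                    # nothing fits without breaching a threshold
    best = max(candidates, key=lambda n: n.free_mb)   # prefer the node with most free memory
    best.used_mb += policy.container_mb
    best.containers[app] = best.containers.get(app, 0) + 1
    return best


nodes = [Node("n1", 4096), Node("n2", 8192), Node("n3", 8192)]
etl = AppPolicy(allowed_nodes={"n1", "n2"}, max_containers_per_node=2, container_mb=2048)
placements = [place_container("etl", etl, nodes) for _ in range(5)]
print([p.name if p else None for p in placements])
```

In this toy run the fifth request returns `None` because both permitted nodes have reached the per-node cap, while `n3` is never used because the application did not list it. A real scheduler would queue or retry such a request; that behaviour is left out of the sketch.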
Source journal
Computer Systems Science and Engineering
Category: Engineering & Technology - Computer Science: Theory & Methods
CiteScore: 3.10
Self-citation rate: 13.60%
Articles published: 308
Review time: >12 weeks
About the journal: The journal is devoted to the publication of high-quality papers on theoretical developments in computer systems science and their applications in computer systems engineering. Original research papers, state-of-the-art reviews, and technical notes are invited for publication. All papers will be refereed by acknowledged experts in the field and may be (i) accepted without change, (ii) returned for amendment and subsequent re-refereeing, or (iii) rejected on the grounds of either relevance or content. The submission of a paper implies that, if accepted for publication, it will not be published elsewhere in the same form, in any language, without the prior consent of the Publisher.