FlexSlot:通过灵活的插槽管理将Hadoop迁移到云端

Yanfei Guo, J. Rao, Changjun Jiang, Xiaobo Zhou
{"title":"FlexSlot:通过灵活的插槽管理将Hadoop迁移到云端","authors":"Yanfei Guo, J. Rao, Changjun Jiang, Xiaobo Zhou","doi":"10.1109/SC.2014.83","DOIUrl":null,"url":null,"abstract":"Load imbalance is a major source of overhead in Hadoop where the uneven distribution of input data among tasks can significantly delays the job completion. Running Hadoop in a private cloud opens up opportunities for mitigating data skew with elastic resource allocation, where stragglers are expedited with more resources, yet introduces problems that often cancel out the performance gain: (1) performance interference from co running jobs may create new stragglers, (2) there exist a semantic gap between Hadoop task management and resource pool-based virtual cluster management preventing efficient resource usage. We present FlexSlot, a user-transparent task slot management scheme that automatically identifies map stragglers and resizes their slots accordingly to accelerate task execution. FlexSlot adaptively changes the number of slots on each virtual node to promote efficient usage of resource pool. Experimental results with representative benchmarks show that FlexSlot effectively reduces job completion time by 46% and achieves better resource utilization.","PeriodicalId":275261,"journal":{"name":"SC14: International Conference for High Performance Computing, Networking, Storage and Analysis","volume":"27 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"25","resultStr":"{\"title\":\"FlexSlot: Moving Hadoop Into the Cloud with Flexible Slot Management\",\"authors\":\"Yanfei Guo, J. Rao, Changjun Jiang, Xiaobo Zhou\",\"doi\":\"10.1109/SC.2014.83\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Load imbalance is a major source of overhead in Hadoop where the uneven distribution of input data among tasks can significantly delays the job completion. Running Hadoop in a private cloud opens up opportunities for mitigating data skew with elastic resource allocation, where stragglers are expedited with more resources, yet introduces problems that often cancel out the performance gain: (1) performance interference from co running jobs may create new stragglers, (2) there exist a semantic gap between Hadoop task management and resource pool-based virtual cluster management preventing efficient resource usage. We present FlexSlot, a user-transparent task slot management scheme that automatically identifies map stragglers and resizes their slots accordingly to accelerate task execution. FlexSlot adaptively changes the number of slots on each virtual node to promote efficient usage of resource pool. Experimental results with representative benchmarks show that FlexSlot effectively reduces job completion time by 46% and achieves better resource utilization.\",\"PeriodicalId\":275261,\"journal\":{\"name\":\"SC14: International Conference for High Performance Computing, Networking, Storage and Analysis\",\"volume\":\"27 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-11-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"25\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"SC14: International Conference for High Performance Computing, Networking, Storage and Analysis\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/SC.2014.83\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"SC14: International Conference for High Performance Computing, Networking, Storage and Analysis","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SC.2014.83","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 25

摘要

负载不平衡是Hadoop开销的主要来源,任务之间输入数据的不均匀分布会严重延迟任务的完成。在私有云中运行Hadoop为通过弹性资源分配来减轻数据倾斜提供了机会,在这种情况下,分散的任务可以使用更多的资源来加速,但也会引入一些问题,这些问题通常会抵消性能收益:(1)共同运行作业的性能干扰可能会产生新的分散的任务,(2)在Hadoop任务管理和基于资源池的虚拟集群管理之间存在语义差距,阻碍了有效的资源使用。我们提出FlexSlot,一个用户透明的任务插槽管理方案,自动识别映射掉队者,并相应地调整其插槽大小,以加快任务执行。FlexSlot自适应地改变每个虚拟节点的槽位数,以提高资源池的利用率。具有代表性的基准测试的实验结果表明,FlexSlot有效地减少了46%的作业完成时间,实现了更好的资源利用率。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
FlexSlot: Moving Hadoop Into the Cloud with Flexible Slot Management
Load imbalance is a major source of overhead in Hadoop where the uneven distribution of input data among tasks can significantly delays the job completion. Running Hadoop in a private cloud opens up opportunities for mitigating data skew with elastic resource allocation, where stragglers are expedited with more resources, yet introduces problems that often cancel out the performance gain: (1) performance interference from co running jobs may create new stragglers, (2) there exist a semantic gap between Hadoop task management and resource pool-based virtual cluster management preventing efficient resource usage. We present FlexSlot, a user-transparent task slot management scheme that automatically identifies map stragglers and resizes their slots accordingly to accelerate task execution. FlexSlot adaptively changes the number of slots on each virtual node to promote efficient usage of resource pool. Experimental results with representative benchmarks show that FlexSlot effectively reduces job completion time by 46% and achieves better resource utilization.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信