Self-Configuration of the Number of Concurrently Running MapReduce Jobs in a Hadoop Cluster

2015 IEEE International Conference on Autonomic Computing. Pub Date: 2015-07-07. DOI: 10.1109/ICAC.2015.54

Bo Zhang, Filip Krikava, Romain Rouvoy, L. Seinturier

Pages: 149-150

Citations: 11

Abstract

There is a trade-off between the number of concurrently running MapReduce jobs and their corresponding map and reduce tasks within a node in a Hadoop cluster. Leaving this trade-off statically configured to a single value can significantly degrade job response times and leave resource usage suboptimal. To overcome this problem, we propose an approach based on a feedback control loop that dynamically adjusts the Hadoop resource manager configuration according to the current state of the cluster. A preliminary assessment, based on workloads synthesized from real-world traces, shows that system performance can be improved by about 30% compared to the default Hadoop setup.
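The abstract does not say which configuration parameter is tuned or what control law is used, so the following is only a minimal sketch of the general idea of a feedback control loop, not the authors' method. It assumes a hypothetical knob `share` (the fraction of cluster resources reserved for admitting jobs, which caps job concurrency) and a hypothetical `ClusterState` snapshot; the step size, bounds, and utilization target are illustrative values.

```python
from dataclasses import dataclass


@dataclass
class ClusterState:
    """Hypothetical cluster snapshot; both fields are illustrative."""
    pending_jobs: int          # jobs waiting to be admitted
    memory_utilization: float  # fraction of cluster memory in use, 0.0 to 1.0


def next_job_share(share: float, state: ClusterState,
                   step: float = 0.05, low: float = 0.05, high: float = 0.95,
                   target_utilization: float = 0.9) -> float:
    """Run one iteration of a simple step controller.

    `share` is an assumed knob: the fraction of resources reserved for
    admitting jobs. A larger share means more concurrent jobs but fewer
    resources left for each job's map and reduce tasks.
    """
    if state.memory_utilization >= target_utilization:
        # Cluster is saturated: shift resources from job admission to tasks.
        share -= step
    elif state.pending_jobs > 0:
        # Jobs are queued while resources sit idle: admit more jobs.
        share += step
    # Otherwise hold the current value. Clamp to the allowed range.
    return round(min(high, max(low, share)), 2)


if __name__ == "__main__":
    share = 0.10
    # A made-up sequence of observations, one per control period.
    observations = [
        ClusterState(pending_jobs=4, memory_utilization=0.40),
        ClusterState(pending_jobs=2, memory_utilization=0.70),
        ClusterState(pending_jobs=0, memory_utilization=0.95),
    ]
    for state in observations:
        share = next_job_share(share, state)
        print(share)
```

In a real deployment the observations would come from the cluster's monitoring interface and the new value would be pushed to the resource manager each control period; both steps are left out here because the abstract does not describe them.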
