Coupling scheduler for MapReduce/Hadoop

IEEE International Symposium on High-Performance Parallel Distributed Computing. Pub Date: 2012-06-18. DOI: 10.1145/2287076.2287097

Jian Tan, Xiaoqiao Meng, Li Zhang


Citations: 14

Abstract


Current schedulers for MapReduce/Hadoop are quite successful in providing good performance. However, room for improvement remains: map and reduce tasks are not jointly optimized for scheduling, although there is a strong dependence between them. This can cause job starvation and poor data locality. We design a resource-aware scheduler for Hadoop that couples the progress of mappers and reducers and jointly optimizes the placement of both. This mitigates the starvation problem and improves overall data locality. Our experiments demonstrate improvements in job response times of up to an order of magnitude.
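The abstract does not spell out how mapper and reducer progress are coupled. As a purely illustrative sketch, and not the authors' algorithm, one way to tie the two together is to cap the number of reducers a job may have launched in proportion to the fraction of its map tasks that have finished. The function names and the proportional rule below are assumptions made for this example.

```python
def allowed_reducers(maps_done: int, maps_total: int, reducers_total: int) -> int:
    """Upper bound on launched reducers under an assumed proportional rule.

    Hypothetical rule: a job may have at most
    ceil(maps_done / maps_total * reducers_total) reducers launched.
    """
    if maps_total <= 0:
        # A job with no map tasks has nothing to wait for.
        return reducers_total
    # Integer ceiling division avoids floating-point rounding surprises.
    cap = (maps_done * reducers_total + maps_total - 1) // maps_total
    return min(reducers_total, cap)


def may_launch_reducer(maps_done: int, maps_total: int,
                       reducers_launched: int, reducers_total: int) -> bool:
    """Return True if one more reducer may start given current map progress."""
    return reducers_launched < allowed_reducers(maps_done, maps_total, reducers_total)
```

Under this assumed rule, a job with 10 map tasks and 4 reducers may start no reducer before any map has finished, and may start all 4 only once every map is done, so reducers do not hold slots long before their input exists.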

