Scheduling divisible reduce tasks in MapReduce
Tao Gu, Chuang Zuo, Zheng Chen, Yulu Yang, Tao Li
2014 IEEE 5th International Conference on Software Engineering and Service Science, pp. 190-194, 2014-06-27
DOI: 10.1109/ICSESS.2014.6933542 (https://doi.org/10.1109/ICSESS.2014.6933542)

Computation in MapReduce is composed of map tasks and reduce tasks. Although the performance of map tasks has been investigated extensively, most research ignores the scheduling of reduce tasks. This paper proposes a divisible load scheduling model for the reduce tasks of a MapReduce job. By analyzing intermediate data transmission and reduce task execution in the reduce phase, reduce tasks are abstracted as divisible loads. The optimal schedule for the reduce tasks is obtained with linear programming. Performance is evaluated under different environments. Experimental results show that the optimal schedule achieves a performance improvement of at least 40%.
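
The abstract does not spell out the linear program, so the following is only a minimal sketch of the general divisible-load idea under a deliberately simplified, hypothetical model: node i spends `transfer_cost[i] + compute_cost[i]` time units per unit of reduce load, and transfers to different nodes do not contend with each other. Under those assumptions the linear program "minimize T subject to a_i * (c_i + w_i) <= T, sum(a_i) = 1, a_i >= 0" has a closed-form optimum in which every node finishes at the same time. The function names, parameters, and sample numbers are illustrative and are not taken from the paper.

```python
from fractions import Fraction


def optimal_fractions(transfer_cost, compute_cost):
    """Split one unit of reduce load so that all nodes finish together.

    Assumed model (not the paper's): node i needs
    transfer_cost[i] + compute_cost[i] time units per unit of load.
    The min-makespan LP then gives a_i proportional to 1 / (c_i + w_i).
    """
    rates = [1 / Fraction(c + w) for c, w in zip(transfer_cost, compute_cost)]
    total = sum(rates)
    return [r / total for r in rates]


def makespan(fractions, transfer_cost, compute_cost):
    """Finish time of the slowest node for a given split of the load."""
    return max(
        a * (c + w)
        for a, c, w in zip(fractions, transfer_cost, compute_cost)
    )


if __name__ == "__main__":
    # Hypothetical per-unit costs for three heterogeneous reduce nodes.
    c = [1, 1, 2]
    w = [1, 3, 4]
    best = optimal_fractions(c, w)
    even = [Fraction(1, len(c))] * len(c)
    print("optimal split:", [str(a) for a in best])
    print("optimal makespan:", makespan(best, c, w))
    print("even-split makespan:", makespan(even, c, w))
```

Exact fractions are used so the equal-finish-time property can be checked without rounding error. A model with contended or sequential transfers, which the paper's analysis of intermediate data transmission may well include, would need a general LP solver rather than this closed form.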