Local and Global Optimization of MapReduce Program Model
Congchong Liu, Shujia Zhou
2011 IEEE World Congress on Services, published 2011-07-04
DOI: 10.1109/SERVICES.2011.64 (https://doi.org/10.1109/SERVICES.2011.64)
Citations: 4
Abstract
MapReduce, introduced by Google, provides two functional interfaces, Map and Reduce, in which users write application-specific code to process large amounts of data. It has been widely deployed in cloud computing systems. Its runtime system automatically manages parallel tasks, data partitioning, and data transfer. This paper proposes an optimization of the MapReduce programming model and demonstrates it with X10. We develop an adaptive load distribution scheme that balances the load on each node and consequently reduces the cross-node communication cost incurred in the Reduce function. In addition, we exploit shared memory within each node, using multi-core programming, to further reduce the communication cost.
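The two ideas named in the abstract, combining Map output inside a node before anything crosses the network and distributing Reduce work so that nodes carry similar load, can be illustrated with a small word-count sketch. This is not the authors' X10 implementation or their adaptive scheme: the function names (`local_combine`, `assign_keys`, `map_reduce`) are hypothetical, the nodes are simulated as plain Python lists, and the load balancer is a generic greedy heuristic (heaviest key to the currently lightest node) swapped in for illustration.

```python
from collections import Counter
import heapq


def local_combine(chunks):
    """Simulate one node: every worker's Map output is merged into a single
    in-memory table, so each key leaves the node at most once."""
    combined = Counter()
    for chunk in chunks:
        combined.update(chunk.split())  # Map step: emit (word, 1) per word
    return combined


def assign_keys(key_weights, num_nodes):
    """Greedy balancing (an assumption, not the paper's scheme): give the
    heaviest remaining key to the node with the smallest load so far."""
    heap = [(0, node) for node in range(num_nodes)]
    heapq.heapify(heap)
    owner = {}
    # Sort by descending weight, then by key, so the result is deterministic.
    for key, weight in sorted(key_weights.items(), key=lambda kv: (-kv[1], kv[0])):
        load, node = heapq.heappop(heap)
        owner[key] = node
        heapq.heappush(heap, (load + weight, node))
    return owner


def map_reduce(node_inputs):
    """Run the simulated job. node_inputs[i] is the list of text chunks
    held by node i. Returns (per-node Reduce results, key-to-node map)."""
    partials = [local_combine(chunks) for chunks in node_inputs]

    # Simplification: key weights are gathered centrally before assignment.
    totals = Counter()
    for partial in partials:
        totals.update(partial)
    owner = assign_keys(totals, len(node_inputs))

    # Reduce step: each node sums the partial counts for the keys it owns.
    results = [Counter() for _ in node_inputs]
    for partial in partials:
        for key, count in partial.items():
            results[owner[key]][key] += count
    return results, owner


if __name__ == "__main__":
    inputs = [["a b a", "a c"], ["b b", "c a d"]]
    results, owner = map_reduce(inputs)
    for node, counts in enumerate(results):
        print(node, dict(counts), "load:", sum(counts.values()))
```

In this toy setup, local combining means a node sends one partial count per key rather than one record per word occurrence, and the greedy assignment keeps the total count handled by each simulated node close to even. A real system would also have to weigh where the data already resides, which this sketch ignores.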