Li Wang, Zhiwei Ni, Yiwen Zhang, Zhangjun Wu, Liyang Tang
Pipelined-MapReduce: An Improved MapReduce Parallel Programing Model
(Notice of Violation of IEEE Publication Principles)
2011 Fourth International Conference on Intelligent Computation Technology and Automation, 2011-03-28
DOI: 10.1109/ICICTA.2011.593 (https://doi.org/10.1109/ICICTA.2011.593)
Citations: 4
Abstract
MapReduce is a parallel programming model used to process large datasets. A MapReduce program can be executed concurrently and automatically across large clusters of commodity machines. We propose an improved MapReduce programming model, Pipelined-MapReduce, to address data-intensive information retrieval problems. Pipelined-MapReduce allows data to be pipelined between operations, extending the batch-oriented MapReduce programming model; this can reduce completion time and improve system utilization. The experimental results demonstrate that the implementation of Pipelined-MapReduce scales well and efficiently processes large datasets on commodity machines.
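
The difference between batched and pipelined data transfer can be illustrated with a small, single-machine sketch. This is not the authors' implementation: the function names (`map_words`, `batched_word_count`, `pipelined_word_count`), the word-count task, and the `buffer_size` parameter are all hypothetical, chosen only to show the idea. In the batched version the reduce step starts after all map output has been materialized; in the pipelined version a bounded queue lets the reducer consume pairs while the mapper is still producing them.

```python
import queue
import threading
from collections import Counter

_DONE = object()  # sentinel marking the end of the map output stream


def map_words(line):
    """Map step: emit a (word, 1) pair for each word in one input line."""
    for word in line.split():
        yield word.lower(), 1


def batched_word_count(lines):
    """Batched style: materialize all map output, then run the reduce step."""
    intermediate = [pair for line in lines for pair in map_words(line)]
    counts = Counter()
    for word, n in intermediate:
        counts[word] += n
    return dict(counts)


def pipelined_word_count(lines, buffer_size=4):
    """Pipelined style (illustrative assumption): the reducer thread consumes
    pairs from a bounded queue while the mapper thread is still running."""
    channel = queue.Queue(maxsize=buffer_size)
    counts = Counter()

    def mapper():
        for line in lines:
            for pair in map_words(line):
                channel.put(pair)  # blocks when the buffer is full
        channel.put(_DONE)

    def reducer():
        while True:
            item = channel.get()
            if item is _DONE:
                break
            word, n = item
            counts[word] += n

    threads = [threading.Thread(target=mapper), threading.Thread(target=reducer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return dict(counts)


if __name__ == "__main__":
    sample = ["map reduce map", "Reduce pipeline"]
    print(batched_word_count(sample))
    print(pipelined_word_count(sample))
```

Both functions should return the same counts; only the timing of the hand-off differs. The bounded queue keeps intermediate data small and lets the two stages overlap, which is the general mechanism by which pipelining can shorten completion time. A real cluster implementation would also have to handle partitioning, sorting, and fault tolerance, none of which are modeled here.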