Authors: Songqing Duan, Bin Wu, Bai Wang, Juan Yang
Published in: 2011 IEEE International Conference on Cloud Computing and Intelligence Systems (2011-10-13)
DOI: 10.1109/CCIS.2011.6045047 (https://doi.org/10.1109/CCIS.2011.6045047)
Design and implementation of parallel statistical algorithm based on Hadoop's MapReduce model
The rapid growth of data is driving the development of parallel computing. MapReduce, a simplified programming model for distributed parallel computing, is becoming increasingly popular. In this paper, we design and implement a parallel statistical algorithm based on Hadoop's MapReduce model. The algorithm, which is used to capture the overall characteristics of massive data sets, computes measures of central tendency, dispersion, and distribution tendency. Our experiments indicate that the algorithm is suitable for processing large-scale data.
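The abstract does not describe how the statistics are computed, so the following is only a minimal sketch of one common way to fit such measures into a map/combine/reduce shape, not the authors' method. It assumes that "distribution tendency" means skewness and kurtosis, and it simulates the phases in plain Python rather than calling any Hadoop API. The names `Moments`, `map_split`, `combine`, and `finalize` are hypothetical. Each split is reduced to its power sums, which merge by simple addition; raw power sums can lose precision on data with a large mean, so this is an illustration rather than a numerically robust implementation.

```python
from dataclasses import dataclass
from functools import reduce
import math


@dataclass(frozen=True)
class Moments:
    """Count and power sums of one data split (hypothetical summary type)."""
    n: int = 0
    s1: float = 0.0  # sum of x
    s2: float = 0.0  # sum of x^2
    s3: float = 0.0  # sum of x^3
    s4: float = 0.0  # sum of x^4


def map_split(values):
    """Map phase: summarize one input split as its power sums."""
    n, s1, s2, s3, s4 = 0, 0.0, 0.0, 0.0, 0.0
    for x in values:
        n += 1
        s1 += x
        s2 += x ** 2
        s3 += x ** 3
        s4 += x ** 4
    return Moments(n, s1, s2, s3, s4)


def combine(a, b):
    """Combine/reduce phase: summaries merge by element-wise addition."""
    return Moments(a.n + b.n, a.s1 + b.s1, a.s2 + b.s2, a.s3 + b.s3, a.s4 + b.s4)


def finalize(m):
    """Turn the merged power sums into population statistics."""
    if m.n == 0:
        raise ValueError("no data")
    mean = m.s1 / m.n
    r2, r3, r4 = m.s2 / m.n, m.s3 / m.n, m.s4 / m.n
    # Central moments expressed through raw moments.
    m2 = r2 - mean ** 2
    m3 = r3 - 3 * mean * r2 + 2 * mean ** 3
    m4 = r4 - 4 * mean * r3 + 6 * mean ** 2 * r2 - 3 * mean ** 4
    stats = {"count": m.n, "mean": mean, "variance": m2, "std": math.sqrt(max(m2, 0.0))}
    if m2 > 0:
        # Skewness and kurtosis are undefined when all values are equal.
        stats["skewness"] = m3 / m2 ** 1.5
        stats["kurtosis"] = m4 / m2 ** 2
    return stats


def parallel_stats(splits):
    """Simulate the job: map every split, then fold the summaries together."""
    return finalize(reduce(combine, map(map_split, splits), Moments()))
```

Because `combine` is associative and commutative, the same function could serve as both combiner and reducer, and the result does not depend on how the data is split. For example, `parallel_stats([[2, 4, 4], [4, 5], [5, 7, 9]])` gives a mean of 5 and a population variance of 4, the same as processing the eight values in one pass.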