Authors: M. Mortazavi, B. T. Ladani
Venue: 2015 7th Conference on Information and Knowledge Technology (IKT)
Publication date: 2015-05-26
DOI: 10.1109/IKT.2015.7288760 (https://doi.org/10.1109/IKT.2015.7288760)
A MapReduce-based algorithm for parallelizing collusion detection in Hadoop
MapReduce, a programming model for parallel data processing, has been used in many open systems such as cloud computing and service-oriented computing. Collusive behavior by worker entities in the MapReduce model can violate the integrity of open systems. This paper proposes a MapReduce-based algorithm for parallel detection of collusion among malicious workers. The algorithm uses a voting matrix represented as a list of the voting values of different workers. Three phases are designed and implemented: majority selection, correlation counting, and correlation computing. Preliminary results show a speedup of 1.8 and an efficiency of about 70% on a data set containing 2000 workers' votes.
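
The abstract names the three phases but does not give their details, so the following is only a minimal sketch of how such a pipeline might look, not the authors' implementation. It runs in plain Python with a small in-memory stand-in for a MapReduce job instead of Hadoop. The record layout `(task, worker, vote)`, the helper `map_reduce`, the correlation measure (the fraction of shared tasks on which two workers agree with each other against the majority), and the `threshold` value are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def map_reduce(records, mapper, reducer):
    """In-memory stand-in for one MapReduce job (hypothetical helper, not Hadoop)."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    output = []
    for key in sorted(groups):
        output.extend(reducer(key, groups[key]))
    return output


# Phase 1 (majority selection): group votes by task and pick the majority vote.
def majority_mapper(record):
    task, worker, vote = record
    yield task, (worker, vote)


def majority_reducer(task, worker_votes):
    majority = Counter(vote for _, vote in worker_votes).most_common(1)[0][0]
    yield task, majority, sorted(worker_votes)


# Phase 2 (correlation counting): for each worker pair, count shared tasks and
# the tasks on which both gave the same answer that differs from the majority.
def pair_mapper(record):
    _task, majority, worker_votes = record
    for (w1, v1), (w2, v2) in combinations(worker_votes, 2):
        yield (w1, w2), (1, int(v1 == v2 and v1 != majority))


def pair_reducer(pair, counts):
    shared = sum(c[0] for c in counts)
    against = sum(c[1] for c in counts)
    yield pair, shared, against


# Phase 3 (correlation computing): turn the counts into a score per pair.
# The ratio used here is an assumed measure, not taken from the paper.
def score_mapper(record):
    pair, shared, against = record
    yield pair, against / shared


def score_reducer(pair, scores):
    yield pair, scores[0]


def correlation_scores(votes):
    majorities = map_reduce(votes, majority_mapper, majority_reducer)
    counts = map_reduce(majorities, pair_mapper, pair_reducer)
    return dict(map_reduce(counts, score_mapper, score_reducer))


# Toy voting list: w1..w3 always vote "a"; w4 and w5 both vote "b" on t1 and t2.
votes = [(t, w, "a") for t in ("t1", "t2", "t3") for w in ("w1", "w2", "w3")]
votes += [(t, w, "b") for t in ("t1", "t2") for w in ("w4", "w5")]
votes += [("t3", w, "a") for w in ("w4", "w5")]

scores = correlation_scores(votes)
threshold = 0.5  # hypothetical cut-off for flagging a pair
suspects = [pair for pair, score in sorted(scores.items()) if score > threshold]
print(suspects)
```

In this toy data set the only pair that repeatedly agrees against the majority is `w4` and `w5`, so it is the only pair flagged. Each phase is written as a separate map and reduce step so that, under the same assumptions, it could be ported to a chain of Hadoop jobs.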