Distributed classification using class-association rules mining algorithm
D. Mokeddem, H. Belbachir
2010 International Conference on Machine and Web Intelligence, 2010-11-29
DOI: 10.1109/ICMWI.2010.5647984 (https://doi.org/10.1109/ICMWI.2010.5647984)
Citations: 10
Abstract
Associative classification algorithms have been successfully used to construct classification systems. Their major strength is that they can select the most accurate rules from an exhaustive list of class-association rules. This explains their generally good performance, but it comes at a high computational cost inherited from association rule discovery algorithms. We address this issue by proposing a distributed methodology based on the FP-growth algorithm. In a shared-nothing architecture, subsets of classification rules are generated in parallel from several data partitions. Inter-processor communication is established in order to make global decisions. This exchange takes place only at the first level of recursion, allowing each machine to subsequently process all its assigned tasks independently. The final classifier is built by majority vote. The approach is illustrated by a detailed example and an analysis of the communication cost.
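The two ideas named in the abstract, a single exchange of counts to reach global decisions and a final majority vote, can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it counts only single-item (item, class) pairs rather than building FP-trees, it simulates the partitions inside one process, and the names `local_counts`, `global_frequent`, `majority_vote`, and `min_support` are hypothetical.

```python
from collections import Counter


def local_counts(partition):
    """Count (item, class) co-occurrences within one data partition."""
    counts = Counter()
    for items, label in partition:
        for item in set(items):  # count each item once per record
            counts[(item, label)] += 1
    return counts


def global_frequent(partitions, min_support):
    """Sum the local counts (standing in for the one exchange step)
    and keep the pairs whose global count reaches min_support."""
    total = Counter()
    for part in partitions:
        total.update(local_counts(part))
    return {pair: n for pair, n in total.items() if n >= min_support}


def majority_vote(predictions):
    """Return the most common label; ties go to the label seen first."""
    return Counter(predictions).most_common(1)[0][0]
```

In this sketch each partition is a list of `(items, label)` records. After the global counts are known, each worker could continue on its own share of the frequent pairs without further communication, and the labels predicted by the per-partition rule sets would be combined with `majority_vote`.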