Research of local outlier mining algorithm based on spark
Haipeng Chen, Siqi Zhao, Han Bao, Hui Kang
2017 First International Conference on Electronics Instrumentation & Information Systems (EIIS), published 2017-06-01
DOI: https://doi.org/10.1109/EIIS.2017.8298559
Citations: 0
Abstract
Cluster-based outlier mining is one of the important approaches to local outlier mining, but it has many problems. In this paper, we propose an algorithm named CFLDOF that optimizes the LDOF algorithm by pruning the dataset with clustering feature trees. We then give a parallel design of CFLDOF and implement the improved parallel algorithm on the Spark platform. Finally, a comparative experiment verifies that CFLDOF reduces the time complexity while achieving accuracy similar to that of LDOF.
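
For orientation, the baseline LDOF score that CFLDOF builds on can be sketched as follows. This is a minimal brute-force illustration, not the authors' implementation. It assumes the usual definition of the local distance-based outlier factor, which the abstract does not restate: the average distance from a point to its k nearest neighbours, divided by the average pairwise distance among those neighbours. The function name `ldof`, the parameter `k`, and the sample data are hypothetical, and the sketch includes neither the clustering feature tree pruning nor the Spark parallelization.

```python
import math
from itertools import combinations


def ldof(points, k):
    """Brute-force LDOF score for every point (assumed definition, see lead-in).

    points: list of equal-length numeric tuples; k: neighbourhood size.
    """
    if k < 2 or len(points) <= k:
        raise ValueError("need k >= 2 and more than k points")
    scores = []
    for i, p in enumerate(points):
        # k nearest neighbours of p, excluding p itself
        others = [q for j, q in enumerate(points) if j != i]
        neighbours = sorted(others, key=lambda q: math.dist(p, q))[:k]
        # average distance from p to its k nearest neighbours
        knn_dist = sum(math.dist(p, q) for q in neighbours) / k
        # average pairwise distance among those neighbours
        pairs = list(combinations(neighbours, 2))
        inner_dist = sum(math.dist(a, b) for a, b in pairs) / len(pairs)
        # identical neighbours give a zero denominator; treat as infinite
        scores.append(knn_dist / inner_dist if inner_dist > 0 else math.inf)
    return scores


# Hypothetical sample: four points on a unit square plus one distant point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
scores = ldof(points, k=3)
```

In this sample the distant point receives by far the largest score, while the four square corners score about 1. Every point is compared against every other point, so the cost grows quadratically with the dataset size; that cost is what pruning the dataset before scoring is meant to reduce.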