Outlier detection based on k-neighborhood MST
Qingsheng Zhu, Xiaogang Fan, Ji Feng
Proceedings of the 2014 IEEE 15th International Conference on Information Reuse and Integration (IEEE IRI 2014), 2014-08-01
DOI: 10.1109/IRI.2014.7051960 (https://doi.org/10.1109/IRI.2014.7051960)
Citations: 5
Abstract
Outlier detection is an important task in data mining, used mainly to uncover unusual mechanisms or potential dangers. This paper presents an outlier detection algorithm based on a k-neighborhood minimum spanning tree (MST). The algorithm is applicable to data sets of arbitrary shape and density and can effectively detect both local outliers and local outlying clusters. Taking density and a directional factor into account, it introduces a new dissimilarity measure based on the k-neighborhood. A minimum spanning tree is then built on this k-neighborhood dissimilarity measure. Finally, the tree is cut progressively under constraints so that the outliers can be identified. Comparisons with the LOF, COF, KNN, and INFLO algorithms demonstrate the effectiveness and advantages of the new algorithm.
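The three stages named in the abstract (a k-neighborhood dissimilarity, an MST built on it, and cutting of the tree) can be illustrated with a minimal sketch. The abstract does not give the paper's dissimilarity measure or its cutting constraints, so everything specific below is an assumption made for illustration: the weight is the Euclidean distance scaled by the smaller of the two endpoints' mean k-nearest-neighbor distances (a density proxy only, with no directional factor), and the cut is a single pass that removes edges heavier than `cut_factor` times the median MST edge weight. The function name `knn_mst_outliers` and the parameters `cut_factor` and `max_outlier_size` are hypothetical.

```python
import math
from statistics import median


def knn_mst_outliers(points, k=3, cut_factor=3.0, max_outlier_size=2):
    """Flag points left in small components after cutting heavy MST edges.

    Illustrative stand-in only: the dissimilarity and the cutting rule are
    assumptions, not the measure or constraints defined in the paper.
    """
    n = len(points)
    if n < 2:
        return []
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Density proxy: mean distance from each point to its k nearest neighbors.
    knn_mean = []
    for i in range(n):
        nearest = sorted(dist[i][j] for j in range(n) if j != i)[:k]
        knn_mean.append(sum(nearest) / len(nearest))

    # Assumed dissimilarity: distance scaled by the denser endpoint's k-NN radius.
    def weight(i, j):
        return dist[i][j] / max(min(knn_mean[i], knn_mean[j]), 1e-12)

    # Prim's algorithm on the complete graph under the assumed dissimilarity.
    in_tree = [False] * n
    best = [math.inf] * n
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                w = weight(u, v)
                if w < best[v]:
                    best[v], parent[v] = w, u

    # Single-pass cut: drop edges far heavier than the typical MST edge.
    limit = cut_factor * median(w for w, _, _ in edges)
    kept = [(u, v) for w, u, v in edges if w <= limit]

    # Union-find over the kept edges to recover connected components.
    root = list(range(n))

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]
            x = root[x]
        return x

    for u, v in kept:
        root[find(u)] = find(v)

    sizes = {}
    for i in range(n):
        r = find(i)
        sizes[r] = sizes.get(r, 0) + 1

    # Points in components of at most max_outlier_size are reported as outliers.
    return sorted(i for i in range(n) if sizes[find(i)] <= max_outlier_size)
```

Calling `knn_mst_outliers(points)` on a list of coordinate tuples returns the indices of the flagged points. Raising `max_outlier_size` lets small isolated groups count as outlying clusters, which loosely mirrors the abstract's distinction between local outliers and local outlying clusters.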