Title: NDOD: An efficient neighboring dependent outlier detector for bias distributed large datasets
Authors: Yun Hu, Junyuan Xie, Cunhua Li
Venue (as listed in the record): 2009 International Conference on Environmental Science and Information Application Technology
Volume: 14 1
Pages: 97-102
Published: 2011-03-26
DOI: https://doi.org/10.1109/ICIST.2011.5765219
Citation count: 0
Abstract
Outlier detection is an important problem in many domains, including fraud detection, network intrusion detection, and medical diagnosis. Discovering the unexpected knowledge that outliers reveal is becoming an integral part of data mining. Existing work in this field adapts poorly to the distribution characteristics of the dataset. This paper presents a novel, highly efficient approach to outlier detection that closely monitors the neighboring density characteristics around outliers. A generalized neighboring dependent outlier is defined, followed by a cell-based detection algorithm. Extensive experiments on real-world and synthetic datasets demonstrate the effectiveness of the algorithm with respect to both the size and the biased distribution structure of the datasets.
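
The abstract does not state the NDOD outlier definition or the details of its cell-based algorithm, so the following is only a rough illustration of the general idea behind cell-based density checks, not the authors' method. It assumes a uniform grid and a simple rule: a point is flagged when its own cell and the adjacent cells together hold fewer than a threshold number of other points. The function name `cell_based_outliers` and the parameters `cell_size` and `min_neighbors` are hypothetical.

```python
from collections import defaultdict
from itertools import product


def cell_based_outliers(points, cell_size, min_neighbors):
    """Flag points with fewer than min_neighbors other points nearby.

    Generic grid sketch (an assumption, not the NDOD definition):
    "nearby" means the point's own grid cell plus all adjacent cells.
    """

    def cell_of(point):
        # Map a point to the integer coordinates of its grid cell.
        return tuple(int(c // cell_size) for c in point)

    # Count how many points fall into each cell (one pass over the data).
    counts = defaultdict(int)
    for p in points:
        counts[cell_of(p)] += 1

    outliers = []
    for p in points:
        cell = cell_of(p)
        total = 0
        # Sum the counts over the 3^d block of cells centered on this cell.
        for offset in product((-1, 0, 1), repeat=len(cell)):
            neighbor = tuple(c + o for c, o in zip(cell, offset))
            total += counts.get(neighbor, 0)
        # Subtract one so the point does not count itself.
        if total - 1 < min_neighbors:
            outliers.append(p)
    return outliers


# Five clustered points and one isolated point.
sample = [(0.1, 0.1), (0.2, 0.3), (0.4, 0.2), (0.3, 0.4), (0.5, 0.5), (10.0, 10.0)]
print(cell_based_outliers(sample, cell_size=1.0, min_neighbors=2))
```

Working from per-cell counts rather than pairwise distances is what makes grid schemes of this kind cheap on large datasets. A fixed global threshold like the one above is the sort of rule that can struggle on unevenly distributed data, which is the limitation the abstract says a neighboring dependent definition is meant to address.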