MLOD: Multi-granularity local outlier detection
Liang Gao, Shaoyue Yu, Yu-Pan Luo, L. Shang
2009 IEEE International Conference on Granular Computing, 2009-09-22
DOI: 10.1109/GRC.2009.5255138 (https://doi.org/10.1109/GRC.2009.5255138)
Outlier detection is an important data mining task. The local outlier factor (LOF) was proposed to indicate the degree of outlierness and is practical for finding local outliers. However, it is difficult to decide the neighborhood size. This paper proposes a multi-granularity local outlier detection (MLOD) method that organizes outlierness across multiple granularities and finds local outliers at varying neighborhood granularity. The method applies approximation and grid-based partitioning to reduce time complexity. Theoretical results show that the time cost is linear in the size of the data set. Furthermore, the output and analysis it provides can help users choose appropriate parameters. The performance of the algorithm is demonstrated by experiments on three generated data sets.
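To make the neighborhood-size problem concrete, the sketch below scores every point with the standard LOF at several neighborhood sizes and collects the results per size. It is not the MLOD algorithm: the abstract does not specify the approximation or the grid-based partition, so this is a brute-force, quadratic-time illustration only. The function names (`knn`, `lof_scores`, `multi_granularity_lof`), the sample data, and the choice of `k` values are all assumptions made for the example. It also simplifies LOF by taking exactly `k` neighbours (ties are not expanded) and assumes no duplicate points.

```python
import math


def knn(points, i, k):
    """Return the k nearest neighbours of points[i] as (distance, index) pairs."""
    dists = sorted(
        (math.dist(points[i], points[j]), j)
        for j in range(len(points))
        if j != i
    )
    return dists[:k]


def lof_scores(points, k):
    """Brute-force LOF for every point (O(n^2) time, exactly k neighbours)."""
    n = len(points)
    neigh = [knn(points, i, k) for i in range(n)]
    # k-distance: distance to the k-th nearest neighbour.
    k_dist = [nb[-1][0] for nb in neigh]

    # Local reachability density: inverse of the mean reachability distance.
    # Assumes distinct points, so the sum is strictly positive.
    lrd = []
    for i in range(n):
        reach_sum = sum(max(k_dist[j], d) for d, j in neigh[i])
        lrd.append(len(neigh[i]) / reach_sum)

    # LOF: mean ratio of the neighbours' density to the point's own density.
    return [
        sum(lrd[j] for _, j in neigh[i]) / (len(neigh[i]) * lrd[i])
        for i in range(n)
    ]


def multi_granularity_lof(points, ks):
    """Hypothetical helper: LOF scores at each neighbourhood size in ks."""
    return {k: lof_scores(points, k) for k in ks}


# Illustrative data: a 5x5 unit grid (one dense cluster) plus one far point.
points = [(float(x), float(y)) for x in range(5) for y in range(5)]
points.append((10.0, 10.0))

results = multi_granularity_lof(points, ks=(3, 5, 8))
for k, scores in results.items():
    top = max(range(len(points)), key=scores.__getitem__)
    print(f"k={k}: highest LOF at index {top}, score {scores[top]:.2f}")
```

Reading one point's scores across the different `k` values gives a rough profile of how its outlierness changes with neighborhood size, which is the kind of output the abstract suggests can help with parameter choice. Avoiding the quadratic cost shown here is where the paper's approximation and grid partition would come in.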