Xuan-Hong Dang, I. Assent, R. Ng, A. Zimek, Erich Schubert
Published in: 2014 IEEE 30th International Conference on Data Engineering, 2014-05-19. DOI: 10.1109/ICDE.2014.6816642 (https://doi.org/10.1109/ICDE.2014.6816642). Citation count (Semantic Scholar): 75.
Discriminative features for identifying and interpreting outliers
We consider the problem of outlier detection and interpretation. While most existing studies focus on detection, we simultaneously address the equally important challenge of outlier interpretation. We propose an algorithm that uncovers outliers in subspaces of reduced dimensionality in which they are well discriminated from regular objects, while retaining the natural local structure of the original data to ensure the quality of the outlier explanation. Our algorithm takes a mathematically appealing approach from spectral graph embedding theory, and we show that it achieves the globally optimal solution for the subspace learning objective. Using a number of real-world datasets, we demonstrate its appealing performance with respect to both the outlier detection rate and the discriminative, human-interpretable features. This is the first approach to exploit discriminative features for both outlier detection and interpretation, leading to a better understanding of how and why the hidden outliers are exceptional.
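The idea of a subspace in which an outlier is well discriminated from regular objects can be pictured with a small sketch. This is not the authors' algorithm: the abstract does not give the spectral graph embedding objective, so the code below swaps in a simpler Fisher-style criterion as a stand-in. It assumes a single candidate outlier and a set of regular neighbors, and finds the direction w that maximizes the candidate's squared projected distance from the neighborhood mean relative to the projected neighborhood scatter, which has the closed form w proportional to S^{-1}(o - mean). The function names (`solve`, `discriminative_direction`, `project`), the regularizer `reg`, and the toy data are all hypothetical.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def discriminative_direction(neighbors, candidate, reg=1e-3):
    """Return (w, mean): a unit direction separating candidate from its neighbors.

    Stand-in criterion (an assumption, not the paper's objective):
    maximize (w . (candidate - mean))^2 / (w^T S w), with S the regularized
    scatter of the neighbors. The maximizer is w ~ S^{-1} (candidate - mean).
    """
    d = len(candidate)
    mean = [sum(p[j] for p in neighbors) / len(neighbors) for j in range(d)]
    # Regularized scatter matrix of the regular neighborhood.
    scatter = [[reg if i == j else 0.0 for j in range(d)] for i in range(d)]
    for p in neighbors:
        c = [p[j] - mean[j] for j in range(d)]
        for i in range(d):
            for j in range(d):
                scatter[i][j] += c[i] * c[j]
    diff = [candidate[j] - mean[j] for j in range(d)]
    w = solve(scatter, diff)
    norm = math.sqrt(sum(v * v for v in w)) or 1.0  # guard: candidate == mean
    return [v / norm for v in w], mean


def project(w, mean, point):
    """Signed 1-D coordinate of point along w, relative to the neighborhood mean."""
    return sum(w[j] * (point[j] - mean[j]) for j in range(len(w)))


# Toy data: neighbors spread widely in feature 0 and tightly in features 1 and 2.
neighbors = [
    [1.0, 0.1, 0.0],
    [2.0, -0.1, 0.1],
    [3.0, 0.1, -0.1],
    [4.0, -0.1, 0.0],
    [5.0, 0.0, 0.1],
    [6.0, 0.0, -0.1],
]
# The candidate looks unremarkable in features 0 and 2 but deviates in feature 1.
candidate = [3.5, 1.0, 0.0]

w, mean = discriminative_direction(neighbors, candidate)
# Features ranked by absolute weight in w: a rough proxy for "which features explain it".
ranking = sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)
outlier_score = abs(project(w, mean, candidate))
neighbor_scores = [abs(project(w, mean, p)) for p in neighbors]
print("direction:", w)
print("feature ranking:", ranking)
print("outlier score:", outlier_score, "max neighbor score:", max(neighbor_scores))
```

On this toy data the ranking places feature index 1 first, matching the feature in which the candidate was made to deviate, and the candidate's projected score exceeds that of every neighbor. Reading feature weights off a single direction is only one possible way to attach an explanation to a detected outlier; the paper's own method additionally aims to preserve local structure, which this sketch does not attempt.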