A High-dimensional Anomaly Detection Algorithm Based on IForest with Autoencoder
Authors: Jinhong Yang, Xinxin Yang, Zhenyu Zhang
Venue: 2022 4th International Conference on Data-driven Optimization of Complex Systems (DOCS)
Published: 2022-10-28
DOI: 10.1109/DOCS55193.2022.9967746 (https://doi.org/10.1109/DOCS55193.2022.9967746)
Existing anomaly detection algorithms based on the isolation forest are limited by the height of the isolation trees. High-dimensional problem domains pose significant challenges for anomaly detection: irrelevant features can conceal anomalies. This problem, known as the curse of dimensionality, is an obstacle for many anomaly detection techniques. Building a robust anomaly detection model for high-dimensional data requires combining an unsupervised feature extractor with an anomaly detector. This paper proposes AE-IForest, a high-dimensional anomaly detection algorithm based on the isolation forest with a deep autoencoder. First, AE-IForest maps the high-dimensional, nonlinear original data to a low-dimensional space with a deep autoencoder network. In the low-dimensional space, the isolation forest algorithm ranks the data by isolation score, and this score is fused with the reconstruction error of each sample to detect abnormal data. Experimental results on six data sets show that AE-IForest detects anomalies better than three classical algorithms: LOF, IForest, and SVDD. AE-IForest is an efficient anomaly detection model for high-dimensional data.
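The pipeline described in the abstract (encode, score by isolation in the low-dimensional space, fuse with the reconstruction error) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not give the fusion rule, so the weighted sum of min-max-normalized scores and the weight `alpha` are hypothetical. The `encode` and `decode` functions are hand-written stand-ins for a trained deep autoencoder, and the small isolation forest follows the standard formulation (random axis-aligned splits, score 2^(-E[h(x)]/c(n))).

```python
import math
import random

EULER_GAMMA = 0.5772156649

def c_factor(n):
    """Average path length of an unsuccessful BST search over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n

def build_tree(points, depth, max_depth, rng):
    """Recursively split on a random feature at a random threshold."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # constant feature: this sketch stops instead of retrying
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {"dim": dim, "split": split,
            "left": build_tree(left, depth + 1, max_depth, rng),
            "right": build_tree(right, depth + 1, max_depth, rng)}

def path_length(tree, point, depth=0):
    """Depth at which `point` lands, plus c(size) for unsplit leaves."""
    if "size" in tree:
        return depth + c_factor(tree["size"])
    side = "left" if point[tree["dim"]] < tree["split"] else "right"
    return path_length(tree[side], point, depth + 1)

def isolation_scores(points, n_trees=100, sample_size=256, seed=0):
    """Isolation score in (0, 1]; higher means easier to isolate.
    Assumes at least two points."""
    rng = random.Random(seed)
    sample_size = min(sample_size, len(points))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [build_tree(rng.sample(points, sample_size), 0, max_depth, rng)
             for _ in range(n_trees)]
    norm = n_trees * c_factor(sample_size)
    return [2.0 ** (-sum(path_length(t, p) for t in trees) / norm)
            for p in points]

def min_max(values):
    """Rescale to [0, 1] so the two scores are comparable."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def ae_iforest_scores(data, encode, decode, alpha=0.5, seed=0):
    """Fuse isolation score (on codes) with reconstruction error.
    The weighted sum and `alpha` are assumptions of this sketch."""
    codes = [encode(x) for x in data]
    recon_err = [sum((a - b) ** 2 for a, b in zip(x, decode(z)))
                 for x, z in zip(data, codes)]
    iso = isolation_scores(codes, seed=seed)
    return [alpha * i + (1.0 - alpha) * r
            for i, r in zip(min_max(iso), min_max(recon_err))]

# Toy stand-in for a trained autoencoder: data lie near the plane
# x3 = x1 + x2, so two coordinates suffice as the low-dimensional code.
def encode(x):
    return (x[0], x[1])

def decode(z):
    return (z[0], z[1], z[0] + z[1])

rng = random.Random(42)
inliers = []
for _ in range(200):
    a, b = rng.gauss(0, 1), rng.gauss(0, 1)
    inliers.append((a, b, a + b + rng.gauss(0, 0.05)))
far_in_latent = (10.0, 10.0, 20.0)  # on the plane, far from the cluster
off_plane = (0.0, 0.0, 5.0)         # ordinary code, poor reconstruction
data = inliers + [far_in_latent, off_plane]

scores = ae_iforest_scores(data, encode, decode)
ranked = sorted(range(len(data)), key=lambda i: scores[i], reverse=True)
print("indices of the two highest fused scores:", ranked[:2])
```

The two planted anomalies illustrate why the scores are fused: the first is isolated quickly in the low-dimensional space but reconstructs perfectly, while the second has an unremarkable code and is exposed only by its reconstruction error. In this toy setup both are expected to rank above the inliers.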