Dongsheng Yang, Yijie Wang, Yongmou Li, Xingkong Ma
2016 17th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT), published 2016-12-01. DOI: 10.1109/PDCAT.2016.049 (https://doi.org/10.1109/PDCAT.2016.049)
A Variable Markovian Based Outlier Detection Method for Multi-Dimensional Sequence over Data Stream
Sequence data increasingly arrives as multi-dimensional sequences over data streams, with a large state space and at high speed. Designing an outlier detection method for multi-dimensional sequences that is both accurate and fast is therefore a major challenge. Traditional methods handle multi-dimensional sequences poorly because their sequence modeling is weak, and they cannot detect outliers in a timely manner because of their high computational complexity. In this paper we propose VMOD, a variable Markovian based outlier detection method for multi-dimensional sequences over data streams. VMOD consists of two algorithms: a mutual information based feature selection algorithm (MIFS) and a variable Markovian based sequential analysis algorithm (VMSA). MIFS uses mutual information as the similarity measure and adopts a clustering-based strategy to select features; by reducing the state space and removing redundant features, it improves sequence modeling and, consequently, the detection rate. VMSA uses random sampling and an index structure to accelerate construction of the variable Markovian model and to reduce model complexity, which speeds up outlier detection. Experiments show that VMOD detects outliers effectively and reduces detection time by at least 50% compared with traditional methods.
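The abstract gives no pseudocode, so the following is only a minimal Python sketch of the two ideas it names, not the authors' MIFS or VMSA. It assumes discrete feature values, a greedy one-pass clustering with a hypothetical `threshold` parameter, and a plain dictionary as the context index; the random sampling step mentioned in the abstract is omitted. All function names and parameters are illustrative.

```python
import math
from collections import Counter, defaultdict

def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y), in bits, of two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # p(x,y) * log2(p(x,y) / (p(x) p(y))), written with raw counts
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

def select_features(columns, threshold):
    """Greedy clustering (an assumption): a feature is treated as redundant if it
    shares at least `threshold` bits of MI with an already chosen representative."""
    reps = []
    for name, values in columns.items():
        if not any(mutual_information(values, columns[r]) >= threshold for r in reps):
            reps.append(name)
    return reps

def train_vmm(sequences, max_order):
    """Count next-symbol frequencies for every context of length 0..max_order.
    The dict keyed by context tuples stands in for an index structure."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i, sym in enumerate(seq):
            for k in range(min(max_order, i) + 1):
                counts[tuple(seq[i - k:i])][sym] += 1
    return counts

def anomaly_score(counts, seq, max_order, alphabet_size):
    """Average negative log2-probability per symbol, predicting each symbol from
    the longest context seen in training (add-one smoothing). Higher = more unusual."""
    total = 0.0
    for i, sym in enumerate(seq):
        ctx = ()
        for k in range(min(max_order, i), -1, -1):  # back off to shorter contexts
            ctx = tuple(seq[i - k:i])
            if ctx in counts:
                break
        nexts = counts.get(ctx, Counter())
        p = (nexts[sym] + 1) / (sum(nexts.values()) + alphabet_size)
        total -= math.log2(p)
    return total / len(seq)

# Toy usage: f2 mirrors f1 (redundant), f3 is independent of f1.
columns = {"f1": [0, 0, 1, 1], "f2": [1, 1, 0, 0], "f3": [0, 1, 0, 1]}
kept = select_features(columns, threshold=0.5)

model = train_vmm(["abababab"] * 3, max_order=2)
normal = anomaly_score(model, "abab", max_order=2, alphabet_size=2)
odd = anomaly_score(model, "aabb", max_order=2, alphabet_size=2)
```

In this toy run the alternating sequence scores lower than the one with repeated symbols, which is the behavior an outlier threshold would rely on. How VMOD chooses its clusters, context lengths, and thresholds is not stated in the abstract.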