Dong Wei, Wu Sun, Xiaofeng Zou, Dan Ma, Huarong Xu, Panfeng Chen, Chaoshu Yang, Mei Chen, Hui Li
{"title":"具有异常感知的多变量时间序列异常检测模型","authors":"Dong Wei, Wu Sun, Xiaofeng Zou, Dan Ma, Huarong Xu, Panfeng Chen, Chaoshu Yang, Mei Chen, Hui Li","doi":"10.7717/peerj-cs.2172","DOIUrl":null,"url":null,"abstract":"Multivariate time series anomaly detection is a crucial data mining technique with a wide range of applications in areas such as IT applications. Currently, the majority of anomaly detection methods for time series data rely on unsupervised approaches due to the rarity of anomaly labels. However, in real-world scenarios, obtaining a limited number of anomaly labels is feasible and affordable. Effective usage of these labels can offer valuable insights into the temporal characteristics of anomalies and play a pivotal role in guiding anomaly detection efforts. To improve the performance of multivariate time series anomaly detection, we proposed a novel deep learning model named EDD (Encoder-Decoder-Discriminator) that leverages limited anomaly samples. The EDD model innovatively integrates a graph attention network with long short term memory (LSTM) to extract spatial and temporal features from multivariate time series data. This integrated approach enables the model to capture complex patterns and dependencies within the data. Additionally, the model skillfully maps series data into a latent space, utilizing a carefully crafted loss function to cluster normal data tightly in the latent space while dispersing abnormal data randomly. This innovative design results in distinct probability distributions for normal and abnormal data in the latent space, enabling precise identification of anomalous data. To evaluate the performance of our EDD model, we conducted extensive experimental validation across three diverse datasets. The results demonstrate the significant superiority of our model in multivariate time series anomaly detection. Specifically, the average F1-Score of our model outperformed the second-best method by 2.7% and 73.4% in both evaluation approaches, respectively, highlighting its superior detection capabilities. These findings validate the effectiveness of our proposed EDD model in leveraging limited anomaly samples for accurate and robust anomaly detection in multivariate time series data.","PeriodicalId":3,"journal":{"name":"ACS Applied Electronic Materials","volume":null,"pages":null},"PeriodicalIF":4.3000,"publicationDate":"2024-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"An anomaly detection model for multivariate time series with anomaly perception\",\"authors\":\"Dong Wei, Wu Sun, Xiaofeng Zou, Dan Ma, Huarong Xu, Panfeng Chen, Chaoshu Yang, Mei Chen, Hui Li\",\"doi\":\"10.7717/peerj-cs.2172\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Multivariate time series anomaly detection is a crucial data mining technique with a wide range of applications in areas such as IT applications. Currently, the majority of anomaly detection methods for time series data rely on unsupervised approaches due to the rarity of anomaly labels. However, in real-world scenarios, obtaining a limited number of anomaly labels is feasible and affordable. Effective usage of these labels can offer valuable insights into the temporal characteristics of anomalies and play a pivotal role in guiding anomaly detection efforts. To improve the performance of multivariate time series anomaly detection, we proposed a novel deep learning model named EDD (Encoder-Decoder-Discriminator) that leverages limited anomaly samples. The EDD model innovatively integrates a graph attention network with long short term memory (LSTM) to extract spatial and temporal features from multivariate time series data. This integrated approach enables the model to capture complex patterns and dependencies within the data. Additionally, the model skillfully maps series data into a latent space, utilizing a carefully crafted loss function to cluster normal data tightly in the latent space while dispersing abnormal data randomly. This innovative design results in distinct probability distributions for normal and abnormal data in the latent space, enabling precise identification of anomalous data. To evaluate the performance of our EDD model, we conducted extensive experimental validation across three diverse datasets. The results demonstrate the significant superiority of our model in multivariate time series anomaly detection. Specifically, the average F1-Score of our model outperformed the second-best method by 2.7% and 73.4% in both evaluation approaches, respectively, highlighting its superior detection capabilities. These findings validate the effectiveness of our proposed EDD model in leveraging limited anomaly samples for accurate and robust anomaly detection in multivariate time series data.\",\"PeriodicalId\":3,\"journal\":{\"name\":\"ACS Applied Electronic Materials\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":4.3000,\"publicationDate\":\"2024-07-31\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACS Applied Electronic Materials\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.7717/peerj-cs.2172\",\"RegionNum\":3,\"RegionCategory\":\"材料科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACS Applied Electronic Materials","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.7717/peerj-cs.2172","RegionNum":3,"RegionCategory":"材料科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
An anomaly detection model for multivariate time series with anomaly perception
Multivariate time series anomaly detection is a crucial data mining technique with a wide range of applications in areas such as IT applications. Currently, the majority of anomaly detection methods for time series data rely on unsupervised approaches due to the rarity of anomaly labels. However, in real-world scenarios, obtaining a limited number of anomaly labels is feasible and affordable. Effective usage of these labels can offer valuable insights into the temporal characteristics of anomalies and play a pivotal role in guiding anomaly detection efforts. To improve the performance of multivariate time series anomaly detection, we proposed a novel deep learning model named EDD (Encoder-Decoder-Discriminator) that leverages limited anomaly samples. The EDD model innovatively integrates a graph attention network with long short term memory (LSTM) to extract spatial and temporal features from multivariate time series data. This integrated approach enables the model to capture complex patterns and dependencies within the data. Additionally, the model skillfully maps series data into a latent space, utilizing a carefully crafted loss function to cluster normal data tightly in the latent space while dispersing abnormal data randomly. This innovative design results in distinct probability distributions for normal and abnormal data in the latent space, enabling precise identification of anomalous data. To evaluate the performance of our EDD model, we conducted extensive experimental validation across three diverse datasets. The results demonstrate the significant superiority of our model in multivariate time series anomaly detection. Specifically, the average F1-Score of our model outperformed the second-best method by 2.7% and 73.4% in both evaluation approaches, respectively, highlighting its superior detection capabilities. These findings validate the effectiveness of our proposed EDD model in leveraging limited anomaly samples for accurate and robust anomaly detection in multivariate time series data.