STING: Self-attention based Time-series Imputation Networks using GAN
Authors: Eunkyu Oh, Taehun Kim, Yunhu Ji, Sushil Khyalia
Venue: 2021 IEEE International Conference on Data Mining (ICDM)
Published: 2021-12-01
DOI: 10.1109/ICDM51629.2021.00155 (https://doi.org/10.1109/ICDM51629.2021.00155)
Citations: 10
Abstract
Time series data are ubiquitous in real-world applications, but they can contain missing values because of the nature of the data collection process. Imputing missing values in multivariate (correlated) time series is therefore essential for improving prediction performance and making accurate data-driven decisions. Conventional imputation approaches simply delete missing values or fill them with the mean or zero. Although recent methods based on deep neural networks have shown remarkable results, they remain limited in capturing the complex generation process of multivariate time series. In this paper, we propose a novel imputation method for multivariate time series, called STING (Self-attention based Time-series Imputation Networks using GAN). We use generative adversarial networks and bidirectional recurrent neural networks to learn latent representations of time series. In addition, we introduce a novel attention mechanism that captures the weighted correlations across a whole sequence and avoids the potential bias introduced by unrelated ones. Experimental results on three real-world datasets demonstrate that STING outperforms existing state-of-the-art methods in imputation accuracy and in downstream tasks that use the imputed values.
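For context, the conventional baselines mentioned in the abstract (deleting missing values, or filling them with the mean or zero) can be sketched as follows. This is only an illustrative sketch of those simple baselines, not of STING itself; the function names, the use of NaN to mark missing entries, and the toy array are assumptions made for the example.

```python
import numpy as np

# Assumption: a multivariate series is a 2-D array of shape
# (time steps, variables), with NaN marking a missing entry.

def drop_missing(x):
    """Delete every time step (row) that has at least one missing value."""
    return x[~np.isnan(x).any(axis=1)]

def impute_zero(x):
    """Replace every missing entry with 0."""
    return np.where(np.isnan(x), 0.0, x)

def impute_mean(x):
    """Replace each missing entry with the mean of the observed
    values of the same variable (column)."""
    col_mean = np.nanmean(x, axis=0)  # per-variable mean, ignoring NaN
    return np.where(np.isnan(x), col_mean, x)

# Hypothetical toy series: 4 time steps, 2 variables.
series = np.array([
    [1.0, 10.0],
    [np.nan, 12.0],
    [3.0, np.nan],
    [5.0, 14.0],
])

dropped = drop_missing(series)
zero_filled = impute_zero(series)
mean_filled = impute_mean(series)
```

None of these baselines uses the temporal order or the correlation between variables, which is the limitation that learned imputation methods aim to address.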