Dominik Ostroski, Karlo Slovenec, Ivona Brajdic, M. Mikuc
Anomaly Correction in Time Series Data for Improved Forecasting
2021 16th International Conference on Telecommunications (ConTEL), 2021-06-30
DOI: https://doi.org/10.23919/ConTEL52528.2021.9495986
Citations: 0
Abstract
This paper presents a method for detecting and correcting anomalies in time series data. The method was tested on disk usage time series spanning a few months. To detect anomalies, the method computes the difference of the time series, finds the mean value of the transformed data, and uses it to set a threshold. Any point in the transformed data whose value exceeds the threshold corresponds to an anomaly in the original data. Once an anomaly is found, all data before it is shifted by the value of the anomaly. Removing anomalies this way keeps the trend and seasonality of the time series intact. Results show that forecasting on the transformed disk usage time series produces better results than forecasting on the original data.
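The procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how the threshold is derived from the mean, so the sketch assumes the threshold is a multiple `k` of the mean absolute first difference, and it assumes the "value of the anomaly" is the size of the detected jump. The function name `correct_anomalies` and the parameter `k` are hypothetical.

```python
def correct_anomalies(series, k=3.0):
    """Detect level jumps in a series and shift earlier data to remove them.

    Assumptions (not stated in the abstract):
      - threshold = k * mean(|first difference|)
      - the "value of the anomaly" is the size of the jump itself
    Returns (corrected_values, anomaly_indices).
    """
    values = list(series)
    if len(values) < 2:
        return values, []

    # First difference: diffs[i - 1] is the step from point i - 1 to point i.
    diffs = [values[i] - values[i - 1] for i in range(1, len(values))]

    # Threshold derived from the mean of the transformed (differenced) data.
    mean_abs_diff = sum(abs(d) for d in diffs) / len(diffs)
    threshold = k * mean_abs_diff

    anomalies = []
    for i, step in enumerate(diffs, start=1):
        if abs(step) > threshold:
            anomalies.append(i)
            # Shift everything before the anomaly by the jump, so the series
            # continues smoothly into the post-jump level.
            for j in range(i):
                values[j] += step
    return values, anomalies


# Disk usage that grows by 1 unit per step, with a sudden +40 jump at index 4.
usage = [10, 11, 12, 13, 53, 54, 55, 56]
corrected, found = correct_anomalies(usage)
print(found)      # [4]
print(corrected)  # [50, 51, 52, 53, 53, 54, 55, 56]
```

Because only the points before the jump are shifted, the later differences are unchanged, which matches the abstract's point that trend and seasonality are preserved. In this sketch the jump step itself is flattened to zero. Whether the original method restores the expected step there is not stated in the abstract.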