Improving disk failure detection accuracy via data augmentation
Wang Wang, Xuehai Tang, Biyu Zhou, Wenjie Xiao, Jizhong Han, Songlin Hu
2022 IEEE/ACM 30th International Symposium on Quality of Service (IWQoS), published 2022-06-10
DOI: 10.1109/IWQoS54832.2022.9812864
Abstract
Frequent disk failures seriously affect the dependability and service quality of cloud data centers. Recently, machine learning (ML) based methods have been widely adopted to proactively predict forthcoming disk failures via supervised learning. However, the severe imbalance between failure samples and healthy samples makes it difficult for existing detection methods to build a high-performance detection model. This paper presents MSGMD, a data augmentation method that efficiently generates high-quality failure samples to alleviate the data imbalance of the training set and thereby improve the performance of any supervised failure detection model. First, MSGMD converts failure samples (multivariate time series) into multiple univariate time series by decomposing the spatial relations among features. Then it learns the temporal correlation of each feature via a policy-based reinforcement learning model trained in an adversarial way. After that, it generates failure samples by combining feature series sampled from the learned distribution. Finally, it filters out low-quality generated samples with a confidence-based method. Experimental results on real-world datasets show that, through data augmentation, MSGMD improves the FDR and F1-Score of the state-of-the-art disk failure detection model by 31.59% and 30.74% on average, respectively.
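The four-stage flow described in the abstract (decompose, learn per-feature dynamics, sample and recombine, filter by confidence) can be illustrated with a minimal sketch. This is not the MSGMD implementation: the adversarially trained, policy-based reinforcement learning generator is replaced here by a much simpler stand-in that resamples observed first differences, and all function names, the `score_fn` confidence function, and the `threshold` parameter are hypothetical. The sketch assumes failure samples are stored as an array of shape `(n, T, F)` with `T >= 2`.

```python
import numpy as np

def split_features(samples):
    """Decompose (n, T, F) multivariate samples into F arrays of shape (n, T)."""
    return [samples[:, :, f] for f in range(samples.shape[2])]

def fit_feature_model(series):
    """Stand-in for a learned per-feature generator (NOT the paper's RL model):
    keep the observed start values and the pooled step-to-step differences."""
    starts = series[:, 0]
    deltas = np.diff(series, axis=1).ravel()
    return starts, deltas

def sample_series(model, length, rng):
    """Draw one univariate series by resampling a start value and step deltas."""
    starts, deltas = model
    start = rng.choice(starts)
    steps = rng.choice(deltas, size=length - 1)
    return np.concatenate(([start], start + np.cumsum(steps)))

def generate(samples, n_new, rng):
    """Sample each feature independently and recombine into (n_new, T, F)."""
    _, length, n_features = samples.shape
    models = [fit_feature_model(s) for s in split_features(samples)]
    out = np.empty((n_new, length, n_features))
    for i in range(n_new):
        for f, model in enumerate(models):
            out[i, :, f] = sample_series(model, length, rng)
    return out

def filter_by_confidence(generated, score_fn, threshold):
    """Keep generated samples whose confidence score reaches the threshold.
    score_fn is a hypothetical callable mapping one (T, F) sample to a float."""
    scores = np.array([score_fn(g) for g in generated])
    return generated[scores >= threshold]
```

In use, `score_fn` could be the failure-class probability reported by any already-trained detector; the surviving samples would then be appended to the training set as additional failure examples. How the paper actually computes confidence is not stated in the abstract, so that choice is an assumption of this sketch.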