Improving disk failure detection accuracy via data augmentation

Wang Wang, Xuehai Tang, Biyu Zhou, Wenjie Xiao, Jizhong Han, Songlin Hu
{"title":"Improving disk failure detection accuracy via data augmentation","authors":"Wang Wang, Xuehai Tang, Biyu Zhou, Wenjie Xiao, Jizhong Han, Songlin Hu","doi":"10.1109/IWQoS54832.2022.9812864","DOIUrl":null,"url":null,"abstract":"Frequently happening of disk failures seriously affects the dependability and service quality of cloud data centers. Recently, machine learning (ML) based methods are popularly adopted to proactively predict forthcoming disk failures via supervised learning. However, the high imbalance of failure samples and healthy samples is a huge obstacle for existing detection methods to establish high performance detection model. This paper presents a data augmentation method MSGMD, which can efficiently generate high quality failure samples to alleviate the data imbalance of the training set, so as to effectively improve the performance of any supervised failure detection models. First, MSGMD converts failure samples (multivariate time series) into multiple univariate time series via decomposing the spatial relations among features. Then it learns the temporal correlation of each feature via a policy-based reinforcement learning model trained in an adversarial way. After that, it generates failure samples by combining feature series sampled from learned distribution. Finally, it filters out low quality generated samples with a confidence-based method. 
Experimental results on real-world datasets show that, through data augmentation, MSGMD can improve the FDR and F1-Score of the state-of-the-art disk failure detection model by 31.59% and 30.74% respectively on average.","PeriodicalId":353365,"journal":{"name":"2022 IEEE/ACM 30th International Symposium on Quality of Service (IWQoS)","volume":"10 23 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE/ACM 30th International Symposium on Quality of Service (IWQoS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IWQoS54832.2022.9812864","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0

Abstract

Frequent disk failures seriously affect the dependability and service quality of cloud data centers. Recently, machine learning (ML) based methods have been widely adopted to proactively predict forthcoming disk failures via supervised learning. However, the severe imbalance between failure samples and healthy samples is a major obstacle to building a high-performance detection model with existing methods. This paper presents MSGMD, a data augmentation method that can efficiently generate high-quality failure samples to alleviate the data imbalance of the training set and thereby improve the performance of any supervised failure detection model. First, MSGMD converts failure samples (multivariate time series) into multiple univariate time series by decomposing the spatial relations among features. Then it learns the temporal correlation of each feature via a policy-based reinforcement learning model trained in an adversarial way. After that, it generates failure samples by combining feature series sampled from the learned distribution. Finally, it filters out low-quality generated samples with a confidence-based method. Experimental results on real-world datasets show that, through data augmentation, MSGMD can improve the FDR and F1-score of the state-of-the-art disk failure detection model by 31.59% and 30.74% on average, respectively.
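
The four stages described in the abstract (decompose, learn per-feature temporal behavior, recombine, filter by confidence) can be illustrated with a minimal sketch. This is not the authors' implementation. The per-feature generator below is a simple stand-in that resamples real step-to-step increments, used in place of the adversarially trained policy-based reinforcement learning model. The confidence score is a hypothetical centroid-distance ratio, not the paper's filter. All function names, array shapes, and the threshold are assumptions made for illustration.

```python
import numpy as np

def decompose(samples):
    """Split (n_samples, n_steps, n_features) into one (n_samples, n_steps) array per feature."""
    return [samples[:, :, j] for j in range(samples.shape[2])]

def generate_univariate(series, n_new, rng):
    """Stand-in generator (assumption): start from a real initial value and
    add increments resampled from the real increments at each time step."""
    n_real, n_steps = series.shape
    increments = np.diff(series, axis=1)                  # (n_real, n_steps - 1)
    starts = rng.choice(series[:, 0], size=n_new)         # (n_new,)
    idx = rng.integers(0, n_real, size=(n_new, n_steps - 1))
    steps = increments[idx, np.arange(n_steps - 1)]       # (n_new, n_steps - 1)
    paths = starts[:, None] + np.cumsum(steps, axis=1)
    return np.concatenate([starts[:, None], paths], axis=1)

def confidence(samples, failure_real, healthy_real):
    """Hypothetical confidence: relative closeness to the failure centroid (0..1)."""
    flat = samples.reshape(len(samples), -1)
    c_fail = failure_real.reshape(len(failure_real), -1).mean(axis=0)
    c_heal = healthy_real.reshape(len(healthy_real), -1).mean(axis=0)
    d_fail = np.linalg.norm(flat - c_fail, axis=1)
    d_heal = np.linalg.norm(flat - c_heal, axis=1)
    return d_heal / (d_fail + d_heal + 1e-12)

def augment(failure_real, healthy_real, n_new, threshold=0.5, seed=0):
    """Generate failure samples feature by feature, recombine, and keep confident ones."""
    rng = np.random.default_rng(seed)
    per_feature = [generate_univariate(s, n_new, rng) for s in decompose(failure_real)]
    generated = np.stack(per_feature, axis=2)             # (n_new, n_steps, n_features)
    keep = confidence(generated, failure_real, healthy_real) >= threshold
    return generated[keep]

# Synthetic demo data (assumption): failures drift upward over time, healthy disks do not.
demo_rng = np.random.default_rng(1)
healthy = demo_rng.normal(0.0, 1.0, size=(200, 10, 3))
failure = demo_rng.normal(0.0, 0.5, size=(8, 10, 3)) + np.linspace(0, 5, 10)[None, :, None]
synthetic_failures = augment(failure, healthy, n_new=50)
print(synthetic_failures.shape)
```

The retained synthetic samples would then be appended to the minority class of the training set before fitting any supervised detector. Generating each feature independently mirrors the decomposition step in the abstract, which is why the filter at the end matters: recombined series can be implausible as a whole even when each feature looks realistic on its own.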