Hard Disk Failure Prediction Based on Lightgbm with CID

Hengrui Wang, Yahui Yang, Hongzhang Yang
Published in: 2021 IEEE Symposium on Computers and Communications (ISCC), 2021-09-05
DOI: 10.1109/ISCC53001.2021.9631504 (https://doi.org/10.1109/ISCC53001.2021.9631504)
Citations: 3

Abstract

In data centers, hard disks are the IT equipment most prone to failure. Even with data backups in place, hard disk failures still threaten data reliability. In recent years, many hard disk failure prediction approaches based on SMART data have been proposed. In this paper, we propose a novel disk failure prediction approach based on the LightGBM algorithm with CID (complexity-invariant distance) features. Our failure prediction model was built and evaluated on SMART data from about 80,000 hard disks from two manufacturers. The experimental results show that adding CID features increases the TPR from 0.28 to 0.96 and extends the lead time with which the model predicts failures by 1.2 days. Compared with several existing failure prediction models, our model achieves a better AUC score, F1-score, and TPR.
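The abstract does not say how the CID features are derived from the SMART data. As a rough illustration only, the sketch below implements the standard definition of the complexity-invariant distance: the Euclidean distance between two equal-length series, multiplied by the ratio of the larger to the smaller complexity estimate, where the complexity estimate is the square root of the summed squared consecutive differences. The helper `cid_feature`, its windowing scheme, and the handling of flat (zero-complexity) series are assumptions made for this example and are not taken from the paper.

```python
import math
from typing import Sequence


def complexity_estimate(series: Sequence[float]) -> float:
    """Square root of the summed squared differences between consecutive points."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(series, series[1:])))


def euclidean_distance(q: Sequence[float], c: Sequence[float]) -> float:
    """Plain Euclidean distance between two equal-length series."""
    if len(q) != len(c):
        raise ValueError("series must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(q, c)))


def cid(q: Sequence[float], c: Sequence[float]) -> float:
    """Complexity-invariant distance: Euclidean distance times a complexity ratio."""
    ed = euclidean_distance(q, c)
    ce_q, ce_c = complexity_estimate(q), complexity_estimate(c)
    lo, hi = min(ce_q, ce_c), max(ce_q, ce_c)
    if hi == 0.0:
        # Assumption: two flat series get a correction factor of 1.
        return ed
    if lo == 0.0:
        # Assumption: one flat and one non-flat series are treated as
        # infinitely far apart, since the ratio is undefined.
        return math.inf
    return ed * (hi / lo)


def cid_feature(values: Sequence[float], window: int) -> float:
    """Hypothetical feature: CID between the latest window of one SMART
    attribute and the window immediately before it."""
    if window < 1 or len(values) < 2 * window:
        raise ValueError("need at least two full windows of readings")
    recent = values[-window:]
    previous = values[-2 * window:-window]
    return cid(recent, previous)
```

Under these assumptions, a value such as `cid_feature(readings, 7)` for each SMART attribute could be appended as an extra column alongside the raw readings before training a gradient-boosted classifier. The window length of 7 is purely illustrative.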