Using Hidden Semi-Markov Models for Effective Online Failure Prediction

Felix Salfner, M. Malek
2007 26th IEEE International Symposium on Reliable Distributed Systems (SRDS 2007)
Published: 2007-10-10
DOI: 10.1109/SRDS.2007.35 (https://doi.org/10.1109/SRDS.2007.35)
Citations: 161

Abstract

Proactive handling of faults requires that the risk of upcoming failures be continuously assessed. One promising approach is online failure prediction, in which the current state of the system is evaluated in order to predict the occurrence of failures in the near future. More specifically, we focus on methods that use event-driven sources such as errors. We use hidden semi-Markov models (HSMMs) for this purpose and demonstrate their effectiveness based on field data from a commercial telecommunication system. For comparative analysis we selected three well-known failure prediction techniques: a straightforward method based on a reliability model, the dispersion frame technique by Lin and Siewiorek, and the eventset-based method introduced by Vilalta et al. We assess and compare the methods in terms of precision, recall, F-measure, false-positive rate, and computing time. The experiments suggest that our HSMM approach is very effective with respect to online failure prediction.
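
The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes the standard confusion-matrix definitions of precision, recall, F-measure (taken here as the balanced harmonic mean), and false-positive rate; the function name `prediction_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
def prediction_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard failure-prediction metrics from confusion-matrix counts.

    tp: failure warnings followed by a failure (true positives)
    fp: failure warnings not followed by a failure (false positives)
    tn: no warning and no failure (true negatives)
    fn: failures that were not predicted (false negatives)
    """
    # Fraction of raised warnings that were correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Fraction of actual failures that were predicted.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Harmonic mean of precision and recall (assumed balanced weighting).
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    # Fraction of non-failure cases that wrongly triggered a warning.
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "false_positive_rate": false_positive_rate,
    }


# Hypothetical counts, for illustration only (not results from the paper).
metrics = prediction_metrics(tp=8, fp=2, tn=88, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting the false-positive rate alongside precision and recall matters when failures are rare, because a predictor can then raise few false alarms relative to all non-failure cases while still having low precision.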