Tracking Probabilistic Correlation of Monitoring Data for Fault Detection in Complex Systems

Zhen Guo, Guofei Jiang, Haifeng Chen, K. Yoshihira
{"title":"Tracking Probabilistic Correlation of Monitoring Data for Fault Detection in Complex Systems","authors":"Zhen Guo, Guofei Jiang, Haifeng Chen, K. Yoshihira","doi":"10.1109/DSN.2006.70","DOIUrl":null,"url":null,"abstract":"Due to their growing complexity, it becomes extremely difficult to detect and isolate faults in complex systems. While large amount of monitoring data can be collected from such systems for fault analysis, one challenge is how to correlate the data effectively across distributed systems and observation time. Much of the internal monitoring data reacts to the volume of user requests accordingly when user requests flow through distributed systems. In this paper, we use Gaussian mixture models to characterize probabilistic correlation between flow-intensities measured at multiple points. A novel algorithm derived from expectation-maximization (EM) algorithm is proposed to learn the \"likely\" boundary of normal data relationship, which is further used as an oracle in anomaly detection. Our recursive algorithm can adaptively estimate the boundary of dynamic data relationship and detect faults in real time. Our approach is tested in a real system with injected faults and the results demonstrate its feasibility","PeriodicalId":228470,"journal":{"name":"International Conference on Dependable Systems and Networks (DSN'06)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2006-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"90","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Conference on Dependable Systems and Networks (DSN'06)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/DSN.2006.70","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 90

Abstract

Due to their growing complexity, it becomes extremely difficult to detect and isolate faults in complex systems. While large amount of monitoring data can be collected from such systems for fault analysis, one challenge is how to correlate the data effectively across distributed systems and observation time. Much of the internal monitoring data reacts to the volume of user requests accordingly when user requests flow through distributed systems. In this paper, we use Gaussian mixture models to characterize probabilistic correlation between flow-intensities measured at multiple points. A novel algorithm derived from expectation-maximization (EM) algorithm is proposed to learn the "likely" boundary of normal data relationship, which is further used as an oracle in anomaly detection. Our recursive algorithm can adaptively estimate the boundary of dynamic data relationship and detect faults in real time. Our approach is tested in a real system with injected faults and the results demonstrate its feasibility
复杂系统故障检测监测数据的概率相关性跟踪
由于复杂系统的复杂性日益增加,在复杂系统中检测和隔离故障变得极其困难。虽然可以从这些系统中收集大量的监测数据用于故障分析,但一个挑战是如何跨分布式系统和观察时间有效地关联数据。当用户请求流经分布式系统时,许多内部监控数据会对用户请求量做出相应的反应。在本文中,我们使用高斯混合模型来描述在多点测量的流强度之间的概率相关性。在期望最大化(EM)算法的基础上,提出了一种学习正常数据关系“可能”边界的新算法,并将其作为异常检测的“预言器”。递归算法能够自适应估计动态数据关系的边界,实时检测故障。在实际系统中对该方法进行了测试,结果表明了该方法的可行性
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信