Anomaly-based Fault Detection System in Distributed System

B. Kim, S. Hariri
Published in: 5th ACIS International Conference on Software Engineering Research, Management & Applications (SERA 2007), 2007-08-20
DOI: 10.1109/SERA.2007.55 (https://doi.org/10.1109/SERA.2007.55)
Citations: 14

Abstract

Reliability and robustness to hardware and software failures are important design criteria for distributed systems and their applications. The growing complexity, interconnectedness, and dependency of their components, together with the asynchronous interactions among them, make fault detection and fault tolerance a challenging research problem; these components include hardware resources (computers, servers, network devices) and software (application services, middleware, web services, etc.). In this paper, we present an approach based on statistical and data mining techniques that detects hardware or software faults and identifies the source of each fault. We monitor and analyze in real time the interactions among all components of a distributed system, and we use data mining and supervised learning techniques to obtain rules that accurately model the normal interactions among these components. Our anomaly analysis engine produces an alert as soon as one or more of the interaction rules that capture normal operation is violated because of a software or hardware failure. We evaluate the effectiveness and performance of our approach in detecting software faults that we inject asynchronously, and we compare the results across different noise levels.
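The idea of modeling normal interactions and alerting on violations can be illustrated with a minimal Python sketch. It is not the method from the paper: in place of rules learned by data mining and supervised learning, it simply records every pair of consecutive interactions seen in a fault-free trace and flags any pair it has not seen before. The `Interaction` schema, the component names, and the functions `learn_rules` and `detect` are all hypothetical and exist only for this illustration.

```python
from typing import Iterable, List, NamedTuple, Set, Tuple


class Interaction(NamedTuple):
    """One observed message between two components (hypothetical schema)."""
    source: str
    target: str
    operation: str


Window = Tuple[Interaction, ...]


def learn_rules(normal_trace: Iterable[Interaction], n: int = 2) -> Set[Window]:
    """Collect every run of n consecutive interactions from a fault-free trace.

    Assumption: a plain set lookup stands in for the learned rule model.
    """
    events = list(normal_trace)
    return {tuple(events[i:i + n]) for i in range(len(events) - n + 1)}


def detect(trace: Iterable[Interaction], rules: Set[Window],
           n: int = 2) -> List[Tuple[int, Interaction]]:
    """Return (event index, interaction) for each window missing from the rules."""
    events = list(trace)
    alerts = []
    for i in range(len(events) - n + 1):
        window = tuple(events[i:i + n])
        if window not in rules:
            # Report the last interaction of the unseen window as the anomaly.
            alerts.append((i + n - 1, window[-1]))
    return alerts


# Hypothetical three-tier system: client, web server, database.
REQUEST = Interaction("client", "web", "GET")
QUERY = Interaction("web", "db", "QUERY")
RESULT = Interaction("db", "web", "RESULT")
RESPONSE = Interaction("web", "client", "RESPONSE")
ERROR = Interaction("web", "client", "ERROR")

normal = [REQUEST, QUERY, RESULT, RESPONSE] * 3
rules = learn_rules(normal)

# Injected fault: the web server answers with an error instead of waiting for the database.
faulty = [REQUEST, QUERY, ERROR]

for index, event in detect(faulty, rules):
    print(f"alert at event {index}: {event.source} -> {event.target} {event.operation}")
```

The `source` field of a flagged interaction gives only a crude hint about which component misbehaved; identifying the fault source reliably, and coping with noisy traces, would need a richer model than this lookup.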