Stateful Detection in High Throughput Distributed Systems

2007 26th IEEE International Symposium on Reliable Distributed Systems (SRDS 2007). Publication date: 2007-10-10. DOI: 10.1109/SRDS.2007.15

G. Khanna, I. Laguna, F. Arshad, S. Bagchi

Citations: 9

Abstract

With the increasing speed of computers and the growing complexity of applications, many of today's distributed systems exchange data at a high rate. Significant work has been done on error detection through external fault tolerance systems. However, a high data rate coupled with complex detection can exhaust the capacity of the fault tolerance system, resulting in low detection accuracy. We present a new stateful detection mechanism that observes the exchanged application messages, deduces the application state, and matches it against anomaly-based rules. We extend our previous framework (the monitor) to incorporate a sampling approach that adjusts the rate of verified messages. The sampling approach avoids the previously reported breakdown in monitor capacity at high application message rates, reduces the overall detection cost, and allows the monitor to provide accurate detection. We apply the approach to a reliable multicast protocol (TRAM) and demonstrate its performance by comparing it with our previous framework.
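The abstract does not give implementation details, so the following is only a minimal sketch of the general idea, not the authors' monitor. It assumes a hypothetical rule ("a sender may have at most `max_unacked` DATA messages outstanding before an ACK"), a hypothetical `capacity` parameter for how many messages per second the monitor can verify, and a simple deterministic credit scheme for choosing which messages to verify. All names here (`SamplingMonitor`, `observe`, `sampling_fraction`) are illustrative.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class SamplingMonitor:
    # Assumed: messages per second the monitor can verify without overload.
    capacity: float
    # Hypothetical rule threshold: max DATA messages seen without an ACK.
    max_unacked: int = 3
    # Deduced application state: outstanding DATA count per sender.
    state: Dict[str, int] = field(default_factory=dict)
    # Number of messages actually verified (not skipped by sampling).
    verified: int = 0
    # Fractional credit used to pick every k-th message deterministically.
    _credit: float = 0.0

    def sampling_fraction(self, incoming_rate: float) -> float:
        """Fraction of messages to verify at the given incoming rate."""
        if incoming_rate <= 0:
            return 1.0
        return min(1.0, self.capacity / incoming_rate)

    def observe(self, sender: str, msg_type: str,
                incoming_rate: float) -> Optional[str]:
        """Process one observed message; return an alarm text or None."""
        self._credit += self.sampling_fraction(incoming_rate)
        if self._credit < 1.0:
            return None  # message is skipped to stay within capacity
        self._credit -= 1.0
        self.verified += 1

        # Deduce the new state from the verified message.
        unacked = self.state.get(sender, 0)
        if msg_type == "DATA":
            unacked += 1
        elif msg_type == "ACK":
            unacked = 0
        self.state[sender] = unacked

        # Match the deduced state against the (hypothetical) rule.
        if unacked > self.max_unacked:
            return f"rule violated: {sender} has {unacked} unacked DATA"
        return None


if __name__ == "__main__":
    monitor = SamplingMonitor(capacity=100.0)
    for _ in range(4):
        alarm = monitor.observe("node-A", "DATA", incoming_rate=50.0)
    print(alarm)
```

When the incoming rate is below `capacity`, every message is verified; above it, only a proportional share is. Note that skipping messages makes the deduced state approximate in this sketch, and a real detector would need to account for that uncertainty; how the paper does so is not stated in the abstract.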
