Stateful Detection in High Throughput Distributed Systems
G. Khanna, I. Laguna, F. Arshad, S. Bagchi
2007 26th IEEE International Symposium on Reliable Distributed Systems (SRDS 2007), 2007-10-10
DOI: https://doi.org/10.1109/SRDS.2007.15
Citations: 9
Abstract
With the increasing speed of computers and the growing complexity of applications, many of today's distributed systems exchange data at a high rate. Significant work has been done on error detection through external fault tolerance systems. However, a high data rate coupled with complex detection can exhaust the capacity of the fault tolerance system, resulting in low detection accuracy. We present a new stateful detection mechanism that observes the exchanged application messages, deduces the application state, and matches it against anomaly-based rules. We extend our previous framework (the monitor) to incorporate a sampling approach that adjusts the rate at which messages are verified. The sampling approach avoids the previously reported breakdown in monitor capacity at high application message rates, reduces the overall detection cost, and allows the monitor to provide accurate detection. We apply the approach to a reliable multicast protocol (TRAM) and demonstrate its performance by comparing it with our previous framework.
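The idea of sampling messages so that a stateful detector stays within its capacity can be illustrated with a toy sketch. This is not the authors' monitor: the class `SamplingMonitor`, the function `sampling_fraction`, the credit-based sampling scheme, and the single rule used here (sequence numbers from one sender never decrease) are all hypothetical, chosen only to show how a verified-message rate might be capped while per-sender state is still tracked.

```python
from dataclasses import dataclass, field


def sampling_fraction(incoming_rate: float, capacity: float) -> float:
    """Fraction of messages to verify so the verified rate stays <= capacity."""
    if incoming_rate <= 0:
        return 1.0
    return min(1.0, capacity / incoming_rate)


@dataclass
class SamplingMonitor:
    """Toy stateful detector (hypothetical, not the paper's implementation)."""
    capacity: float                                # msgs/s the detector can verify (assumed known)
    last_seq: dict = field(default_factory=dict)   # deduced state: highest seq seen per sender
    alerts: list = field(default_factory=list)     # (sender, previous_seq, offending_seq)
    verified: int = 0
    _credit: float = 0.0

    def observe(self, sender: str, seq: int, incoming_rate: float) -> None:
        # Deterministic sampling: accumulate credit and verify a message
        # only when a full unit of credit is available.
        self._credit += sampling_fraction(incoming_rate, self.capacity)
        if self._credit < 1.0:
            return  # message skipped; state is not updated
        self._credit -= 1.0
        self.verified += 1

        prev = self.last_seq.get(sender)
        # Hypothetical rule: a sender's sequence numbers never go backwards.
        # The rule tolerates skipped messages, so sampling cannot trigger it.
        if prev is not None and seq < prev:
            self.alerts.append((sender, prev, seq))
        self.last_seq[sender] = seq if prev is None else max(seq, prev)


if __name__ == "__main__":
    monitor = SamplingMonitor(capacity=500.0)
    # 10 in-order messages arriving at twice the assumed capacity.
    for s in range(10):
        monitor.observe("s1", s, incoming_rate=1000.0)
    # Two stale messages; only one of them is sampled.
    monitor.observe("s1", 3, incoming_rate=1000.0)
    monitor.observe("s1", 3, incoming_rate=1000.0)
    print(monitor.verified, monitor.alerts)
```

The rule in this sketch was picked because it remains valid when messages are skipped; rules that depend on seeing every message would need a different treatment under sampling.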