{"title":"Scalable Topology Discovery and Link State Detection Using Routing Events","authors":"M. Srivatsa, B. J. Ko, A. Beygelzimer, V. Madduri","doi":"10.1109/SRDS.2008.17","DOIUrl":"https://doi.org/10.1109/SRDS.2008.17","url":null,"abstract":"Discovering the topology of a network and detecting link state changes (e.g., link failures) are essential to various network management and monitoring tasks. In this paper, we investigate scalable mechanisms to monitor the topology and link states of networks based on information available in network nodes' routing tables. We first present an algorithm that infers the network topology from full or partial information about network distances between nodes; building on it, we obtain a scalable network topology discovery solution via a novel use of random walks in graphs. We then present scalable algorithms to detect the state changes of remote links by monitoring the routing tables of a small fraction of the routers, where the routers to be monitored are selected by a greedy approach to an NP-complete Tree Cover problem. We show the efficacy and scalability of our topology monitoring algorithms through experimental evaluation performed both on synthetic topologies and on a large topology data-set from a real enterprise network.","PeriodicalId":397103,"journal":{"name":"2008 Symposium on Reliable Distributed Systems","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115221423","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Incremental File System Consistency Checker for Block-Level CDP Systems","authors":"Maohua Lu, T. Chiueh, Shibiao Lin","doi":"10.1109/SRDS.2008.20","DOIUrl":"https://doi.org/10.1109/SRDS.2008.20","url":null,"abstract":"A block-level continuous data protection (CDP) system logs every disk block update from an application server (e.g., a file or DBMS server) to a storage system so that any disk updates within a time window are undoable, and thus is able to provide a more flexible and efficient data protection service than conventional periodic data backup systems. Unfortunately, no existing block-level CDP system can support arbitrary point-in-time snapshots that are guaranteed to be consistent with respect to the metadata of the application server. This deficiency seriously limits the flexibility in recovery point objective (RPO) of block-level CDP systems from the standpoint of the application servers whose data they protect. This paper describes an incremental file system check mechanism (iFSCK) that is designed to address this deficiency for file servers, and exploits file system-specific knowledge to quickly fix an arbitrary point-in-time block-level snapshot so that it is consistent with respect to file system metadata. Performance measurements taken from a fully operational iFSCK prototype show that iFSCK can make a 10 GB point-in-time block-level snapshot file-system consistent in less than 1 second, and takes less than 25% of the time required by the Fsck utility for vanilla ext3 under relaxed metadata consistency requirements.","PeriodicalId":397103,"journal":{"name":"2008 Symposium on Reliable Distributed Systems","volume":"77 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116733641","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Empirical Study of Denial of Service Mitigation Techniques","authors":"Gal Badishi, A. Herzberg, I. Keidar, Oleg Romanov, Avital Yachin","doi":"10.1109/SRDS.2008.27","DOIUrl":"https://doi.org/10.1109/SRDS.2008.27","url":null,"abstract":"We present an empirical study of the resistance of several protocols to denial of service (DoS) attacks on client-server communication. We show that protocols that use authentication alone, e.g., IPSec, provide protection to some extent, but are still susceptible to DoS attacks, even when the network is not congested. In contrast, a protocol that uses a changing filtering identifier (FI) is usually immune to DoS attacks, as long as the network itself is not congested. This approach is called FI hopping. We build and experiment with two prototype implementations of FI hopping. One implementation is a modification of IPSec in a Linux kernel, and a second implementation comes as an NDIS hook driver on a Windows machine. We present results of experiments in which client-server communication is subject to a DoS-attack. Our measurements illustrate that FI hopping withstands severe DoS attacks without hampering the client-server communication. Moreover, our implementations show that FI hopping is simple, practical, and easy to deploy.","PeriodicalId":397103,"journal":{"name":"2008 Symposium on Reliable Distributed Systems","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134574605","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Probabilistic Failure Detection for Efficient Distributed Storage Maintenance","authors":"Jing Tian, Zhi Yang, Wei Chen, Ben Y. Zhao, Yafei Dai","doi":"10.1109/SRDS.2008.28","DOIUrl":"https://doi.org/10.1109/SRDS.2008.28","url":null,"abstract":"Distributed storage systems often use data replication to mask failures and guarantee high data availability. Node failures can be transient or permanent. While the system must generate new replicas to replace replicas lost to permanent failures, it can save significant replication costs by not replicating following transient faults. Given the unpredictability of network dynamics, however, distinguishing permanent from transient failures is extremely difficult. Traditional timeout approaches are difficult to tune and can introduce unnecessary replication. In this paper, we propose Protector, an algorithm that addresses this problem using network-wide statistical prediction. Our algorithm drastically improves prediction accuracy by making predictions across aggregate replica groups instead of single nodes. These estimates of the number of \"live replicas\" can guide efficient data replication policies. We prove that given data on node down times and the probability of permanent failures, the estimate given by our algorithm is more accurate than all alternatives. We describe two ways to obtain the failure probability function, driven by either models or traces. We conduct extensive simulations based both on synthetic and real traces, and show that Protector closely approximates the performance of a perfect \"oracle\" failure detector, while significantly outperforming timeout-based detectors using a wide range of parameters.","PeriodicalId":397103,"journal":{"name":"2008 Symposium on Reliable Distributed Systems","volume":"124 7","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114087271","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Protecting BitTorrent: Design and Evaluation of Effective Countermeasures against DoS Attacks","authors":"M. Barcellos, Daniel Bauermann, Henrique Sant'anna, Matheus B. Lehmann, R. Mansilha","doi":"10.1109/SRDS.2008.26","DOIUrl":"https://doi.org/10.1109/SRDS.2008.26","url":null,"abstract":"BitTorrent is a P2P file-sharing protocol that can be used to efficiently distribute files such as software updates and digital content to very large numbers of users. In a previous paper, we have shown that vulnerabilities can be exploited to launch Denial-of-Service attacks against BitTorrent swarms, which can substantially increase download times and network traffic. In this paper, we review the three most damaging attacks, and propose two algorithms as countermeasures to effectively tackle them. We implemented the attacks and countermeasures in a packet-level BitTorrent simulator. The results indicate that our proposed approach is effective when there is an ongoing attack while at the same time efficient when the countermeasure is active but there is no attack. To the best of our knowledge, this is the first proposal in the literature to make BitTorrent more robust against Denial-of-Service (DoS) attacks.","PeriodicalId":397103,"journal":{"name":"2008 Symposium on Reliable Distributed Systems","volume":"233 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132315119","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive Internal Clock Synchronization","authors":"Zbigniew Jerzak, Robert Fach, C. Fetzer","doi":"10.1109/SRDS.2008.32","DOIUrl":"https://doi.org/10.1109/SRDS.2008.32","url":null,"abstract":"Existing clock synchronization algorithms assume a bounded clock reading error. This, in turn, results in an inflexible design that typically requires node crashes whenever the given bound might be violated. We propose a novel, adaptive internal clock synchronization algorithm that computes the deviation between the clocks at runtime. The computed deviation can be propagated to the application layer to allow it to adapt its behavior according to the current clock deviation. The contributions of this paper are: (1) a new specification of a relaxed clock synchronization problem, and (2) a new clock synchronization algorithm with a novel approach to dealing with crash failures.","PeriodicalId":397103,"journal":{"name":"2008 Symposium on Reliable Distributed Systems","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130427361","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Self-Stabilization in Tree-Structured Peer-to-Peer Service Discovery Systems","authors":"E. Caron, A. Datta, F. Petit, Cédric Tedeschi","doi":"10.1109/SRDS.2008.18","DOIUrl":"https://doi.org/10.1109/SRDS.2008.18","url":null,"abstract":"The efficiency of service discovery is critical in the development of fully decentralized middleware intended to manage large scale computational grids. This demand influenced the design of many peer-to-peer based approaches. The ability to cope with the expressiveness of the service discovery was behind the design of a new kind of overlay structure based on tries, or prefix trees. Although these overlays are well designed, one of their weaknesses is the lack of any concrete fault-tolerance mechanism, especially in dynamic platforms; the faults are handled by using preventive and costly mechanisms, e.g., using a high degree of replication. Moreover, those systems cannot handle any arbitrary transient failure. Self-stabilization, which is an efficient approach to design reliable solutions for dynamic systems, was recently suggested to be a good alternative to inject fault tolerance into peer-to-peer systems. However, most of the previous research on self-stabilization in tree and/or P2P networks was designed in theoretical models, making these approaches hard to implement in practice. In this paper, we provide a self-stabilizing message passing protocol to maintain prefix trees over practical peer-to-peer networks. A complete correctness proof is provided, as well as simulation results to estimate the practical impact of our protocol.","PeriodicalId":397103,"journal":{"name":"2008 Symposium on Reliable Distributed Systems","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130996217","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Assuring Resilient Time Synchronization","authors":"A. Bondavalli, A. Ceccarelli, Lorenzo Falai","doi":"10.1109/SRDS.2008.12","DOIUrl":"https://doi.org/10.1109/SRDS.2008.12","url":null,"abstract":"In many distributed and pervasive systems the clocks of nodes are required to be synchronized to a unique global time. Due to unpredictable system and environment characteristics, the distance of a local clock from global time is a variable factor that is very hard to predict. Systems usually adopt measures to guarantee an upper bound on such distance from global time that are very often quite far from typical execution scenarios and thus are of little practical use. As a consequence, while in many circumstances reliable information on the actual distance from global time would improve system behaviour, unfortunately such information is usually not available. In this paper we propose the Reliable and Self-Aware Clock (R&SAClock), a low-intrusive software service that is able to compute a conservative estimation of distance from an external global time. R&SAClock acts as a new clock that couples information gained from synchronization mechanisms with information collected from the local clock to provide both current time and a self-adaptive reliable estimation of distance from global time. This paper describes the R&SAClock as a system component: we define its main functions, services and time-related mechanisms. Finally, details of an implementation of the R&SAClock for the NTP synchronization mechanism and Linux OS are shown.","PeriodicalId":397103,"journal":{"name":"2008 Symposium on Reliable Distributed Systems","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133119154","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ALPS: Authenticating Live Peer-to-Peer Live Streams","authors":"R. Meier, Roger Wattenhofer","doi":"10.1109/SRDS.2008.33","DOIUrl":"https://doi.org/10.1109/SRDS.2008.33","url":null,"abstract":"Live streaming is one of many applications where data is continuously created, and has to be quickly distributed among a large number of users. The peer-to-peer paradigm is thereby attracting interest with the prospect of overcoming scalability issues of more centralized approaches. Since data blocks travel along multiple (possibly malicious) peers, authenticating the origin of blocks becomes of prime importance to guarantee safety and reliability. The asymmetry of a single source and an arbitrary number of untrusted receivers requires the use of digital signatures and public key cryptography in general. This paper proposes a new signature scheme for broadcast authentication tailored towards peer-to-peer systems to overcome limitations of traditional approaches based on signature schemes like RSA and DSA, most notably in terms of delays, signature size, and computational complexity. It may further be of practical interest for other real-time applications such as massive multiplayer peer-to-peer gaming.","PeriodicalId":397103,"journal":{"name":"2008 Symposium on Reliable Distributed Systems","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115678788","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"POSH: Proactive co-Operative Self-Healing in Unattended Wireless Sensor Networks","authors":"R. D. Pietro, Di Ma, Claudio Soriente, G. Tsudik","doi":"10.1109/SRDS.2008.23","DOIUrl":"https://doi.org/10.1109/SRDS.2008.23","url":null,"abstract":"Unattended Wireless Sensor Networks (UWSNs) are composed of many small resource-constrained devices and operate autonomously, gathering data which is periodically collected by a visiting sink. Unattended mode of operation, deployment in hostile environments and value (or criticality) of collected data are some of the factors that complicate UWSN security. This paper makes two contributions. First, it explores a new threat model involving a mobile adversary who periodically compromises and releases sensors aiming to maximize its advantage and overall knowledge of collected data. Second, it constructs a self-healing protocol that allows sensors to continuously and collectively recover from compromise. The proposed protocol is both effective and efficient, as supported by analytical and simulation results.","PeriodicalId":397103,"journal":{"name":"2008 Symposium on Reliable Distributed Systems","volume":"106 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134488611","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}