{"title":"Activity Monitoring to Guarantee File Availability in Structured P2P File-sharing Systems","authors":"Guowei Huang, Zhi Chen, Qi Zhao, G. Wu","doi":"10.1109/SRDS.2007.19","DOIUrl":"https://doi.org/10.1109/SRDS.2007.19","url":null,"abstract":"A cooperative structured peer-to-peer file-sharing system requires that the nodes participating in the system need to maintain the location mappings of other nodes. However, the selfish nodes have no incentive to donate their own resources to other nodes and they will refuse to take the responsibility for the maintenance of location mappings, if they can get the system's resources for free. Thus, this free-riding behavior of selfish nodes will make many files unavailable to the whole system. To address this problem, we propose a robust distributed system to monitor the activities of each node and evaluate the node's reputation, which influences the node's access to the system resources. We analyze the system and demonstrate that it is scalable and is secure under a strong attack model. Furthermore, simulation results show that the system can detect and prevent the free-riding behavior effectively and efficiently.","PeriodicalId":224921,"journal":{"name":"2007 26th IEEE International Symposium on Reliable Distributed Systems (SRDS 2007)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116640132","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Model Checking of Consensus Algorit","authors":"Tatsuhiro Tsuchiya, A. Schiper","doi":"10.1109/SRDS.2007.20","DOIUrl":"https://doi.org/10.1109/SRDS.2007.20","url":null,"abstract":"We show for the first time that standard model checking allows one to completely verify asynchronous algorithms for solving consensus, a fundamental problem in fault-tolerant distributed computing. Model checking is a powerful verification methodology based on state exploration. However it has rarely been applied to consensus algorithms, because these algorithms induce huge, often infinite state spaces. Here we focus on consensus algorithms based on the Heard-Of model, a new computation model for distributed computing. By making use of the high abstraction level provided by this computation model and by devising a finite representation of unbounded timestamps, we develop a methodology for verifying consensus algorithms in every possible state by model checking.","PeriodicalId":224921,"journal":{"name":"2007 26th IEEE International Symposium on Reliable Distributed Systems (SRDS 2007)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120952045","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Paxos Register","authors":"Harry C. Li, Allen Clement, Amitanand S. Aiyer, L. Alvisi","doi":"10.1109/SRDS.2007.32","DOIUrl":"https://doi.org/10.1109/SRDS.2007.32","url":null,"abstract":"We introduce the Paxos register to simplify and unify the presentation of Paxos-style consensus protocols. We use our register to show how Lamport's Classic Paxos and Castro and Liskov's Byzantine Paxos are the same consensus protocol, but for different failure models. We also use our register to compare and contrast Byzantine Paxos with Martin and Alvisi's fast Byzantine consensus. The Paxos register is a write-once register that exposes two important abstractions for reaching consensus: (i) read and write operations that capture how processes in Paxos protocols propose and decide values and (ii) tokens that capture how these protocols guarantee agreement despite partial failures. We encapsulate the differences of several Paxos-style protocols in the implementation details of these abstractions.","PeriodicalId":224921,"journal":{"name":"2007 26th IEEE International Symposium on Reliable Distributed Systems (SRDS 2007)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130462837","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Customizable Fault Tolerance for Wide-Area Replication","authors":"Y. Amir, B. Coan, J. Kirsch, John Lane","doi":"10.1109/SRDS.2007.40","DOIUrl":"https://doi.org/10.1109/SRDS.2007.40","url":null,"abstract":"Constructing logical machines out of collections of physical machines is a well-known technique for improving the robustness and fault tolerance of distributed systems. We present a new, scalable replication architecture, built upon logical machines specifically designed to perform well in wide-area systems spanning multiple sites. The physical machines in each site implement a logical machine by running a local state machine replication protocol, and a wide-area replication protocol runs among the logical machines. Implementing logical machines via the state machine approach affords free substitution of the fault tolerance method used in each site and in the wide-area replication protocol, allowing one to balance performance and fault tolerance based on perceived risk. We present a new Byzantine fault-tolerant protocol that establishes a reliable virtual communication link between logical machines. Our communication protocol is efficient (a necessity in wide-area environments), avoiding the need for redundant message sending during normal-case operation and allowing a logical machine to consume approximately the same wide-area bandwidth as a single physical machine. This dramatically improves the wide-area performance of our system compared to existing logical machine based approaches. We implemented a prototype system and compare its performance and fault tolerance to existing solutions.","PeriodicalId":224921,"journal":{"name":"2007 26th IEEE International Symposium on Reliable Distributed Systems (SRDS 2007)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131713335","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Epidemic Broadcast Trees","authors":"J. Leitao, J. Pereira, L. Rodrigues","doi":"10.1109/SRDS.2007.27","DOIUrl":"https://doi.org/10.1109/SRDS.2007.27","url":null,"abstract":"There is an inherent trade-off between epidemic and deterministic tree-based broadcast primitives. Tree-based approaches have a small message complexity in steady-state but are very fragile in the presence of faults. Gossip, or epidemic, protocols have a higher message complexity but also offer much higher resilience. This paper proposes an integrated broadcast scheme that combines both approaches. We use a low cost scheme to build and maintain broadcast trees embedded on a gossip-based overlay. The protocol sends the message payload preferably via tree branches but uses the remaining links of the gossip overlay for fast recovery and expedite tree healing. Experimental evaluation presented in the paper shows that our new strategy has a low overhead and that is able to support large number of faults while maintaining a high reliability.","PeriodicalId":224921,"journal":{"name":"2007 26th IEEE International Symposium on Reliable Distributed Systems (SRDS 2007)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130804134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluating Byzantine Quorum Systems","authors":"W. Dantas, A. Bessani, J. Fraga, M. Correia","doi":"10.1109/SRDS.2007.34","DOIUrl":"https://doi.org/10.1109/SRDS.2007.34","url":null,"abstract":"Replication is a mechanism extensively used to guarantee the availability and good performance of data storage services. Byzantine Quorum Systems (BQS) have been proposed as a solution to guarantee the consistency of that kind of services, even if some of the replicas fail arbitrarily. Many BQS have been proposed recently, but comparing their performance is not simple. In fact, it has been shown that theoretical metrics like the number of steps or communication rounds say as much about the practical performance of distributed algorithms as they hide. This paper presents a comparative evaluation of several BQS algorithms in the literature. The evaluation is based both on experiments and simulations. For that purpose, a framework for evaluating BQS called BQSNeko was developed. The results of the evaluation allow a better understanding of the algorithms and the tradeoffs involved.","PeriodicalId":224921,"journal":{"name":"2007 26th IEEE International Symposium on Reliable Distributed Systems (SRDS 2007)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130210160","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Fail-Heterogeneous Architectural Model","authors":"M. Serafini, N. Suri","doi":"10.1109/SRDS.2007.33","DOIUrl":"https://doi.org/10.1109/SRDS.2007.33","url":null,"abstract":"Fault tolerant distributed protocols typically utilize a homogeneous fault model, either fail-crash or fail-Byzantine, where all processors are assumed to fail in the same manner. In practice, due to complexity and evolvability reasons, only a subset of the nodes can actually be designed to have a restricted, fail-crash failure mode, provided that they are free of design faults. Based on this consideration, we propose a fail-heterogeneous architectural model for distributed systems which considers two classes of nodes: (a) full-fledged execution nodes, which can be fail-Byzantine, and (b) lightweight, validated coordination nodes, which can only be fail-crash. To illustrate the model we introduce HeterTrust as a practical trustworthy service replication protocol. It has a low latency overhead, requires few execution nodes with diversified design, and prevents intruded servers from disclosing confidential data. We also discuss applications of the model to DoS attacks mitigation and to group membership.","PeriodicalId":224921,"journal":{"name":"2007 26th IEEE International Symposium on Reliable Distributed Systems (SRDS 2007)","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122750305","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Eventual Clusterer Oracle and Its Application to Consensus in MANETs","authors":"Weigang Wu, Jiannong Cao, M. Raynal","doi":"10.1109/srds.2007.24","DOIUrl":"https://doi.org/10.1109/srds.2007.24","url":null,"abstract":"This paper studies the design of hierarchical consensus protocols for mobile ad hoc networks. A two-layer hierarchy is imposed on the mobile hosts by grouping them into clusters, each with a clusterhead. The messages from and to the hosts in the same cluster are merged/unmerged by the clusterhead so as to reduce the message cost and improve the scalability. We adopt a modular method in the design, separating clustering from achieving consensus using the clusters. The clustering function, named eventual clusterer (denoted as ◇C), is designed to construct a cluster-based hierarchy over the mobile hosts in the network. Since ◇C provides the fault tolerant clustering function transparently, it can be used as a new oracle (i.e. an abstract tool to provide some kind of information about the state of the system) for the design of hierarchical consensus protocols. Based on ◇C, we design a new consensus protocol, which can significantly reduce the message cost of achieving consensus. We also propose an implementation of the ◇C oracle based on the failure detector ◇S.","PeriodicalId":224921,"journal":{"name":"2007 26th IEEE International Symposium on Reliable Distributed Systems (SRDS 2007)","volume":"99 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134642642","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Analytical Framework and Its Applications for Studying Brick Storage Reliability","authors":"Ming Chen, Wei Chen, Likun Liu, Zheng Zhang","doi":"10.1109/SRDS.2007.21","DOIUrl":"https://doi.org/10.1109/SRDS.2007.21","url":null,"abstract":"The reliability of a large-scale storage system is influenced by a complex set of inter-dependent factors. This paper presents a comprehensive and extensible analytical framework that offers quantitative answers to many design tradeoffs. We apply the framework to a number of important design strategies that a designer and/or administrator must face in reality, including topology-aware replica placement, proactive replication that uses small background network bandwidth and unused disk space to create additional copies. We also quantify the impact of slow (but potentially more accurate) failure detection and lazy replacement of failed disks. We use detailed simulation to verify and refine our analytical model. These results demonstrate the versatility of the framework and serve as a solid step towards more quantitative studies of fundamental system tradeoffs between reliability, performance, and cost in large-scale distributed storage systems.","PeriodicalId":224921,"journal":{"name":"2007 26th IEEE International Symposium on Reliable Distributed Systems (SRDS 2007)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128740657","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Distributed Software-based Attestation for Node Compromise Detection in Sensor Networks","authors":"Yi Yang, Xinran Wang, Sencun Zhu, G. Cao","doi":"10.1109/SRDS.2007.31","DOIUrl":"https://doi.org/10.1109/SRDS.2007.31","url":null,"abstract":"Sensors that operate in an unattended, harsh or hostile environment are vulnerable to compromises because their low costs preclude the use of expensive tamper-resistant hardware. Thus, an adversary may reprogram them with malicious code to launch various insider attacks. Based on verifying the genuineness of the running program, we propose two distributed software-based attestation schemes that are well tailored for sensor networks. These schemes are based on a pseudorandom noise generation mechanism and a lightweight block-based pseudorandom memory traversal algorithm. Each node is loaded with pseudorandom noise in its empty program memory before deployment, and later on multiple neighbors of a suspicious node collaborate to verify the integrity of the code running on this node in a distributed manner. Our analysis and simulation show that these schemes achieve high detection rate even when multiple compromised neighbors collude in an attestation process.","PeriodicalId":224921,"journal":{"name":"2007 26th IEEE International Symposium on Reliable Distributed Systems (SRDS 2007)","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121021378","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}