{"title":"Why is it so hard to predict software system trustworthiness from software component trustworthiness?","authors":"J. Voas","doi":"10.1109/RELDIS.2001.969773","DOIUrl":"https://doi.org/10.1109/RELDIS.2001.969773","url":null,"abstract":"When software is built from components, nonfunctional properties such as security, reliability, fault-tolerance, performance, availability, safety, etc. are not necessarily composed. The problem stems from our inability to know a priori, for example, that the security of a system composed of two components can be determined from knowledge about the security of each. This is because the security of the composite is based on more than just the security of the individual components. There are numerous reasons for this. The article considers only the factors of component performance and calendar time. It is concluded that no properties are easy to compose and some are much harder than others.","PeriodicalId":440881,"journal":{"name":"Proceedings 20th IEEE Symposium on Reliable Distributed Systems","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121896635","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reducing noise in gossip-based reliable broadcast","authors":"P. Kuznetsov, R. Guerraoui, S. Handurukande, Anne-Marie Kermarrec","doi":"10.1109/RELDIS.2001.969775","DOIUrl":"https://doi.org/10.1109/RELDIS.2001.969775","url":null,"abstract":"We present in this paper a general garbage collection scheme that reduces the \"noise\" in gossip-based broadcast algorithms. In short, our garbage collection scheme uses a simple heuristic to trade \"useless\" messages with \"useful\" ones. Used with a given gossip-based broadcast algorithm, a given size of buffers, and a given number of disseminated messages (e.g., per gossip round), our garbage collection scheme provides higher overall reliability than more conventional schemes. We illustrate our approach through two algorithms: bimodal multicast (pbcast) and lightweight probabilistic broadcast (lpbcast). Our scheme is based on the intuitive idea of discarding messages according to their \"age\". The \"age\" of a message represents the number of times the message has been retransmitted.","PeriodicalId":440881,"journal":{"name":"Proceedings 20th IEEE Symposium on Reliable Distributed Systems","volume":"144 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131659641","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An analytical framework for reasoning about intrusions","authors":"S. Upadhyaya, R. Chinchani, K. Kwiat","doi":"10.1109/RELDIS.2001.969760","DOIUrl":"https://doi.org/10.1109/RELDIS.2001.969760","url":null,"abstract":"Local and wide area network information assurance analysts need current and precise knowledge about their system activities in order to address the challenges of critical infrastructure protection. In particular, the analyst needs to know in real-time that an intrusion has occurred so that an active response and recovery thread can be created rapidly. Existing intrusion detection solutions are basically after-the-fact, thereby offering very little in terms of damage confinement and restoration of service. Quick recovery is only possible if the assessment scheme has low latency and it occurs in real-time. The objective of the paper is to develop a reasoning framework to aid in the real-time detection and assessment task that is based on a novel idea of encapsulation of owner's intent. The theoretical framework developed here will help resolve dubious circumstances that may arise while inferring the premises of operations (encapsulated from owner's intent) by way of examining the observed conclusions resulting from the actual operations of the owner. This reasoning is significant in view of the fact that intrusion signaling is not a binary decision unlike error detection in traditional fault tolerance. Our reasoning framework has been developed by leveraging the concepts of cost analysis and pricing under uncertainty found in economics and finance. Our main result is the modeling of user activity on a computing system as a martingale and the subsequent quantification of the cost of performing a job to enable decision making.","PeriodicalId":440881,"journal":{"name":"Proceedings 20th IEEE Symposium on Reliable Distributed Systems","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127599301","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reliable real-time cooperation of mobile autonomous systems","authors":"S. Schemmer, E. Nett, M. Mock","doi":"10.1109/RELDIS.2001.970774","DOIUrl":"https://doi.org/10.1109/RELDIS.2001.970774","url":null,"abstract":"Autonomous systems are expected to provide increasingly complex and safety-critical services that will, sooner or later, require the cooperation of several autonomous systems for their fulfillment. In particular, coordinating the access to shared physical and information technological resources will become a general problem. Scheduling these resources is subject to strong real-time and reliability requirements. In this paper, we present an architecture that allows autonomous mobile systems to schedule shared resources in real-time using their own wireless distributed infrastructure. In our architecture, there is a clear separation between the application-specific scheduling part that is modeled as a function of the global state and the communication part that is used to provide the global state. By isolating the more error-prone communication part within a communication hardcore, the reliability of the overall system is increased and the locally executed scheduling function can be designed with primary focus on the application-specific real-time requirements.","PeriodicalId":440881,"journal":{"name":"Proceedings 20th IEEE Symposium on Reliable Distributed Systems","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125110258","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Incorporation of security and fault tolerance mechanisms into real-time component-based distributed computing systems","authors":"K. Kim","doi":"10.1109/RELDIS.2001.969752","DOIUrl":"https://doi.org/10.1109/RELDIS.2001.969752","url":null,"abstract":"The volume and size of real-time (RT) distributed computing (DC) applications are now growing faster than in the last century. The mixture of application tasks running on such systems is growing as well as the shared use of computing and communication resources for multiple applications including RT and non-RT applications. The increase in use of shared resources accompanies with it the need for effective security enforcement. More specifically, the needs are to prevent unauthorized users: (1) from accessing protected information; and (2) from disturbing bona-fide users in getting services from server components. Such disturbances are also called denial-of-service attacks.","PeriodicalId":440881,"journal":{"name":"Proceedings 20th IEEE Symposium on Reliable Distributed Systems","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131328755","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Primary-backup replication: from a time-free protocol to a time-based implementation","authors":"R. Oliveira, J. Pereira, A. Schiper","doi":"10.1109/RELDIS.2001.969730","DOIUrl":"https://doi.org/10.1109/RELDIS.2001.969730","url":null,"abstract":"Fault-tolerant control systems can be built by replicating critical components. However replication raises the issue of inconsistency. Multiple protocols for ensuring consistency have been described in the literature. PADRE (Protocol for Asymmetric Duplex REdundancy) is such a protocol, and an interesting case study of a complex and sensitive problem: the management of replicated traffic controllers in a railway system. However, the low level at which the protocol has been developed embodies system details, namely timeliness assumptions, that make it difficult to understand and may narrow its applicability. We argue that, when designing a protocol, it is preferable to consider first a general solution that does not include any timeliness assumptions; then, by taking into account an additional hypothesis, one can easily design a time-based solution tailored to a specific environment. This paper illustrates the benefit of a top-down protocol design approach and shows that PADRE can be seen as an instance of a standard primary-backup replication protocol based on view-synchronous communication (VSC).","PeriodicalId":440881,"journal":{"name":"Proceedings 20th IEEE Symposium on Reliable Distributed Systems","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128989501","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quantifying rollback propagation in distributed checkpointing","authors":"A. Agbaria, H. Attiya, R. Friedman, R. Vitenberg","doi":"10.1109/RELDIS.2001.969737","DOIUrl":"https://doi.org/10.1109/RELDIS.2001.969737","url":null,"abstract":"Proposes a new classification of executions with checkpoints that is based on the notion of k-rollback, indicating the maximal number of checkpoints that may need to be rolled back during recovery. The relation between known execution classes is explored, and it is shown that coordinated checkpointing, SZPF (strictly Z-path free) and ZPF (Z-path free) are 1-rollback mechanisms, while ZCF (Z-cycle free) is (n-1)-rollback, where n is the number of participants in an execution. A new class of executions, called d-BC (d-bounded cycles), is introduced, and is shown to be an [(n-1)·d]-rollback mechanism (ZCF is a special case of d-BC for d=1). Finally, a d-BC protocol is presented. This protocol has the nice property that it does not impose any control information overhead on an application's messages, yet it only sends a few control messages of its own. Moreover, the protocol maintains information about recovery lines, which enables very efficient discovery of the most recent recovery line that existed a short time before the failure.","PeriodicalId":440881,"journal":{"name":"Proceedings 20th IEEE Symposium on Reliable Distributed Systems","volume":"37 11","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114042990","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient TDMA synchronization for distributed embedded systems","authors":"Vilgot Claesson, Henrik Lönn, N. Suri","doi":"10.1109/RELDIS.2001.970769","DOIUrl":"https://doi.org/10.1109/RELDIS.2001.970769","url":null,"abstract":"A desired attribute in safety critical embedded real-time systems is a system time/event synchronization capability on which predictable communication can be established. Focusing on bus-based communication protocols in TDMA environments, we present a novel, efficient, and low-cost synchronization approach with bounded start-up time. This approach utilizes information about each node's unique message lengths to achieve synchronization. The protocol avoids start-up collisions by postponing retries after a collision. We also present a re-synchronization strategy that incorporates recovering nodes into synchronization.","PeriodicalId":440881,"journal":{"name":"Proceedings 20th IEEE Symposium on Reliable Distributed Systems","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114471625","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Can reliability and security be joined reliably and securely?","authors":"K. Kwiat","doi":"10.1109/RELDIS.2001.969750","DOIUrl":"https://doi.org/10.1109/RELDIS.2001.969750","url":null,"abstract":"The combined topics of reliability and security are briefly traced in relation to the past and present endeavors of the Air Force Research Laboratory's Information Directorate. It is concluded that in the realm of information assurance, system features created to tolerate benign failures and to respond to attack must be stressed and tested beforehand and their effectiveness predicted, otherwise they might inadvertently magnify the attacker's power. With the explosive growth of distributed and mobile systems and the need for information assurance to address the accompanying vulnerabilities, one history lesson comes to mind: although ancient Rome was not built in a day, it did not take very long for it to fall once the barbarians took hold.","PeriodicalId":440881,"journal":{"name":"Proceedings 20th IEEE Symposium on Reliable Distributed Systems","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122093987","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A consensus protocol based on a weak failure detector and a sliding round window","authors":"M. Hurfin, A. Mostéfaoui, M. Raynal, R. Macêdo","doi":"10.1109/RELDIS.2001.969766","DOIUrl":"https://doi.org/10.1109/RELDIS.2001.969766","url":null,"abstract":"The paper revisits the \"sliding window\" notion commonly encountered in communication protocols and applies it to the round numbers of round-based asynchronous protocols. This approach is novel. To illustrate its benefits, the paper presents an original weak failure detector-based consensus protocol that allows each process to be simultaneously involved in several rounds. The rounds in which a process is simultaneously involved define a \"sliding round window\". The proposed approach has several advantages. It fits better to the uncertainty created by the asynchrony and failures, and consequently permits one to design efficient round-based asynchronous protocols. Maybe more important, it also provides a better understanding of the global synchronization that manages the protocol progress from round to round. This appears clearly in the proposed failure detector-based consensus protocol, where the \"sliding round window\" allows one to dynamically define the message exchange pattern for each round separately.","PeriodicalId":440881,"journal":{"name":"Proceedings 20th IEEE Symposium on Reliable Distributed Systems","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132814616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}