{"title":"Adaptability experiments in the RAID distributed database system","authors":"B. Bhargava, A. Helal, K. Friesen, J. Riedl","doi":"10.1109/RELDIS.1990.93953","DOIUrl":"https://doi.org/10.1109/RELDIS.1990.93953","url":null,"abstract":"A series of experiments is being conducted on the RAID distributed database system to study the performance and reliability implications of providing static and dynamic adaptability. The authors' studies of the cost of their adaptable implementation were conducted in the context of the concurrency controller and the replication controller. It is shown that adaptable implementations can be provided at costs comparable to those of special-purpose implementations. The experimentation with dynamic adaptability focuses on concurrency control. It is shown that dynamic adaptability can result in performance benefits and that system reconfiguration can be accomplished dynamically with less cost than stopping the system, performing reconfiguration, and then restarting the system. The authors' examination of the costs of providing greater data availability includes studying the replication control and atomicity control subsystems of RAID. The cost associated with increasing availability in an adaptable scheme of replication control and commit protocols is demonstrated.","PeriodicalId":218085,"journal":{"name":"Proceedings Ninth Symposium on Reliable Distributed Systems","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127207681","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RelaX-an extensible architecture supporting reliable distributed applications","authors":"R. Kröger, M. Mock, R. Schumann, Frank Lange","doi":"10.1109/RELDIS.1990.93961","DOIUrl":"https://doi.org/10.1109/RELDIS.1990.93961","url":null,"abstract":"The authors provide a description of RelaX (reliable distributed applications support on UniX), a portable and extensible system software layer on top of UNIX-like operating system kernels that supports reliable distributed applications by a generalized transaction mechanism. The transaction mechanism relieves each programmer of dealing explicitly with error recovery and concurrency control in every distributed application. In order to make transactions applicable as a general programming tool, flexibility has been introduced into the traditional transaction concept. The transaction mechanism is isolated in a server (Transaction Manager) that cooperates with an extensible set of resource managers, which provide different kinds of long-term storage entities accessible by RelaX transactions. Each resource manager provides a standard interface to the transaction kernel, and, if so desired, additional resource managers can be built. In order to ease the construction of new resource managers, RelaX provides generic software components as building blocks for any kind of resource manager. The RelaX architecture is described and the design of an exemplary resource manager, the transactional object management system which provides access to persistent shared objects, is outlined.","PeriodicalId":218085,"journal":{"name":"Proceedings Ninth Symposium on Reliable Distributed Systems","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116544906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An improved algorithm for the symbolic reliability analysis of networks","authors":"M. Veeraraghavan, Kishor S. Trivedi","doi":"10.1109/RELDIS.1990.93949","DOIUrl":"https://doi.org/10.1109/RELDIS.1990.93949","url":null,"abstract":"An efficient Boolean algebraic algorithm for the symbolic reliability and sensitivity analysis of coherent two-terminal networks with s independent components is described. The algorithm is also applicable to a fault tree model without NOT gates. The algorithm uses the concept originally proposed by A. Grnarov, L. Kleinrock, and M. Gerla (1979). After the algorithm is presented, the errors in the original technique are illustrated by two examples. The algorithm is extended to compute the reliability importance of a given component (sensitivity of system reliability to a given component's reliability). A computer program implementing the modified algorithm is used to solve and obtain measured time complexities for a large set of network and fault tree models.","PeriodicalId":218085,"journal":{"name":"Proceedings Ninth Symposium on Reliable Distributed Systems","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116815047","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Voting as the optimal static pessimistic scheme for managing replicated data","authors":"M. Spasojevic, P. Berman","doi":"10.1109/RELDIS.1990.93958","DOIUrl":"https://doi.org/10.1109/RELDIS.1990.93958","url":null,"abstract":"The problem of finding an optimal static pessimistic replica control scheme is investigated. It has been widely accepted that coteries (proposed by Garcia-Molina and Barbara) provide the most general framework for such schemes. Under such an assumption, it is demonstrated that the voting scheme is an optimal static pessimistic scheme for fully connected networks with negligible link failure rates, as well as for Ethernet systems. It is also shown that voting is not optimal for somewhat more general systems. The authors propose a modification of the algorithm of Tong and Kain for the best voting in the operation-independent case so that it runs in linear (rather than exponential) time. They also propose a linear-time algorithm for computing the optimal vote assignment when relative frequencies of read and write operations are known.","PeriodicalId":218085,"journal":{"name":"Proceedings Ninth Symposium on Reliable Distributed Systems","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129328588","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Distributed lock management in a transaction processing environment","authors":"A. Hastings","doi":"10.1109/RELDIS.1990.93948","DOIUrl":"https://doi.org/10.1109/RELDIS.1990.93948","url":null,"abstract":"Distributed synchronization for data sharing is discussed, and the design of a distributed lock manager for the Camelot transaction facility is presented. The lock manager is a component of a proposed implementation of data sharing in the Camelot environment. A number of experiments that demonstrate the correct operation of the lock manager are reported and its performance is described. The performance metrics indicate that distributed lock management should not reduce the feasibility of data sharing in this environment. The similarity between the caching and synchronization strategies appropriate for locks and data suggests that protocols developed for distributed locks will be applicable to data sharing.","PeriodicalId":218085,"journal":{"name":"Proceedings Ninth Symposium on Reliable Distributed Systems","volume":"260 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116113423","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A comparison of voting strategies for fault-tolerant distributed systems","authors":"D. Blough, G. Sullivan","doi":"10.1109/RELDIS.1990.93959","DOIUrl":"https://doi.org/10.1109/RELDIS.1990.93959","url":null,"abstract":"The problem of voting is studied for both the exact and inexact cases. Optimal solutions based on explicit computation of conditional probabilities are given. The most commonly used strategies, i.e. majority, median, and plurality, are compared quantitatively. The results show that plurality voting is the most powerful of these techniques and is, in fact, optimal for a certain class of probability distributions. An efficient method of implementing a generalized plurality voter when nonfaulty processes can produce differing answers is also given.","PeriodicalId":218085,"journal":{"name":"Proceedings Ninth Symposium on Reliable Distributed Systems","volume":"23 6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116610096","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fault tolerant distributed database system via data inference","authors":"W. Chu, A. Y. Hwang, R. Lee, Qiming Chen, M. Merzbacher, H. Hecht","doi":"10.1109/RELDIS.1990.93954","DOIUrl":"https://doi.org/10.1109/RELDIS.1990.93954","url":null,"abstract":"A knowledge-based approach for query processing during network partitioning is proposed. The approach uses available domain and summary knowledge to infer inaccessible data to answer a given query. A rule induction technique is used to extract correlated knowledge between attributes from the database contents. This knowledge is represented as rules for data inference. On the basis of a set of queries, simulation is used to evaluate the effectiveness of the proposed data inference technique for improving data availability under network partitioning. Object allocation has a significant impact on data availability. Allocating objects that increase remote redundancy and reduce local redundancy increases data availability during network partitioning. A prototype distributed database system that uses the proposed inference technique with correlated knowledge from a ship database has been implemented. Experience indicates that the proposed inference technique can significantly improve the availability of a distributed database during network partitioning.","PeriodicalId":218085,"journal":{"name":"Proceedings Ninth Symposium on Reliable Distributed Systems","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121643170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The design and implementation of a reliable distributed operating system-ROSE","authors":"T. Ng","doi":"10.1109/RELDIS.1990.93946","DOIUrl":"https://doi.org/10.1109/RELDIS.1990.93946","url":null,"abstract":"ROSE, a modular distributed operating system that provides support for building reliable applications, is designed and implemented. Failure detection capabilities are provided by a failure detection server. Configuration objects can be used to capture the relationship among multiple processes that cooperate to replicate certain resources. Replicated address space (RAS) objects, whose content is accessible with a high probability despite hardware failures, can be used to increase data availability. Finally, a resistant process (RP) abstraction allows user processes to survive hardware failures with minimal interruption. Two different implementations of RP are provided: one checkpoints the information about its state in an RAS object periodically; the other uses replicated execution by executing the same code in different nodes at the same time.","PeriodicalId":218085,"journal":{"name":"Proceedings Ninth Symposium on Reliable Distributed Systems","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116637866","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A fault tolerant algorithm for distributed mutual exclusion","authors":"Ye-In Chang, M. Singhal, Ming T. Liu","doi":"10.1109/RELDIS.1990.93960","DOIUrl":"https://doi.org/10.1109/RELDIS.1990.93960","url":null,"abstract":"A fault-tolerant mutual exclusion algorithm for distributed systems is presented. The algorithm uses a distributed queue strategy and maintains alternative paths at each site to provide a high degree of fault tolerance. However, owing to these alternative paths, the algorithm must use reverse messages to avoid the occurrence of directed cycles, which may form when the direction of edges is reversed after the token passes through. If there is no alternative path, the total number of the messages exchanged is O (2*log N) in light traffic and two messages in heavy traffic; however, in this case the system cannot tolerate even a single communication link or site failure. If there are alternative paths between sites, the system can achieve a higher degree of fault tolerance at the expense of increased message traffic (owing to reverse messages). Thus, there is a tradeoff between efficiency and reliability, and a system can be designed to balance these two criteria properly. A recovery procedure for restoring a recovering site consistently into the system is also presented.","PeriodicalId":218085,"journal":{"name":"Proceedings Ninth Symposium on Reliable Distributed Systems","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121832825","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using stashing to increase node autonomy in distributed file systems","authors":"R. Alonso, Daniel Barbará, Luis L. Cova","doi":"10.1109/RELDIS.1990.93947","DOIUrl":"https://doi.org/10.1109/RELDIS.1990.93947","url":null,"abstract":"The authors present an enhancement to distributed file systems that allows the users of the system to keep local copies of important files, decreasing the dependency on file servers. Using the notions of stashing and quasi-copies, the system allows users to tune up the quality of the service they want to receive when the file server is not reachable. One of the key points of this work is the focus on the tradeoff between availability and degradation of service. The other main contribution is the design of a distributed file system which is ideally suited to very large distributed systems, in that it provides users with greater tolerance of network partitions and server failures. It is emphasized that the use of stashing does not preclude the use of other performance-enhancing or fault-tolerant techniques. The file system architecture has been implemented and FACE, a prototype of a file system service based on Sun's NFS, is described. Performance figures are reported. These figures show that the overhead of providing the service is negligible. Current plans also call for porting the FACE design to a number of other processors.","PeriodicalId":218085,"journal":{"name":"Proceedings Ninth Symposium on Reliable Distributed Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128227572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}