Title: An integrated approach to fault tolerance
Authors: E. Elnozahy, W. Zwaenepoel
Venue: [1992 Proceedings] Second Workshop on the Management of Replicated Data
Published: 1992-11-12
DOI: 10.1109/MRD.1992.242611 (https://doi.org/10.1109/MRD.1992.242611)
Citations: 3
Abstract
This paper describes Manetho, an experimental protocol system whose goal is to explore the extent to which transparent fault tolerance can be added to long-running distributed applications. Transparent techniques are attractive because they can automatically add fault tolerance to existing applications that were written without consideration for reliability. Previous techniques for providing transparent fault tolerance relied on rollback-recovery. However, rollback-recovery is not appropriate for server processes, where the lack of service during rollback is intolerable. Furthermore, rollback-recovery assumes that a process can be restarted on any available host. That assumption does not hold for file servers, for example, which have to run on the host where the disks reside and therefore cannot tolerate the extended downtime that would result. Manetho solves these problems with an integrated approach: it uses process replication for server processes and rollback-recovery for client processes.
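
To make the contrast between the two techniques concrete, the following is a minimal toy sketch in Python. It is not Manetho's protocol: the class names (`RollbackClient`, `Replica`, `ReplicatedServer`), the in-memory checkpoint and message log standing in for stable storage, and the write-to-all replication scheme are all assumptions made for illustration. It only shows that a client can recover by restoring a checkpoint and replaying logged messages, while a replicated server keeps answering when one replica fails.

```python
import copy


class RollbackClient:
    """Hypothetical client: recovers from a checkpoint plus a message log."""

    def __init__(self):
        self.state = {"total": 0}
        # Assumption: the checkpoint and the log survive a crash
        # (they stand in for stable storage in this toy model).
        self._checkpoint = copy.deepcopy(self.state)
        self._log = []  # messages delivered since the last checkpoint

    def deliver(self, value):
        self._log.append(value)       # log the message, then apply it
        self.state["total"] += value

    def checkpoint(self):
        self._checkpoint = copy.deepcopy(self.state)
        self._log.clear()             # the log restarts at each checkpoint

    def crash(self):
        self.state = None             # volatile state is lost

    def recover(self):
        # Roll back to the checkpoint, then replay the logged messages.
        self.state = copy.deepcopy(self._checkpoint)
        for value in self._log:
            self.state["total"] += value


class Replica:
    """Hypothetical server replica holding a full copy of the data."""

    def __init__(self):
        self.alive = True
        self.store = {}


class ReplicatedServer:
    """Hypothetical server: every live replica applies every write."""

    def __init__(self, n):
        self.replicas = [Replica() for _ in range(n)]

    def write(self, key, value):
        for replica in self.replicas:
            if replica.alive:
                replica.store[key] = value

    def read(self, key):
        # Any live replica can answer, so no rollback is needed on failure.
        for replica in self.replicas:
            if replica.alive:
                return replica.store.get(key)
        raise RuntimeError("no live replica")


def demo():
    client = RollbackClient()
    client.deliver(5)
    client.checkpoint()
    client.deliver(7)
    client.crash()
    client.recover()                  # checkpoint (5) + replayed message (7)

    server = ReplicatedServer(3)
    server.write("block0", "data")
    server.replicas[0].alive = False  # one replica fails
    return client.state["total"], server.read("block0")
```

In this sketch the client is unavailable between `crash()` and `recover()`, which is the service gap the abstract calls intolerable for servers, whereas `ReplicatedServer.read` is served by a surviving replica without any recovery step.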