Recovery blocks in real-time distributed systems
Authors: Dong Nguyen, Irvine, Dar-Biau Liu
Published in: Annual Reliability and Maintainability Symposium. 1998 Proceedings. International Symposium on Product Quality and Integrity
Publication date: 1998-01-19
DOI: https://doi.org/10.1109/RAMS.1998.653703
Citations: 10
Abstract
This paper discusses the concept of recovery blocks as a dynamic redundancy approach to software fault tolerance. The discussion focuses on the distributed recovery block (DRB) scheme, which can be thought of as a means of integrating hardware and software fault tolerance in a single structure. The DRB approach, which combines distributed processing and recovery block concepts, is capable of effecting forward recovery while handling both hardware and software faults in a uniform manner. The DRB was developed for applications such as command and control, in which data is collected by interface processors and distributed over a network, and in which data from one pair of processors is output to another pair of processors. The extended distributed recovery block (EDRB) scheme is then discussed as a modification of the original DRB for real-time process control applications. An implementation of the EDRB is also presented to acquaint the reader with the implementation issues that must be faced in the development of a fault-tolerant software architecture for a distributed system.
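To make the recovery block and DRB ideas above concrete, the following is a minimal sketch and not the paper's implementation. It assumes the usual structure of a recovery block (a primary try block, an alternate try block, and an acceptance test) and models the two DRB nodes as threads in one process, which is a simplification of real distributed nodes. All names (`recovery_block`, `distributed_recovery_block`, `fast_sqrt`, `newton_sqrt`, `accept_sqrt`) are hypothetical and chosen only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


class AllTryBlocksFailed(Exception):
    """No try block produced a result that passed the acceptance test."""


def recovery_block(data, try_blocks, acceptance_test):
    """Sequential recovery block: run try blocks in order until one is accepted."""
    for block in try_blocks:
        try:
            result = block(data)
        except Exception:
            continue  # a crash is treated like a failed acceptance test
        if acceptance_test(data, result):
            return result
    raise AllTryBlocksFailed()


def distributed_recovery_block(data, primary, alternate, acceptance_test):
    """DRB-style sketch (assumption: two threads stand in for two nodes).

    Both nodes process the same input concurrently, each with a different
    try block. If the primary node's result is rejected, the shadow node's
    result is already available, so recovery moves forward instead of
    rolling back and re-executing.
    """
    def node(block):
        try:
            result = block(data)
        except Exception:
            return False, None
        return acceptance_test(data, result), result

    with ThreadPoolExecutor(max_workers=2) as pool:
        primary_future = pool.submit(node, primary)
        shadow_future = pool.submit(node, alternate)
        for future in (primary_future, shadow_future):
            accepted, result = future.result()
            if accepted:
                return result
    raise AllTryBlocksFailed()


# Hypothetical try blocks: the primary has an injected fault for x >= 100.
def fast_sqrt(x):
    return x ** 0.5 if x < 100 else -1.0


def newton_sqrt(x):
    guess = x or 1.0
    for _ in range(50):
        guess = 0.5 * (guess + x / guess)
    return guess


def accept_sqrt(x, result):
    """Acceptance test: result is non-negative and squares back to x."""
    return result >= 0 and abs(result * result - x) <= 1e-6 * max(1.0, x)
```

Calling `distributed_recovery_block(144.0, fast_sqrt, newton_sqrt, accept_sqrt)` exercises the failure path: the primary's output is rejected by the acceptance test and the shadow's already computed result is returned. A real DRB would also have to handle node crashes, message loss, timing constraints, and role switching between the nodes, none of which this sketch attempts.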