{"title":"Research on Architecture and Design Principles of COTS Components Based Generic Fault-Tolerant Computer","authors":"Zhonghong Ou, Youguang Yuan, Xiaoyong Zhao","doi":"10.1109/PRDC.2005.53","DOIUrl":null,"url":null,"abstract":"A novel fault-tolerant architecture based on COTS components is put forward and implemented in this paper. In order to make observable the internal states of COTS components, and in order to concurrently perform fault-tolerance function and normal function and control the behavior of each COTS component, the authors have devised an intelligent hardware module dedicated to fault-tolerance processing, which can significantly offload application processors. This architecture digs every inherent fault-detection mechanism and adopts layered fault protection mechanism to raise fault-tolerance coverage. This architecture is efficient, flexible, scalable and transparent with respect to fault-tolerance. It is Byzantine fault safe and also supports online repair. The authors also raise some design tradeoffs when designing COTS components based fault-tolerant computer.","PeriodicalId":123010,"journal":{"name":"Pacific Rim International Symposium on Dependable Computing","volume":"74 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Pacific Rim International Symposium on Dependable Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/PRDC.2005.53","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
A novel fault-tolerant architecture based on COTS components is put forward and implemented in this paper. In order to make observable the internal states of COTS components, and in order to concurrently perform fault-tolerance function and normal function and control the behavior of each COTS component, the authors have devised an intelligent hardware module dedicated to fault-tolerance processing, which can significantly offload application processors. This architecture digs every inherent fault-detection mechanism and adopts layered fault protection mechanism to raise fault-tolerance coverage. This architecture is efficient, flexible, scalable and transparent with respect to fault-tolerance. It is Byzantine fault safe and also supports online repair. The authors also raise some design tradeoffs when designing COTS components based fault-tolerant computer.