Design of multiprocessor systems for concurrent error detection and fault diagnosis
B. Vinnakota, N. Jha
[1991] Digest of Papers. Fault-Tolerant Computing: The Twenty-First International Symposium, 1991-06-25
DOI: 10.1109/FTCS.1991.146708 (https://doi.org/10.1109/FTCS.1991.146708)
Results on the design of systems using algorithm-based fault tolerance (ABFT), a low-overhead fault tolerance scheme for high-speed parallel processing systems, are presented. Bounds on the diagnosability of the system and the number of checks needed to design a unit system of given capability are derived. A procedure for forming the target fault-tolerant system from the unit system is introduced. The procedure is applicable to a wide range of systems in which processors may share data elements. The applications of the design scheme are illustrated through examples.