Centralized failure injection for distributed, fault-tolerant protocol testing
G. A. Alvarez, F. Cristian
Proceedings of 17th International Conference on Distributed Computing Systems, 1997-05-27
DOI: 10.1109/ICDCS.1997.597856 (https://doi.org/10.1109/ICDCS.1997.597856)
Citations: 22
Abstract
We describe a centralized approach to testing whether distributed fault-tolerant protocols satisfy their safety and timeliness specifications in the presence of the very failures they are designed to tolerate. CESIUM is a testing environment based on the centralized simulation of distributed executions and failures. Processes run in a single address space while retaining the appearance of a truly distributed execution. The human tester can force the occurrence of arbitrary failures and security attacks. The implementations under test are not instrumented for testing purposes, and their source code need not be available. We prove that CESIUM can execute exactly the set of runs feasible in the real distributed system being simulated. We also show that the specifications of many existing distributed protocols contain safety and timeliness properties that cannot be tested in practical distributed systems. All of these properties can, however, be accurately tested by CESIUM without introducing any perturbation in test experiments.
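The general idea of centralized simulation with tester-controlled failures can be illustrated with a small sketch. This is not CESIUM's implementation, which the abstract does not detail; it is a toy discrete-event loop written under stated assumptions. All names here (`Simulator`, `Echo`, `fault_hook`, `base_delay`, `demo`) are hypothetical and chosen only for illustration. Processes live in one address space, every message passes through a single scheduler, and a tester-supplied hook decides whether each message is dropped (an omission failure) or delayed (a timing failure).

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Event:
    # Events are ordered by (time, seq); the payload fields are not compared.
    time: float
    seq: int
    src: str = field(compare=False)
    dst: str = field(compare=False)
    msg: str = field(compare=False)


class Echo:
    """Toy process under test: answers every 'ping' with a 'pong'."""

    def __init__(self, name):
        self.name = name
        self.received = []

    def on_message(self, src, msg, now):
        self.received.append((now, src, msg))
        return [(src, "pong")] if msg == "ping" else []


class Simulator:
    """Hypothetical centralized scheduler with a simulated global clock."""

    def __init__(self, processes, fault_hook=None, base_delay=1.0):
        self.procs = {p.name: p for p in processes}
        # fault_hook returns None to drop a message, or an extra delay.
        self.fault_hook = fault_hook or (lambda src, dst, msg, now: 0.0)
        self.base_delay = base_delay
        self.queue = []
        self.seq = 0
        self.now = 0.0

    def send(self, src, dst, msg):
        extra = self.fault_hook(src, dst, msg, self.now)
        if extra is None:
            return  # injected omission failure: the message is never delivered
        deliver_at = self.now + self.base_delay + extra
        heapq.heappush(self.queue, Event(deliver_at, self.seq, src, dst, msg))
        self.seq += 1

    def run(self):
        # Deliver messages in simulated-time order until none remain.
        while self.queue:
            ev = heapq.heappop(self.queue)
            self.now = ev.time
            for dst, msg in self.procs[ev.dst].on_message(ev.src, ev.msg, self.now):
                self.send(ev.dst, dst, msg)


def demo(fault_hook=None):
    a, b = Echo("a"), Echo("b")
    sim = Simulator([a, b], fault_hook=fault_hook)
    sim.send("a", "b", "ping")
    sim.run()
    return a.received, b.received


def drop_pongs(src, dst, msg, now):
    # Tester-chosen failure: every 'pong' is lost.
    return None if msg == "pong" else 0.0
```

Calling `demo()` runs a failure-free exchange, while `demo(drop_pongs)` replays the same run with every reply lost. Because the clock is simulated and the hook sits outside the process code, the failure is injected without modifying `Echo` and without real-time perturbation, which is the property the abstract emphasizes.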