Almost certain diagnosis for intermittently faulty systems
D. Blough, G. Sullivan, G. Masson
[1988] The Eighteenth International Symposium on Fault-Tolerant Computing. Digest of Papers, 1988-06-27
DOI: 10.1109/FTCS.1988.5329 (https://doi.org/10.1109/FTCS.1988.5329)
Citations: 27
Abstract
The authors present and analyze a uniformly probabilistic model for the self-diagnosis capabilities of a multiprocessor system. In this model, an individual processor fails with probability p, and a fault-free processor testing a faulty processor detects the fault with probability q. This models situations in which processors can be intermittently faulty, or in which tests cannot detect all possible faults within a processor. They present an efficient algorithm that uses a relatively small number of tests (given by any function dominating n log n, where n is the number of processors) and achieves correct diagnosis with high probability. They obtain a nearly matching lower bound, which shows that no algorithm can achieve correct diagnosis with high probability in systems that conduct a number of tests dominated by n log n. Examples are given of systems that perform a modest number of tests and for which the probability of correct diagnosis under the authors' algorithm is very nearly one.
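
The fault model in the abstract can be illustrated with a small simulation. The sketch below is not the authors' algorithm, which the abstract does not specify. It swaps in a simple threshold vote and rests on several assumptions that are not in the source: each processor is tested by `k` randomly chosen peers, with `k` about `c * log2(n)` so that the total number of tests is on the order of n log n; a faulty tester reports a coin flip; and the threshold is computed from known values of p and q. The names `run_trial`, `c`, and `seed` are hypothetical.

```python
import math
import random


def run_trial(n, p, q, c=4.0, seed=0):
    """Simulate one system and diagnose it with a threshold vote.

    Returns (faulty, diagnosed): two lists of n booleans giving the true
    and the inferred fault status of each processor.
    """
    rng = random.Random(seed)
    # Assumption: k tests per processor, so about c * n * log2(n) tests in total.
    k = min(n - 1, math.ceil(c * math.log2(n)))
    # Each processor fails independently with probability p (from the abstract).
    faulty = [rng.random() < p for _ in range(n)]
    # Assumption: threshold halfway between the expected number of failing
    # reports for a fault-free target (k*p/2) and for a faulty target
    # (k*(p/2 + (1-p)*q)).
    threshold = k * (0.5 * p + 0.5 * (1 - p) * q)

    diagnosed = []
    for target in range(n):
        others = [i for i in range(n) if i != target]
        fail_votes = 0
        for tester in rng.sample(others, k):
            if faulty[tester]:
                # Assumption: a faulty tester's report is a fair coin flip.
                reported_fail = rng.random() < 0.5
            else:
                # A fault-free tester detects a faulty target with
                # probability q and never fails a fault-free target.
                reported_fail = faulty[target] and rng.random() < q
            fail_votes += int(reported_fail)
        diagnosed.append(fail_votes > threshold)
    return faulty, diagnosed


if __name__ == "__main__":
    truth, guess = run_trial(n=256, p=0.1, q=0.8)
    correct = sum(t == g for t, g in zip(truth, guess))
    print(f"{correct}/{len(truth)} processors diagnosed correctly")
```

Varying `c` in this sketch gives a rough feel for how the number of tests per processor affects diagnosis accuracy, though it says nothing about the bounds proved in the paper.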