Proceedings of Annual Symposium on Fault Tolerant Computing: latest papers, page 3

Design and evaluation of fault-tolerant shared file system for cluster systems

Proceedings of Annual Symposium on Fault Tolerant Computing Pub Date : 1996-06-25 DOI: 10.1109/FTCS.1996.534596

S. Sumimoto

Citations: 1

Recoverable mobile environment: design and trade-off analysis

Proceedings of Annual Symposium on Fault Tolerant Computing Pub Date : 1996-06-25 DOI: 10.1109/FTCS.1996.534590

D. Pradhan, P. Krishna, N. Vaidya

Abstract: The mobile wireless environment poses challenging problems in designing fault-tolerant systems because of the dynamics of mobility and the limited bandwidth available on wireless links. Traditional fault-tolerance schemes, therefore, cannot be directly applied to these systems. Mobile systems are often subject to environmental conditions which can cause loss of communications or data. Because of the consumer orientation of most mobile systems, run-time faults must be corrected with minimal (if any) intervention from the user. The fault-tolerance capability must, therefore, be transparent to the user. The paper presents recovery schemes for the failure of a mobile host. It portrays the limitations of the mobile wireless environment and their impact on recovery protocols. Adaptations of well-known recovery schemes that suit the mobile environment are presented. The performance of these schemes has been analyzed to determine the environments in which a particular recovery scheme is best suited. The performance of the recovery schemes primarily depends on the wireless bandwidth, the communication-mobility ratio of the user, and the failure rate of the mobile host.

Citations: 110

Reliable broadcasting in product networks with Byzantine faults

Proceedings of Annual Symposium on Fault Tolerant Computing Pub Date : 1996-06-25 DOI: 10.1109/FTCS.1996.534612

Feng Bao, Y. Igarashi

Abstract: The reliability of broadcasting in product networks is discussed. We assume that a network may contain faulty nodes and/or links of Byzantine type and that no nodes know any information about faults in advance. If there are n independent spanning trees rooted at the same node of a network, the network is called an n-channel graph. We first show a construction of n independent spanning trees rooted at the same node of a product network consisting of n component graphs. Then we design a broadcasting scheme in the product network so that messages are sent along the n independent spanning trees. This broadcasting scheme can tolerate up to [(n-1)/2] faults of Byzantine type even in the worst case. Broadcasting by the scheme is successful with a probability higher than $1 - k^{-[n/2]}$ in any product network of order N consisting of n component graphs of order b or less if at most $N/4b^{3}nk$ faulty nodes are randomly distributed in the network. Furthermore, we show how to construct $n_1 + n_2$ independent spanning trees in a product network of two graphs such that one component graph is an $n_1$-channel graph and the other component graph is an $n_2$-channel graph. These independent spanning trees can also be used as efficient and reliable message channels for broadcasting in the product network.
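The fault bound in this abstract can be illustrated with a small sketch. The abstract does not state how a receiver combines the n copies of a message that arrive over the n independent spanning trees; the strict-majority rule below is an assumption, as is reading the brackets in [(n-1)/2] as a floor. The function names `max_tolerated_faults` and `decide` are hypothetical.

```python
from collections import Counter


def max_tolerated_faults(n: int) -> int:
    """Bound quoted in the abstract, with the brackets read as a floor (assumption)."""
    return (n - 1) // 2


def decide(copies):
    """Pick the value delivered by a strict majority of the n channels.

    `copies` holds one entry per spanning tree; a lost or withheld message
    can be represented as None. Returns None when no value has a strict
    majority. Majority voting is an assumed decision rule, not one taken
    from the abstract.
    """
    n = len(copies)
    value, count = Counter(copies).most_common(1)[0]
    return value if count > n // 2 else None


# Five channels, two of them corrupted: the correct value still wins.
received = ["m", "m", "forged", "m", None]
print(max_tolerated_faults(5), decide(received))
```

With n channels and at most floor((n-1)/2) corrupted copies, the correct value always holds a strict majority, which is why the vote cannot be swayed under this assumption.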

Citations: 8

The redundancy mechanisms of the Ariane 5 Operational Control Center

Proceedings of Annual Symposium on Fault Tolerant Computing Pub Date : 1996-06-25 DOI: 10.1109/FTCS.1996.534623

J. Dega

Citations: 6

Multiple fault diagnosis in sequential circuits using sensitizing sequence pairs

Proceedings of Annual Symposium on Fault Tolerant Computing Pub Date : 1996-06-25 DOI: 10.1109/FTCS.1996.534597

N. Yanagida, Hiroshi Takahashi, Y. Takamatsu

Citations: 5

Experimental evaluation of the fail-silent behaviour in programs with consistency checks

Proceedings of Annual Symposium on Fault Tolerant Computing Pub Date : 1996-06-25 DOI: 10.1109/FTCS.1996.534625

M. Z. Rela, H. Madeira, J. G. Silva

Abstract: An important research topic deals with the investigation of whether a non-duplicated computer can be made fail-silent, since that behaviour is assumed a priori in many algorithms. However, previous research has shown that in systems using a simple behaviour-based error detection mechanism invisible to the programmer (e.g. memory protection), the percentage of fail-silent violations could be higher than 10%. Since the study of these errors has shown that they were mostly caused by pure data errors, we evaluate the effectiveness of software techniques capable of checking the semantics of the data, such as assertions, to detect these remaining errors. The results of injecting physical pin-level faults show that these tests can prevent about 40% of the fail-silent model violations that escape the simple hardware-based error detection techniques. In order to decouple the intrinsic limitations of the tests used from other factors that might affect their error detection capabilities, we evaluated a special class of software checks known for its high theoretical coverage: algorithm-based fault tolerance (ABFT). The analysis of the remaining errors showed that most of them remained undetected due to short-range control flow errors. When very simple software-based control flow checking was combined with the semantic tests, the target system, without any dedicated error detection hardware, behaved according to the fail-silent model for about 98% of all the faults injected.
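To make the ABFT idea mentioned in this abstract concrete, here is a minimal sketch of the classic checksum scheme for matrix multiplication. The abstract does not say which ABFT programs were evaluated, so the choice of matrix multiplication and all function names are assumptions for illustration only. Integer matrices are used so that the checksum comparison is exact.

```python
def with_column_checksum(a):
    """Return a copy of `a` with an extra row holding each column's sum."""
    return a + [[sum(col) for col in zip(*a)]]


def with_row_checksum(b):
    """Return a copy of `b` with an extra column holding each row's sum."""
    return [row + [sum(row)] for row in b]


def matmul(a, b):
    """Plain triple-loop matrix product, written with comprehensions."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def checksums_consistent(c):
    """Check a full-checksum matrix: last column = row sums, last row = column sums."""
    rows_ok = all(sum(row[:-1]) == row[-1] for row in c)
    cols_ok = all(sum(col[:-1]) == col[-1] for col in zip(*c))
    return rows_ok and cols_ok


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = matmul(with_column_checksum(a), with_row_checksum(b))
print(checksums_consistent(c))   # the fault-free product passes the check

c[0][1] += 1                     # emulate a single corrupted data element
print(checksums_consistent(c))   # the corruption breaks a row and a column sum
```

A program that refuses to output `c` when the check fails stays silent instead of emitting wrong data, which is the behaviour the fail-silent model asks for.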

Citations: 76

A fault simulation method for crosstalk faults in synchronous sequential circuits

Proceedings of Annual Symposium on Fault Tolerant Computing Pub Date : 1996-06-25 DOI: 10.1109/FTCS.1996.534592

N. Itazaki, Yasutaka Idomoto, K. Kinoshita

Abstract: With the scaling down of VLSI size and the decreasing switching time of logic gates, crosstalk faults become an important problem for testing. If a crosstalk pulse is excited by internal noise sources, the crosstalk pulse tends to be considered harmless for synchronous sequential circuits, because crosstalk pulses generated on data lines can be eliminated by clocking. However, a crosstalk pulse generated on clock lines or reset lines can lead the circuit to erroneous operations. We analyze the crosstalk fault scheme and contrive a fault simulator based on the scheme in order to estimate the effect of the crosstalk fault. We consider the crosstalk fault as unexpected strong capacitive coupling between one data line and clock lines. Since we have to consider timing in addition to a logic value, a unit delay model is used in our fault simulation. Our experiments on some benchmark circuits show that fault activation rates and fault detection rates vary widely with circuit characteristics. Fault detection rates of up to 80% are obtained from our simulation with test vectors generated at random.

Citations: 23

Experimental assessment of parallel systems

Proceedings of Annual Symposium on Fault Tolerant Computing Pub Date : 1996-06-25 DOI: 10.1109/FTCS.1996.534627

J. G. Silva, J. Carreira, H. Madeira, D. Costa, F. Moreira

Citations: 44

A new methodology for calculating distributions of reward accumulated during a finite interval

Proceedings of Annual Symposium on Fault Tolerant Computing Pub Date : 1996-06-25 DOI: 10.1109/FTCS.1996.534600

M. Qureshi, W. Sanders

Abstract: Markov reward models are an important formalism by which to obtain dependability and performability measures of computer systems and networks. In this context, it is particularly important to determine the probability distribution function of the reward accumulated during a finite interval. The interval may correspond to the mission period in a mission-critical system, the time between scheduled maintenances, or a warranty period. In such models, changes in state correspond to changes in system structure (due to faults and repairs), and the reward structure depends on the measure of interest. For example, the reward rates may represent a productivity rate while in that state, if performability is considered, or the binary values zero and one, if interval availability is of interest. We present a new methodology to calculate the distribution of reward accumulated over a finite interval. In particular, we derive recursive expressions for the distribution of reward accumulated given that a particular sequence of state changes occurs during the interval, and we explore paths one at a time. The expressions for conditional accumulated reward are new and are numerically stable. In addition, by exploring paths individually, we avoid the memory growth problems experienced when applying previous approaches to large models. The utility of the methodology is illustrated via application to a realistic fault-tolerant multiprocessor model with over half a million states.
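The notion of reward accumulated along one sequence of state changes can be shown with a short sketch. This is only the bookkeeping for a single, fully specified trajectory with known transition times; it is not the paper's recursive expressions for the conditional distribution, which the abstract does not give. The function name `accumulated_reward`, the state names, and the numbers are hypothetical.

```python
def accumulated_reward(path, rates, horizon):
    """Sum rate * sojourn time over one trajectory, truncated at `horizon`.

    `path` is a list of (state, entry_time) pairs sorted by entry_time,
    with the first entry at time 0.0. `rates` maps each state to its
    reward rate.
    """
    total = 0.0
    for i, (state, start) in enumerate(path):
        if start >= horizon:
            break
        # A state lasts until the next state is entered, or until the horizon.
        end = path[i + 1][1] if i + 1 < len(path) else horizon
        total += rates[state] * (min(end, horizon) - start)
    return total


# Binary rates (1 while up, 0 while down) turn the reward into total up time.
rates = {"up": 1.0, "down": 0.0}
path = [("up", 0.0), ("down", 6.0), ("up", 7.0)]
reward = accumulated_reward(path, rates, horizon=10.0)
print(reward, reward / 10.0)  # up time and interval availability for this path
```

Replacing the binary rates with per-state productivity rates gives a performability reward for the same trajectory.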

Citations: 32

Hardware-efficient and highly-reconfigurable 4- and 2-track fault-tolerant designs for mesh-connected multicomputers

Proceedings of Annual Symposium on Fault Tolerant Computing Pub Date : 1996-06-25 DOI: 10.1109/FTCS.1996.535880

N. Mahapatra, S. Dutt

Abstract: We consider m-track models for constructing fault-tolerant (FT) mesh systems which have one primary and m spare tracks per row and column, switches at the intersection of these tracks, and spare processors at the boundaries. A faulty system is reconfigured by finding for each fault u a reconfiguration path from the fault to a spare in which, starting from the fault u, a processor is replaced or "covered" by the nearest "available" succeeding processor on the path; a processor on the path is not available if it is faulty or is used as a "cover" on some other reconfiguration path. In previous work, a 1-track design that can support any set of node-disjoint straight reconfiguration paths, and a more reliable 3-track design that can support any set of node-disjoint rectilinear reconfiguration paths, have been proposed. In this paper, we present: (1) A fundamental result regarding the universality of simple "one-to-one switches" in m-track 2-D mesh designs in terms of their reconfigurabilities. (2) A 4-track mesh design that can support any set of edge-disjoint (a much less restrictive criterion than node-disjointness) rectilinear reconfiguration paths, and that has 34% less switching overhead and significantly higher (actually close-to-optimal) reconfigurability compared to the previously proposed 3-track design. (3) A new 2-track design derived from the above 4-track design that we show can support the same set of reconfiguration paths as the previous 3-track design but with 33% less wiring overhead. (4) Results on the deterministic fault tolerance capabilities (the number of faults guaranteed reconfigurable) of our 4- and 2-track designs, and the previously proposed 1- and 3-track designs.
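The covering rule described in this abstract (each processor on a reconfiguration path is replaced by the nearest available succeeding processor) can be sketched as follows. This models a single path as an ordered list of node labels and ignores tracks, switches, and mesh geometry entirely; the function name `cover_chain` and the labels are hypothetical, and the last node of the path is assumed to be an available spare.

```python
def cover_chain(path, unavailable):
    """Map each processor on one reconfiguration path to its cover.

    `path` lists nodes from the fault (first) to the spare (last).
    `unavailable` holds nodes on the path that are themselves faulty or
    already used as a cover on another path; they are skipped over.
    """
    cover = {}
    current = path[0]              # the faulty processor that starts the path
    for node in path[1:]:
        if node in unavailable:
            continue               # not available: look further along the path
        cover[current] = node      # nearest available successor takes over
        current = node             # and must in turn be covered itself
    return cover


# Node "b" is unavailable, so "a" is covered by "c" instead.
print(cover_chain(["u", "a", "b", "c", "spare"], unavailable={"b"}))
```

The chain ends at the spare, which absorbs the shift without needing a cover of its own.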

Citations: 3