Latest papers from [1991] Digest of Papers. Fault-Tolerant Computing: The Twenty-First International Symposium

Pattern sensitive fault testing of RAMs with built-in ECC
M. Franklin, K. Saluja
DOI: https://doi.org/10.1109/FTCS.1991.146690
Published: 1991-06-25
Abstract: The problem of testing RAMs with different built-in error-correction-coding (ECC) capabilities is formulated. The basics of ECC in RAMs are reviewed, and some of the implementation aspects are described. It is shown that if memories using separable linear codes satisfy certain conditions, it is always possible to apply arbitrary patterns to all check bits. An upper bound on the number of writes required to apply the required patterns to a neighborhood is established. An efficient algorithm for testing the information bits and check bits of an N-bit memory array for 5-cell neighborhood pattern-sensitive faults in O(N) reads and writes is provided. The use of the method is demonstrated by a case study.
Citations: 8
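As a rough illustration of what "separable linear code" means in the abstract above, the sketch below uses a systematic (7,4) Hamming code. The specific code, the `H_COLUMNS` table, and all function names are assumptions chosen for illustration; they are not taken from the paper. In a separable code the check bits are stored alongside the unmodified information bits and are a linear function of them, so a tester can only reach a check-bit pattern by writing suitable information bits. For this small code, every check-bit pattern turns out to be reachable.

```python
from itertools import product

# Columns of the parity-check matrix of a systematic (7,4) Hamming code
# (illustrative assumption). Positions 0-3 hold information bits,
# positions 4-6 hold check bits.
H_COLUMNS = [(1, 1, 0), (1, 0, 1), (0, 1, 1), (1, 1, 1),
             (1, 0, 0), (0, 1, 0), (0, 0, 1)]


def check_bits(data):
    """Compute the three check bits as parities over the four information bits."""
    return [sum(data[j] & H_COLUMNS[j][i] for j in range(4)) % 2
            for i in range(3)]


def encode(data):
    """Separable encoding: information bits unchanged, check bits appended."""
    return list(data) + check_bits(data)


def correct(word):
    """Correct at most one flipped bit in a 7-bit word using its syndrome."""
    syndrome = tuple(sum(word[j] & H_COLUMNS[j][i] for j in range(7)) % 2
                     for i in range(3))
    if syndrome == (0, 0, 0):
        return list(word)
    fixed = list(word)
    # A single-bit error at position p produces the syndrome H_COLUMNS[p].
    fixed[H_COLUMNS.index(syndrome)] ^= 1
    return fixed


def reachable_check_patterns():
    """Enumerate the check-bit patterns obtainable by writing information bits."""
    return {tuple(check_bits(d)) for d in product([0, 1], repeat=4)}
```

Calling `reachable_check_patterns()` for this code yields all eight 3-bit patterns, which is the kind of property a check-bit test needs; whether it holds for a given memory depends on that memory's code.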
The t/(n-1)-diagnosability and its applications to fault tolerance
Jie Xu
DOI: https://doi.org/10.1109/FTCS.1991.146707
Published: 1991-06-25
Abstract: A system composed of n units is said to be t/(n-1)-diagnosable if, given any complete collection of test results, the set of faulty units can be isolated to within a set of at most n-1 units, provided that the number of faulty units does not exceed t. Based on some recently discovered properties of t/(n-1)-diagnosability, the author examines three canonical classes of systems (chains, loops, and H_{2r,n} systems) and presents optimal t/(n-1)-diagnosable configurations for these classes. Incorporating these results into the scheme of D.M. Blough and A. Pelc (see 20th Int. Symp. on Fault-Tolerant Computing, pp. 316-323 (1990)), the author gives an improved diagnosis and repair algorithm for constant-degree multiprocessor systems. A software fault tolerance scheme that utilizes the t/(n-1)-diagnosis technique is also proposed.
Citations: 22
Signature analysis and test scheduling for self-testable circuits
A. P. Stroele, H. Wunderlich
DOI: https://doi.org/10.1109/FTCS.1991.146640
Published: 1991-06-25
Abstract: In complex circuits the test execution is usually divided into a number of subtasks, each producing a signature in a self-test register. These signatures influence one another. A model that can be used as a basis for test scheduling procedures is presented, and it is shown how test schedules can be constructed in order to minimize the number of signatures to be evaluated. The error masking probabilities decrease when the subtasks of the test execution are repeated in an appropriate order, and an equilibrium situation is reached where the error masking probabilities are minimal. A method is presented for constructing test schedules so that only the signatures at the primary outputs must be evaluated to obtain sufficient fault coverage. Then no internal scan path is required, only a few signatures have to be evaluated at the end of the test execution, and the test control at chip and board level is simplified. The amount of hardware to implement a built-in self-test is reduced significantly.
Citations: 8
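The notion of error masking mentioned in the abstract above can be sketched with a single serial signature register. This is a generic textbook model, not the scheduling model of the paper; the 4-bit width, the feedback polynomial x^4 + x + 1, and the function name `signature` are assumptions made for illustration. The register computes the remainder of the response stream divided by the polynomial, so an error pattern that is a multiple of the polynomial leaves the signature unchanged.

```python
WIDTH = 4
TAPS = 0b0011  # low-order terms of x^4 + x + 1 (illustrative choice)
MASK = (1 << WIDTH) - 1


def signature(bits):
    """Shift a response stream (highest-order bit first) through the register."""
    state = 0
    for bit in bits:
        feedback = (state >> (WIDTH - 1)) & 1
        state = ((state << 1) & MASK) | bit
        if feedback:
            # Reduce modulo the polynomial: x^4 is replaced by x + 1.
            state ^= TAPS
    return state


good = [1, 0, 1, 1, 0, 0, 1, 0]

# A single-bit error always changes the signature for this polynomial.
single_error = list(good)
single_error[3] ^= 1

# The error pattern x^5 + x^2 + x = x * (x^4 + x + 1) is a multiple of the
# polynomial, so the faulty response produces the fault-free signature.
error = [0, 0, 1, 0, 0, 1, 1, 0]
masked = [g ^ e for g, e in zip(good, error)]
```

Here `signature(single_error)` differs from `signature(good)`, while `signature(masked)` equals it even though the two streams differ in three positions.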
Program fault tolerance based on memory access behavior
N. Bowen, D. Pradhan
DOI: https://doi.org/10.1109/FTCS.1991.146696
Published: 1991-06-25
Abstract: Fault observability based on the behavior of the memory references is studied. As opposed to traditional studies that view memory as one large entity that must completely work to be considered reliable, this study emphasizes the usage patterns of a particular program's memory. Expressions for the successful execution of a program that take into account the usage of the data are developed. Three variations that depend on whether the program's storage is pre-allocated, dynamically allocated, or constrained in allocation are presented. A theory is proposed to explain the phenomenon that increased workloads lead to increased failure rates, which has been observed in several studies. The model is used to study several program traces, and it is shown that increased workloads could cause an increase of the observed failure rates in the range of 27% to 53%.
Citations: 6
Certification trails for data structures
G. Sullivan, G. Masson
DOI: https://doi.org/10.1109/FTCS.1991.146668
Published: 1991-06-25
Abstract: The applicability of the certification trail technique, a recently introduced and promising approach to fault detection and fault tolerance, is expanded. Previously, certification trails had to be customized to each algorithm application, but here trails appropriate to wide classes of algorithms are developed. These certification trails are based on common data-structure operations such as those carried out using balanced binary trees and heaps. Any algorithm using these sets of operations can therefore employ the certification trail method to achieve software fault tolerance. Constructions of trails for abstract data types such as priority queues and union-find structures are given. These trails are applicable to any data structure implementation of the abstract data type. It is shown that these ideas lead naturally to monitors for data structure operations.
Citations: 39
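The general certification trail idea can be sketched on sorting, a simpler setting than the data-structure trails described in the abstract above. This is an illustrative assumption, not one of the paper's constructions, and the function names are hypothetical. A first execution solves the problem and also emits a trail (here, the sorting permutation); a second, simpler execution uses the trail to recompute the output in linear time while checking the trail, so that a faulty trail is rejected rather than trusted.

```python
from typing import List, Sequence, Tuple


def sort_with_trail(items: Sequence[int]) -> Tuple[List[int], List[int]]:
    """First execution: sort and emit the sorting permutation as the trail."""
    trail = sorted(range(len(items)), key=lambda i: items[i])
    return [items[i] for i in trail], trail


def sort_using_trail(items: Sequence[int], trail: Sequence[int]) -> List[int]:
    """Second execution: rebuild the output from the trail in linear time.

    Raises ValueError if the trail is not a permutation of the input
    positions or does not produce a nondecreasing sequence.
    """
    n = len(items)
    if len(trail) != n:
        raise ValueError("trail has the wrong length")
    seen = [False] * n
    out: List[int] = []
    for i in trail:
        if not 0 <= i < n or seen[i]:
            raise ValueError("trail is not a permutation")
        seen[i] = True
        if out and out[-1] > items[i]:
            raise ValueError("trail does not yield sorted order")
        out.append(items[i])
    return out
```

In use, the outputs of the two executions are compared; agreement on a trail that passed the checks gives confidence in the result, while a rejected trail or a mismatch signals a fault in one of the executions.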
The IBM S/390 Sysplex Timer
T. Smith, William A. Moorman, Thao Dang
DOI: https://doi.org/10.1109/FTCS.1991.146653
Published: 1991-06-25
Abstract: The IBM S/390 Sysplex Timer, a centralized fault-tolerant time reference used in maintaining time-of-day synchronism between multiple closely coupled IBM S/390 systems, is presented. The basic Sysplex Timer organization is quad redundant, and its packaging is duplex. A fully duplicated star interconnect topology provides redundant timer transmissions to every S/390 client system using dedicated fiber optic cables. The technology used in the design of the Sysplex Timer and measurements of the performance of the prototype system used in the validation of the design are presented. The design proved to be economical and robust, maintaining tightly synchronized operation of the redundant components for all test cases. Cable latency compensation appeared to be particularly effective. Worst-case operation of the prototype with less than 300 ns of timer skew was verified.
Citations: 1
An optimal algorithm for distributed system level diagnosis
A. Bagchi, S. Hakimi
DOI: https://doi.org/10.1109/FTCS.1991.146664
Published: 1991-06-25
Abstract: A system consisting of n identical processors connected by links, in which some processors could be faulty, is considered. Initially each unit knows only its own ID and the IDs of its immediate neighbors; no unit has any global knowledge about the system. An optimal algorithm for system level diagnosis in such a system, based on the transmission of packets by fault-free units, is presented. The algorithm requires at most 3n log p + O(n + pt) message transmissions by fault-free units, where p fault-free units simultaneously start the algorithm and there are t faulty units. The correctness of the algorithm is argued.
Citations: 73
Some practical issues in the design of fault-tolerant multiprocessors
S. Dutt, J. Hayes
DOI: https://doi.org/10.1109/FTCS.1991.146676
Published: 1991-06-25
Abstract: A node-covering approach to fault-tolerant design is generalized to apply to a wide class of multiprocessor structures whose structure and failure mechanisms are represented by arbitrary graphs. Several new types of covering graphs are defined, which lead to various design tradeoffs. A new technique for incremental design, using a class of switch implementations that reduce a system's interconnection costs, is presented. The reduction of other cost factors is addressed, including VLSI layout area minimization, efficient transfer of state information during recovery, and the efficient use of local spares. A fast and distributed algorithm for reconfiguration around faults is presented. A review of the general node-covering theory is included, focusing on how it models the important practical features of fault-tolerant systems.
Citations: 66
An evaluation of fault-tolerant hypercube architectures for onboard computing
J. Peterson, J. O. Tuazon, E. Upchurch
DOI: https://doi.org/10.1109/FTCS.1991.146662
Published: 1991-06-25
Abstract: Four hypercube architectures that are designed to use hardware resources more efficiently and that produce computers with high throughput and high reliability are evaluated. Spare nodes in three of the architectures are configured so that the entire computer has the topology of an incomplete hypercube. Here, the nodes of an incomplete hypercube are capable of providing different levels of fault detection, hardware reconfiguration, and routing. In the other architecture, the hypercube topology uses conventional switches capable only of establishing connections. End-of-mission dependability models and performance simulation models were developed. Results of performance degradation studies of the four architectures under reconfiguration in terms of throughput, response time, and communication utilization are presented for three workloads. The evaluations addressed performance-related dependability based on hardware failures and reconfiguration using hardware.
Citations: 2
Fault-tolerant communications processing
V. Cherkassky, R. Rooholamini, H. Lari-Najafi
DOI: https://doi.org/10.1109/FTCS.1991.146684
Published: 1991-06-25
Abstract: The concept of combining the traditional redundancy approach to fault-tolerant design with the error detection and recovery mechanisms built into most of the existing communication protocols is addressed. The goal is to achieve low-cost fault-tolerant communication processing (transparent to the user) in the presence of individual processor board failures. General techniques for achieving system-level fault tolerance are reviewed. The notion of error control (recovery) used in computer communications is discussed and compared with the idea of fault tolerance and error recovery in computer science. A general multiprocessor model of a network processor is introduced, and a novel technique, called redundant task allocation, for achieving fault tolerance in a multiprocessor environment is described. Some of the issues in and approaches to recovery and tolerance of communication protocols after a failure of the underlying hardware are examined. A system prototype is described, and some simulation results are reported.
Citations: 2