Fault-tolerant communications processing

[1991] Digest of Papers. Fault-Tolerant Computing: The Twenty-First International Symposium. Pub Date: 1991-06-25. DOI: 10.1109/FTCS.1991.146684

V. Cherkassky, R. Rooholamini, H. Lari-Najafi

Citations: 2

Abstract

The concept of combining the traditional redundancy approach to fault-tolerant design with the error detection and recovery mechanisms built into most existing communication protocols is addressed. The goal is to achieve low-cost fault-tolerant communication processing (transparent to the user) in the presence of individual processor board failures. General techniques for achieving system-level fault tolerance are reviewed. The notion of error control (recovery) used in computer communications is discussed and compared with the idea of fault tolerance and error recovery in computer science. A general multiprocessor model of a network processor is introduced, and a novel technique, called redundant task allocation, for achieving fault tolerance in a multiprocessor environment is described. Some of the issues in, and approaches to, the recovery and fault tolerance of communication protocols after a failure of the underlying hardware are examined. A system prototype is described, and some simulation results are reported.
