Fault-tolerant communications processing

V. Cherkassky, R. Rooholamini, H. Lari-Najafi
Published in: [1991] Digest of Papers. Fault-Tolerant Computing: The Twenty-First International Symposium
Publication date: 1991-06-25
DOI: 10.1109/FTCS.1991.146684 (https://doi.org/10.1109/FTCS.1991.146684)
Citations: 2

Abstract

The concept of combining the traditional redundancy approach to fault-tolerant design with the error detection and recovery mechanisms built into most of the existing communication protocols is addressed. The goal is to achieve low-cost fault-tolerant communication processing (transparent to the user) in the presence of individual processor board failures. General techniques for achieving system-level fault tolerance are reviewed. The notion of error control (recovery) used in computer communications is discussed and compared with the idea of fault tolerance and error recovery in computer science. A general multiprocessor model of a network processor is introduced, and a novel technique, called redundant task allocation, for achieving fault tolerance in a multiprocessor environment is described. Some of the issues in and approaches to recovery and tolerance of communication protocols after a failure of the underlying hardware are examined. A system prototype is described, and some simulation results are reported.
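
The abstract names redundant task allocation but does not spell out the algorithm. As a rough illustration only, the general idea of keeping a copy of every task on more than one processor board can be sketched as below. The round-robin placement rule, the function names, and the task and board names are all assumptions made for this example; they are not taken from the paper.

```python
def allocate_redundant(tasks, boards, copies=2):
    """Place `copies` replicas of each task on distinct boards (round-robin).

    Hypothetical placement rule for illustration; the paper's actual
    redundant task allocation technique may differ.
    """
    if copies > len(boards):
        raise ValueError("need at least as many boards as copies")
    placement = {}
    for i, task in enumerate(tasks):
        # Consecutive board indices modulo len(boards) are distinct
        # as long as copies <= len(boards).
        placement[task] = [boards[(i + k) % len(boards)] for k in range(copies)]
    return placement


def surviving_hosts(placement, failed_board):
    """Return, per task, the boards still hosting it after one board fails."""
    return {
        task: [b for b in hosts if b != failed_board]
        for task, hosts in placement.items()
    }


# Hypothetical protocol tasks and boards.
tasks = ["link_layer", "routing", "session"]
boards = ["board0", "board1", "board2"]

placement = allocate_redundant(tasks, boards)
after_failure = surviving_hosts(placement, "board1")

# With two copies on distinct boards, every task survives a single board failure.
assert all(after_failure[task] for task in tasks)
```

With two replicas per task, any single board failure leaves at least one live copy of each task, which is the property the abstract's "individual processor board failures" goal relies on. How the surviving copy resumes the protocol state is a separate question that this sketch does not address.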