Why Reliability for Computing Needs Rethinking

Valeriu Beiu, V. Dragoi, Roxana-Mariana Beiu
Published in: 2020 International Conference on Rebooting Computing (ICRC), 2020-12-01
DOI: 10.1109/ICRC2020.2020.00006 (https://doi.org/10.1109/ICRC2020.2020.00006)
Citations: 5

Abstract

Offering high-quality services and products has been of paramount importance for both communications and computations. Early on, both were in dire need of practical designs for enhancing reliability. That is why John von Neumann proposed the first gate-level method (using redundancy to build reliable systems from unreliable components), while Edward F. Moore and Claude E. Shannon followed suit with the first device-level scheme. Moore and Shannon's prescient paper also established network reliability as a probabilistic model in which the nodes of the network are considered perfectly reliable, while the edges can fail independently with a certain probability. The fundamental problem was that of estimating the probability that (under given conditions) two (or more) nodes are connected, the solution being represented by the well-known reliability polynomial of the network. This concept has been heavily used for communications, where big strides were made and applied to networks of roads, railways, power lines, fiber optics, phones, sensors, etc. For computations, the research community converged on the gate-level method proposed by von Neumann, while the device-level scheme crafted by Moore and Shannon, although very practical and detailed, did not inspire circuit designers and went under the radar. That scheme was built on a thought-provoking network called the hammock, which exhibits regular brick-wall near-neighbor connections. Trying to do justice to computing networks in general (and hammocks in particular), this paper aims to highlight and clarify how reliable different types of networks are when they are intended for performing computations. To do this, we define quite a few novel cost functions which, together with established ones, allow us to meticulously compare different types of networks for a clearer understanding of the reliability enhancements they are able to bring to computations. To our knowledge, this is the first-ever ranking of networks with respect to computing reliability. The main conclusion is that a rethinking/rebooting of how we should design reliable computing systems, immediately applicable to networks/arrays of devices (e.g., transistors or qubits), is both timely and needed.
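
The probabilistic model described above (perfectly reliable nodes, edges that work or fail independently) can be illustrated with a minimal Python sketch. It is not the authors' method: it simply enumerates every edge state of a small network and sums the probabilities of the states in which a source node and a terminal node remain connected, which is the value of the two-terminal reliability polynomial at a given edge probability. The function names, the parameter `p` (taken here as the probability that an edge works), and the example networks (two edges in series, and a five-edge bridge) are assumptions chosen for illustration; they do not come from the paper.

```python
from itertools import product


def is_connected(edges, s, t):
    """Return True if node t is reachable from node s using the given edges."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
        adjacency.setdefault(v, []).append(u)
    seen = {s}
    stack = [s]
    while stack:
        node = stack.pop()
        if node == t:
            return True
        for neighbor in adjacency.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return False


def two_terminal_reliability(edges, s, t, p):
    """Probability that s and t are connected when each edge works
    independently with probability p (nodes never fail).

    Brute force over all 2**len(edges) edge states, so this is only
    practical for small illustrative networks.
    """
    total = 0.0
    for state in product((False, True), repeat=len(edges)):
        working = [e for e, up in zip(edges, state) if up]
        if is_connected(working, s, t):
            k = len(working)
            total += p ** k * (1 - p) ** (len(edges) - k)
    return total


# Hypothetical example networks (not taken from the paper).
series = [(0, 1), (1, 2)]                          # two edges in series
bridge = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]  # five-edge bridge

r_series = two_terminal_reliability(series, 0, 2, 0.9)
r_bridge = two_terminal_reliability(bridge, 0, 3, 0.5)
```

For two edges in series the result equals p squared, and for the bridge network it equals 2p^2 + 2p^3 - 5p^4 + 2p^5. Exhaustive enumeration grows exponentially with the number of edges, so larger networks would need a different approach.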