Why Reliability for Computing Needs Rethinking
Valeriu Beiu, V. Dragoi, Roxana-Mariana Beiu
2020 International Conference on Rebooting Computing (ICRC), December 2020
DOI: 10.1109/ICRC2020.2020.00006
Citations: 5
Abstract
Offering high-quality services/products has been of paramount importance for both communications and computations. Early on, both of these were in dire need of practical designs for enhancing reliability. That is why John von Neumann proposed the first gate-level method (using redundancy to build reliable systems from unreliable components), while Edward F. Moore and Claude E. Shannon followed suit with the first device-level scheme. Moore and Shannon's prescient paper also established network reliability as a probabilistic model in which the nodes of the network were considered perfectly reliable, while the edges could fail independently with a certain probability. The fundamental problem was that of estimating the probability that (under given conditions) two (or more) nodes are connected, the solution being represented by the well-known reliability polynomial (of the network). This concept has been heavily used for communications, where big strides were made and applied to networks of roads, railways, power lines, fiber optics, phones, sensors, etc. For computations, the research community converged on the gate-level method proposed by von Neumann, while the device-level scheme crafted by Moore and Shannon, although very practical and detailed, did not inspire circuit designers and flew under the radar. That scheme was built on thought-provoking networks called hammocks, exhibiting regular brick-wall near-neighbor connections. Trying to do justice to computing networks in general (and hammocks in particular), this paper aims to highlight and clarify how reliable different types of networks are when they are intended for performing computations. For doing this, we will define quite a few novel cost functions which, together with established ones, will allow us to meticulously compare different types of networks for a clearer understanding of the reliability enhancements they are able to bring to computations.
To our knowledge, this is the first ever ranking of networks with respect to computing reliability. The main conclusion is that a rethinking/rebooting of how we should design reliable computing systems, immediately applicable to networks/arrays of devices (e.g., transistors or qubits), is both timely and needed.
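The two-terminal reliability polynomial mentioned in the abstract can be made concrete with a small sketch. The following Python code (illustrative only; function names are hypothetical, and brute-force subset enumeration is feasible only for very small networks, not the method used in the paper) counts, for each k, how many k-edge subsets keep a source node s connected to a terminal node t. If each edge works independently with probability p, the reliability polynomial is then R(p) = sum over k of N_k * p^k * (1-p)^(m-k).

```python
from itertools import combinations

def connected(num_nodes, edges, s, t):
    """Return True if s and t are connected using only the given edges."""
    adj = {v: [] for v in range(num_nodes)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return t in seen

def reliability_polynomial_coeffs(num_nodes, edges, s, t):
    """Brute-force the coefficients N_k of the two-terminal reliability
    polynomial: N_k = number of k-edge subsets that connect s to t,
    so R(p) = sum_k N_k * p**k * (1 - p)**(m - k)."""
    m = len(edges)
    counts = [0] * (m + 1)
    for k in range(m + 1):
        for subset in combinations(edges, k):
            if connected(num_nodes, subset, s, t):
                counts[k] += 1
    return counts
```

As a sanity check: two parallel edges between nodes 0 and 1 give coefficients [0, 2, 1], i.e., R(p) = 2p(1-p) + p^2 = 1 - (1-p)^2, while two edges in series (0-1-2) give [0, 0, 1], i.e., R(p) = p^2 — the familiar parallel/series reliability formulas.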