Title: Novel Reliability Indicators From the Perspective of Data Center Networks
Authors: Hongbin Zhuang; Xiao-Yan Li; Cheng-Kuan Lin; Ximeng Liu; Xiaohua Jia
Journal: IEEE Transactions on Reliability, vol. 74, no. 1, pp. 2459-2472 (Impact Factor 5.7; JCR Q1, Computer Science, Hardware & Architecture)
DOI: 10.1109/TR.2024.3393133
Publication date: 2024-03-09
URL: https://ieeexplore.ieee.org/document/10527394/
Citations: 0
Abstract
Modern large-scale computing systems demand better connectivity indicators for reliability evaluation. However, as more processing units are rapidly incorporated into emerging computing systems, existing indicators (e.g., $\ell$-component edge connectivity and $\ell$-extra edge connectivity) increasingly fail to provide the required fault tolerance. In addition, these indicators require, for example, that the faulty network have at least $\ell$ components (or that each component have at least $\ell$ nodes). These fault assumptions are not flexible enough to handle the diverse structural demands that arise in practice. To address both challenges, this article proposes two novel indicators for network reliability based on the partition matroid technique: matroidal connectivity and conditional matroidal connectivity. We first determine the exact values of the (conditional) matroidal connectivity of the $k$-ary $n$-cube $Q_{n}^{k}$, an appealing underlying topology for modern parallel computing systems. Moreover, we propose an $O(k^{n-1})$ algorithm for determining the structural features of minimum edge sets whose cardinality equals the conditional matroidal connectivity of $Q_{n}^{k}$. Simulation results are presented to verify the algorithm's correctness and to further investigate the distribution pattern of edge sets subject to the partition matroid restriction. Comparative analyses show that our results offer better edge fault tolerance than prior work, with an exponential improvement when $k\geq 4$.
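To make the objects in the abstract concrete, the following minimal Python sketch builds the edge set of $Q_{n}^{k}$, tests whether a faulty edge set is independent in a partition matroid, and checks whether the remaining network is still connected. It is an illustration under stated assumptions, not the authors' algorithm: the abstract does not spell out the partition, so the sketch assumes the edges are partitioned by dimension, and the names `kary_ncube_edges`, `respects_partition`, `is_connected`, and the per-dimension limits `caps` are all hypothetical.

```python
from collections import deque
from itertools import product


def kary_ncube_edges(k, n):
    """Edges of Q_n^k as (u, v, dim): u and v differ by 1 (mod k) in coordinate dim."""
    edges = set()
    for u in product(range(k), repeat=n):
        for dim in range(n):
            v = u[:dim] + ((u[dim] + 1) % k,) + u[dim + 1:]
            if u != v:
                # Store endpoints in sorted order so each edge appears once.
                edges.add((min(u, v), max(u, v), dim))
    return edges


def respects_partition(faults, caps):
    """Partition-matroid independence test.

    Assumption: edges are grouped by dimension, and at most caps[dim]
    faulty edges are allowed in the block of dimension dim.
    """
    counts = [0] * len(caps)
    for _, _, dim in faults:
        counts[dim] += 1
    return all(count <= cap for count, cap in zip(counts, caps))


def is_connected(k, n, faults):
    """Breadth-first search over Q_n^k with the faulty edges removed."""
    removed = {(u, v) for u, v, _ in faults}
    adjacency = {}
    for u, v, _ in kary_ncube_edges(k, n):
        if (u, v) not in removed:
            adjacency.setdefault(u, []).append(v)
            adjacency.setdefault(v, []).append(u)
    start = (0,) * n
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return len(seen) == k ** n


if __name__ == "__main__":
    edges = kary_ncube_edges(3, 2)
    # Fault every edge incident to node (0, 0): two edges in each dimension.
    faults = {e for e in edges if (0, 0) in e[:2]}
    print(respects_partition(faults, [2, 2]))
    print(is_connected(3, 2, faults))
```

In this toy case ($k=3$, $n=2$) the four edges incident to one node form a disconnecting set with two faulty edges in each dimension. Tightening either entry of `caps` below 2 makes that set inadmissible, which shows how per-block limits constrain the fault patterns that are considered.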
About the journal:
IEEE Transactions on Reliability is a refereed journal for reliability and allied disciplines including, but not limited to, maintainability, physics of failure, life testing, prognostics, design and manufacture for reliability, reliability for systems of systems, network availability, mission success, warranty, safety, and various measures of effectiveness. Topics eligible for publication range from hardware to software, from materials to systems, from consumer and industrial devices to manufacturing plants, from individual items to networks, from techniques for making things better to ways of predicting and measuring behavior in the field. As an engineering subject that supports new and existing technologies, we constantly expand into new areas of the assurance sciences.