Fault-tolerant and energy-efficient MCSoC for information processing and control

Q3 Mathematics
A. Gruzlikov, N. Kolesov, D. Kostygov, M. Tolmacheva
{"title":"Fault-tolerant and energy-efficient MCSoC for information processing and control","authors":"A. Gruzlikov, N. Kolesov, D. Kostygov, M. Tolmacheva","doi":"10.31799/1684-8853-2019-4-9-18","DOIUrl":null,"url":null,"abstract":"Introduction: The majority of real complex systems are designed with respect to fault tolerance requirements. However, all theknown approaches are intended only to increase reliability. Purpose: An approach for designing fault-tolerant systems on a chip, aimednot only at increasing the reliability, but also at reducing the energy consumed by the system. Results: A two-stage approach to thedesign of fault-tolerant multicore systems-on-chip (MCSoCs) is proposed. At the first stage, an energy-efficient architecture of thedesigned system is formed. For each core used in the system, the optimal number of additional cores is determined within the frameworkof the imposed restrictions. The optimality criterion is the minimum power consumed by the system. The algorithm proposed for theformation of an energy-efficient architecture is based on the dependence of the power consumed in the system on the values of the supplyvoltage and the clock frequency. At the second stage, a procedure for diagnosing and repairing the system is developed which uses theprinciples of system-level diagnosis, involving mutual checks between the system cores. This procedure allows you to decentralize theprocess of diagnosing and restoring the system after a failure. Additionally, the article examines the organization of the communicationsubsystem based on shared memory. The study is based on a simulation conducted in order to estimate the time for making a decisionabout a failure in systems such as a lattice, torus and hypercube. Practical relevance: The proposed approach allows a system to providethe necessary values for its two most important characteristics: fault tolerance and energy efficiency. At the same time, decentralizationis ensured when making decisions about a failure and restoration. As a result, the system becomes more reliable.","PeriodicalId":36977,"journal":{"name":"Informatsionno-Upravliaiushchie Sistemy","volume":" ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2019-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Informatsionno-Upravliaiushchie Sistemy","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.31799/1684-8853-2019-4-9-18","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Mathematics","Score":null,"Total":0}
引用次数: 0

Abstract

Introduction: The majority of real complex systems are designed with respect to fault tolerance requirements. However, all theknown approaches are intended only to increase reliability. Purpose: An approach for designing fault-tolerant systems on a chip, aimednot only at increasing the reliability, but also at reducing the energy consumed by the system. Results: A two-stage approach to thedesign of fault-tolerant multicore systems-on-chip (MCSoCs) is proposed. At the first stage, an energy-efficient architecture of thedesigned system is formed. For each core used in the system, the optimal number of additional cores is determined within the frameworkof the imposed restrictions. The optimality criterion is the minimum power consumed by the system. The algorithm proposed for theformation of an energy-efficient architecture is based on the dependence of the power consumed in the system on the values of the supplyvoltage and the clock frequency. At the second stage, a procedure for diagnosing and repairing the system is developed which uses theprinciples of system-level diagnosis, involving mutual checks between the system cores. This procedure allows you to decentralize theprocess of diagnosing and restoring the system after a failure. Additionally, the article examines the organization of the communicationsubsystem based on shared memory. The study is based on a simulation conducted in order to estimate the time for making a decisionabout a failure in systems such as a lattice, torus and hypercube. Practical relevance: The proposed approach allows a system to providethe necessary values for its two most important characteristics: fault tolerance and energy efficiency. At the same time, decentralizationis ensured when making decisions about a failure and restoration. As a result, the system becomes more reliable.
用于信息处理和控制的容错节能MCSoC
简介:大多数真实的复杂系统都是根据容错需求设计的。然而,所有已知的方法都只是为了提高可靠性。目的:一种在芯片上设计容错系统的方法,目的不仅是为了提高可靠性,而且是为了减少系统的能量消耗。结果:提出了一种设计容错多核片上系统(mcsoc)的两阶段方法。在第一阶段,形成设计系统的节能架构。对于系统中使用的每个核心,在强制限制的框架内确定最佳的附加核心数量。最优准则是系统消耗的最小功率。提出了一种基于系统功耗与电源电压和时钟频率的依赖关系的节能架构形成算法。在第二阶段,使用系统级诊断的原则,开发了一个诊断和修复系统的程序,包括系统核心之间的相互检查。此过程允许您在故障后分散诊断和恢复系统的过程。此外,本文还研究了基于共享内存的通信子系统的组织结构。这项研究基于一项模拟,目的是估计在晶格、环面和超立方体等系统中对故障做出决策所需的时间。实际意义:提出的方法允许系统为其两个最重要的特性提供必要的值:容错性和能源效率。与此同时,在做出有关故障和恢复的决策时,可以确保分散化。因此,系统变得更加可靠。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
Informatsionno-Upravliaiushchie Sistemy
Informatsionno-Upravliaiushchie Sistemy Mathematics-Control and Optimization
CiteScore
1.40
自引率
0.00%
发文量
35
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信