{"title":"Reliability-aware and energy-efficient synthesis of NoC based MPSoCs","authors":"Yong Zou, S. Pasricha","doi":"10.1109/ISQED.2013.6523678","DOIUrl":null,"url":null,"abstract":"In sub-65nm CMOS process technologies, networks-on-chip (NoC) are increasingly susceptible to transient faults (i.e., soft errors). To achieve fault tolerance, Triple Modular Redundancy (TMR) and Hamming Error Correction Codes (HECC) are often employed by designers to protect buffers used in NoC components. However, these mechanisms to achieve fault resilience introduce power dissipation overheads that can disrupt stringent chip power budgets and thermal constraints. In this paper, we propose a novel design-time framework (RESYN) to trade-off energy consumption and reliability in the NoC fabric at the system level for MPSoCs. RESYN employs a nested evolutionary algorithm approach to guide the mapping of cores on a die, and opportunistically determine locations to insert fault tolerance mechanisms in the NoC to minimize energy while satisfying reliability constraints. Our experimental results show that RESYN can reduce energy costs by 14.5% on average compared to a fully protected NoC, while still maintaining more than a 90% fault tolerance. If higher levels of reliability are desired, RESYN can generate a Pareto set of solutions allowing designers to select the most energy-efficient solution for any reliability goal. Given the increasing importance of reliability in the nanometer era for MPSoCs, this work provides important perspectives that can guide the reduction of overheads of reliable NoC design.","PeriodicalId":127115,"journal":{"name":"International Symposium on Quality Electronic Design (ISQED)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Symposium on Quality Electronic Design (ISQED)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISQED.2013.6523678","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 10
Abstract
In sub-65nm CMOS process technologies, networks-on-chip (NoC) are increasingly susceptible to transient faults (i.e., soft errors). To achieve fault tolerance, Triple Modular Redundancy (TMR) and Hamming Error Correction Codes (HECC) are often employed by designers to protect buffers used in NoC components. However, these mechanisms to achieve fault resilience introduce power dissipation overheads that can disrupt stringent chip power budgets and thermal constraints. In this paper, we propose a novel design-time framework (RESYN) to trade-off energy consumption and reliability in the NoC fabric at the system level for MPSoCs. RESYN employs a nested evolutionary algorithm approach to guide the mapping of cores on a die, and opportunistically determine locations to insert fault tolerance mechanisms in the NoC to minimize energy while satisfying reliability constraints. Our experimental results show that RESYN can reduce energy costs by 14.5% on average compared to a fully protected NoC, while still maintaining more than a 90% fault tolerance. If higher levels of reliability are desired, RESYN can generate a Pareto set of solutions allowing designers to select the most energy-efficient solution for any reliability goal. Given the increasing importance of reliability in the nanometer era for MPSoCs, this work provides important perspectives that can guide the reduction of overheads of reliable NoC design.