Dynamic Lifetime Reliability Management for Chip Multiprocessors
Milad Ghorbani Moghaddam; Cristinel Ababei
IEEE Transactions on Multi-Scale Computing Systems, vol. 4, no. 4, pp. 952-958. Published 2018-09-16.
DOI: 10.1109/TMSCS.2018.2870187
https://ieeexplore.ieee.org/document/8466668/
Citation count: 3
Abstract
We introduce an algorithm for dynamic lifetime reliability optimization of chip multiprocessors (CMPs). The proposed dynamic reliability management (DRM) algorithm combines thread migration and dynamic voltage and frequency scaling (DVFS) as its two primary techniques for changing CMP operation. The goal is to raise the lifetime reliability of the overall system to a desired target with minimal performance degradation. We test the proposed algorithm with a variety of benchmarks on 16-core and 64-core network-on-chip (NoC) based CMP architectures. Full-system simulations using a customized GEM5 simulator demonstrate that lifetime reliability can be improved by 100 percent for an average performance penalty of 7.7 and 8.7 percent for the two CMP architectures.
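
The abstract does not spell out the DRM algorithm itself, so the following is only a toy sketch of how thread migration and DVFS might be combined in a single control step. It is not the authors' method. The names `Core`, `DVFS_LEVELS`, and `drm_step`, the operating points, and the policy of migrating first and scaling down second are all assumptions made for illustration. Per-core lifetime estimates (`mttf_years`) are assumed to come from some external reliability model.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical DVFS operating points as (voltage in V, frequency in GHz).
# These values are illustrative and do not come from the paper.
DVFS_LEVELS = [(0.8, 1.0), (0.9, 1.5), (1.0, 2.0)]


@dataclass
class Core:
    core_id: int
    level: int                    # index into DVFS_LEVELS
    mttf_years: float             # lifetime estimate from an assumed external model
    thread: Optional[str] = None  # name of the thread running here, if any


def drm_step(cores: List[Core], target_mttf: float) -> str:
    """Apply one illustrative management action and return its name.

    Assumed policy: find the busy core whose estimated lifetime is furthest
    below the target. Migrate its thread to an idle core that meets the
    target if one exists; otherwise lower that core's DVFS level by one.
    """
    stressed = [c for c in cores
                if c.thread is not None and c.mttf_years < target_mttf]
    if not stressed:
        return "none"
    worst = min(stressed, key=lambda c: c.mttf_years)

    idle = [c for c in cores
            if c.thread is None and c.mttf_years >= target_mttf]
    if idle:
        dest = max(idle, key=lambda c: c.mttf_years)
        dest.thread, worst.thread = worst.thread, None
        return "migrate"

    if worst.level > 0:
        worst.level -= 1  # lower voltage and frequency to reduce stress
        return "dvfs"
    return "none"
```

In this sketch migration is tried first because it moves the thread without slowing it down, and DVFS is the fallback because it trades performance for reduced stress. A real controller would also re-estimate lifetime after each action and account for migration overhead on the NoC, both of which are omitted here.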