F. Kriebel, Arun K. Subramaniyan, Semeen Rehman, Segnon Jean Bruno Ahandagbe, M. Shafique, J. Henkel
R2Cache: Reliability-aware reconfigurable last-level cache architecture for multi-cores
2015 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), 2015-10-04
DOI: https://doi.org/10.1109/CODESISSS.2015.7331362
Citations: 8
Abstract
On-chip last-level caches in multicore systems are among the components most vulnerable to soft errors. However, this vulnerability depends strongly on the parameters and configuration of the last-level cache, especially when different applications are executing. In a reconfigurable cache architecture, the cache parameters can therefore be adapted at run-time to improve reliability against soft errors. In this paper we propose a novel reliability-aware reconfigurable last-level cache architecture (R2Cache) for multicore systems. It provides reliability-wise efficient cache configurations (i.e., cache parameter selection and cache partitioning) for different concurrently executing applications under user-provided tolerable performance overheads. To enable run-time adaptation, we also introduce a lightweight online vulnerability predictor that uses performance metrics, such as the number of L2 misses, to accurately estimate the cache vulnerability to soft errors. Based on the predicted vulnerabilities of the concurrently executing applications in the current execution epoch, our run-time reliability manager reconfigures the cache such that the total vulnerability of all concurrently executing applications is minimized for the next execution epoch. In scenarios where single-bit error correction for cache lines can be afforded, vulnerability-aware reconfigurations can be leveraged to increase the reliability of the last-level cache against multi-bit errors. Compared to the state of the art, the proposed architecture provides 24% vulnerability savings when averaged across numerous experiments, while reducing the vulnerability by more than 60% for selected applications and application phases.
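The per-epoch selection step described in the abstract can be illustrated with a minimal sketch. It is not the R2Cache implementation: the function name `choose_partition`, the table layout, the exhaustive search, and all numbers are assumptions made for illustration. Each application is assumed to supply, for every candidate number of cache ways, a predicted vulnerability and a relative performance overhead; the sketch then picks the partitioning with the lowest total predicted vulnerability among those that fit in the cache and stay within the tolerable overhead.

```python
from itertools import product


def choose_partition(candidates, total_ways, max_overhead):
    """Pick a way allocation minimizing total predicted vulnerability.

    candidates: one dict per application, mapping a number of ways to a
        (predicted_vulnerability, performance_overhead) pair. These values
        are hypothetical stand-ins for the output of an online predictor.
    total_ways: number of ways available in the shared cache.
    max_overhead: user-provided tolerable overhead per application.

    Returns (allocation, total_vulnerability), or None if no allocation
    satisfies both constraints. Exhaustive search, for illustration only.
    """
    best = None
    for alloc in product(*(sorted(c) for c in candidates)):
        if sum(alloc) > total_ways:
            continue  # partition does not fit in the cache
        entries = [c[w] for c, w in zip(candidates, alloc)]
        if any(overhead > max_overhead for _, overhead in entries):
            continue  # some application exceeds the tolerable overhead
        total = sum(vuln for vuln, _ in entries)
        if best is None or total < best[1]:
            best = (alloc, total)
    return best


# Hypothetical predictions for two applications in the current epoch.
app_a = {2: (40.0, 0.08), 4: (55.0, 0.03), 6: (70.0, 0.01)}
app_b = {2: (30.0, 0.12), 4: (35.0, 0.04), 6: (50.0, 0.02)}

print(choose_partition([app_a, app_b], total_ways=8, max_overhead=0.05))
print(choose_partition([app_a, app_b], total_ways=8, max_overhead=0.10))
```

With these made-up numbers, relaxing the tolerable overhead from 5% to 10% lets the first application run with fewer ways, which lowers the total predicted vulnerability. A real run-time manager would need something cheaper than exhaustive enumeration; the brute-force loop is used here only to keep the sketch short.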