Authors: Farrukh Hijaz, Qingchuan Shi, O. Khan
Venue: 2013 IEEE 31st International Conference on Computer Design (ICCD)
Published: 2013-11-07
DOI: 10.1109/ICCD.2013.6657029 (https://doi.org/10.1109/ICCD.2013.6657029)
A private level-1 cache architecture to exploit the latency and capacity tradeoffs in multicores operating at near-threshold voltages
Near-threshold voltage (NTV) operation is expected to improve energy efficiency by up to 10× in future processors. However, reliable operation below a minimum voltage (Vccmin) cannot be guaranteed. Specifically, SRAM bit-cell error rates are expected to rise steeply, since bit-cell margins can easily be violated at near-threshold voltages. Multicore processors rely on fast private L1 caches to exploit data locality and achieve high performance. In the presence of high bit-cell error rates, an L1 cache can either sacrifice capacity or incur additional latency to correct the errors. We observe that the sensitivity of the L1 cache to hit latency offers a design tradeoff between capacity and latency. When the error rate is high at extreme Vccmin, it is worthwhile to incur additional latency to recover and use the additional L1 cache capacity. However, at low error rates, the constant additional latency needed to recover cache capacity degrades performance. With this tradeoff in mind, we propose a novel private L1 cache architecture that dynamically learns and adapts by either recovering cache capacity at the cost of additional latency overhead, or operating at lower capacity while retaining the benefits of optimal hit latency. Using simulations of a 64-core multicore, we demonstrate that our adaptive L1 cache architecture performs better than both individual schemes at low and high error rates (i.e., various NTV conditions).
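The capacity-versus-latency tradeoff described above can be illustrated with a simple average memory access time (AMAT) comparison. The sketch below is not the mechanism proposed in the paper, whose learning policy the abstract does not detail. It assumes a standard AMAT model, and the names `ModeStats`, `amat`, and `choose_mode`, along with every number in the usage note, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModeStats:
    """Behaviour of one L1 operating mode (all values are hypothetical)."""
    hit_latency: float   # cycles per L1 hit
    miss_rate: float     # fraction of accesses that miss in L1
    miss_penalty: float  # cycles to service an L1 miss


def amat(stats: ModeStats) -> float:
    """Average memory access time: hit latency plus expected miss cost."""
    return stats.hit_latency + stats.miss_rate * stats.miss_penalty


def choose_mode(low_capacity: ModeStats, recovered: ModeStats) -> str:
    """Pick the mode with the lower estimated AMAT.

    Ties favour the low-capacity mode, which keeps the shorter hit latency.
    """
    if amat(recovered) < amat(low_capacity):
        return "recover-capacity"
    return "low-capacity"
```

As a usage example under assumed numbers: with few faulty cells, disabling them costs little capacity, so `choose_mode(ModeStats(1, 0.05, 20), ModeStats(2, 0.04, 20))` selects the low-capacity mode because the extra correction cycle outweighs the small drop in miss rate. With many faulty cells, `choose_mode(ModeStats(1, 0.20, 20), ModeStats(2, 0.05, 20))` selects capacity recovery because the avoided misses outweigh the longer hit latency.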