最后一级缓存的嵌入式方式预测

2012 IEEE 30th International Conference on Computer Design (ICCD) Pub Date : 2012-09-30 DOI:10.1109/ICCD.2012.6378636

Faissal M. Sleiman, R. Dreslinski, T. Wenisch

{"title":"最后一级缓存的嵌入式方式预测","authors":"Faissal M. Sleiman, R. Dreslinski, T. Wenisch","doi":"10.1109/ICCD.2012.6378636","DOIUrl":null,"url":null,"abstract":"This paper investigates Embedded Way Prediction for large last-level caches (LLCs): an architecture and circuit design to provide the latency of parallel tag-data access at substantial energy savings. Existing way prediction approaches for L1 caches are compromised by the high associativity and filtered temporal locality of LLCs. We demonstrate: (1) the need for wide partial tag comparison, which we implement with a dynamic CAM alongside the data sub-array wordline decode, and (2) the inhibit bit, an architectural innovation to provide accurate predictions when the partial tag comparison is inconclusive. We present circuit critical-path and architectural power/performance studies demonstrating speedups of up to 15.4% (6.6% average) for scientific and server applications, matching the performance of parallel tag-data access while reducing energy overhead by 40%.","PeriodicalId":313428,"journal":{"name":"2012 IEEE 30th International Conference on Computer Design (ICCD)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"14","resultStr":"{\"title\":\"Embedded way prediction for last-level caches\",\"authors\":\"Faissal M. Sleiman, R. Dreslinski, T. Wenisch\",\"doi\":\"10.1109/ICCD.2012.6378636\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper investigates Embedded Way Prediction for large last-level caches (LLCs): an architecture and circuit design to provide the latency of parallel tag-data access at substantial energy savings. Existing way prediction approaches for L1 caches are compromised by the high associativity and filtered temporal locality of LLCs. We demonstrate: (1) the need for wide partial tag comparison, which we implement with a dynamic CAM alongside the data sub-array wordline decode, and (2) the inhibit bit, an architectural innovation to provide accurate predictions when the partial tag comparison is inconclusive. We present circuit critical-path and architectural power/performance studies demonstrating speedups of up to 15.4% (6.6% average) for scientific and server applications, matching the performance of parallel tag-data access while reducing energy overhead by 40%.\",\"PeriodicalId\":313428,\"journal\":{\"name\":\"2012 IEEE 30th International Conference on Computer Design (ICCD)\",\"volume\":\"20 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2012-09-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"14\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2012 IEEE 30th International Conference on Computer Design (ICCD)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICCD.2012.6378636\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2012 IEEE 30th International Conference on Computer Design (ICCD)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCD.2012.6378636","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 14

摘要

本文研究了大型最后一级缓存(llc)的嵌入式方式预测:一种架构和电路设计，可以在节省大量能源的情况下提供并行标签数据访问的延迟。现有的L1缓存预测方法受到llc的高关联性和过滤时间局部性的影响。我们证明:(1)需要广泛的部分标签比较，我们使用动态CAM与数据子阵列字行解码一起实现，以及(2)抑制位，这是一种架构创新，可以在部分标签比较不确定时提供准确的预测。我们提出的电路关键路径和架构功率/性能研究表明，对于科学和服务器应用，加速高达15.4%(平均6.6%)，与并行标签数据访问的性能相匹配，同时减少了40%的能源开销。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Embedded way prediction for last-level caches

This paper investigates Embedded Way Prediction for large last-level caches (LLCs): an architecture and circuit design to provide the latency of parallel tag-data access at substantial energy savings. Existing way prediction approaches for L1 caches are compromised by the high associativity and filtered temporal locality of LLCs. We demonstrate: (1) the need for wide partial tag comparison, which we implement with a dynamic CAM alongside the data sub-array wordline decode, and (2) the inhibit bit, an architectural innovation to provide accurate predictions when the partial tag comparison is inconclusive. We present circuit critical-path and architectural power/performance studies demonstrating speedups of up to 15.4% (6.6% average) for scientific and server applications, matching the performance of parallel tag-data access while reducing energy overhead by 40%.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2012 IEEE 30th International Conference on Computer Design (ICCD)

自引率

0.00%

发文量