通过缓存负载/存储队列减少数据缓存能耗

Proceedings of the 2003 International Symposium on Low Power Electronics and Design, 2003. ISLPED '03. Pub Date : 2003-08-25 DOI:10.1145/871506.871569

D. Nicolaescu, A. Veidenbaum, A. Nicolau

{"title":"通过缓存负载/存储队列减少数据缓存能耗","authors":"D. Nicolaescu, A. Veidenbaum, A. Nicolau","doi":"10.1145/871506.871569","DOIUrl":null,"url":null,"abstract":"High-performance processors use a large set-associative L1 data cache with multiple ports. As clock speeds and size increase such a cache consumes a significant percentage of the total processor energy. This paper proposes a method of saving energy by reducing the number of data cache accesses. It does so by modifying the load/store queue design to allow \"caching\" of previously accessed data values on both loads and stores after the corresponding memory access instruction has been committed. It is shown that a 32 entry modified LSQ design allows an average of 38.5% of the loads in the SpecINT95 benchmarks and 18.9% in the SpecFP95 benchmarks to get their data from the LSQ. The reduction in the number of Ll cache accesses results in up to a 40% reduction in the L1 data cache energy consumption and in an up to a 16% improvement in the energy-delay product while requiring almost no additional hardware or complex control logic.","PeriodicalId":355883,"journal":{"name":"Proceedings of the 2003 International Symposium on Low Power Electronics and Design, 2003. ISLPED '03.","volume":"37 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2003-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"34","resultStr":"{\"title\":\"Reducing data cache energy consumption via cached load/store queue\",\"authors\":\"D. Nicolaescu, A. Veidenbaum, A. Nicolau\",\"doi\":\"10.1145/871506.871569\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"High-performance processors use a large set-associative L1 data cache with multiple ports. As clock speeds and size increase such a cache consumes a significant percentage of the total processor energy. This paper proposes a method of saving energy by reducing the number of data cache accesses. It does so by modifying the load/store queue design to allow \\\"caching\\\" of previously accessed data values on both loads and stores after the corresponding memory access instruction has been committed. It is shown that a 32 entry modified LSQ design allows an average of 38.5% of the loads in the SpecINT95 benchmarks and 18.9% in the SpecFP95 benchmarks to get their data from the LSQ. The reduction in the number of Ll cache accesses results in up to a 40% reduction in the L1 data cache energy consumption and in an up to a 16% improvement in the energy-delay product while requiring almost no additional hardware or complex control logic.\",\"PeriodicalId\":355883,\"journal\":{\"name\":\"Proceedings of the 2003 International Symposium on Low Power Electronics and Design, 2003. ISLPED '03.\",\"volume\":\"37 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2003-08-25\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"34\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2003 International Symposium on Low Power Electronics and Design, 2003. ISLPED '03.\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/871506.871569\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2003 International Symposium on Low Power Electronics and Design, 2003. ISLPED '03.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/871506.871569","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 34

摘要

高性能处理器使用具有多个端口的大型集关联L1数据缓存。随着时钟速度和大小的增加，这样的缓存会消耗处理器总能量的很大一部分。本文提出了一种通过减少数据缓存访问次数来节约能源的方法。它通过修改加载/存储队列设计来实现这一点，以便在提交相应的内存访问指令后，允许在加载和存储上“缓存”先前访问的数据值。结果表明，32项修改后的LSQ设计允许SpecINT95基准测试中平均38.5%的负载和SpecFP95基准测试中18.9%的负载从LSQ获取数据。l缓存访问次数的减少导致L1数据缓存能耗降低40%，能量延迟产品提高16%，同时几乎不需要额外的硬件或复杂的控制逻辑。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Reducing data cache energy consumption via cached load/store queue

High-performance processors use a large set-associative L1 data cache with multiple ports. As clock speeds and size increase such a cache consumes a significant percentage of the total processor energy. This paper proposes a method of saving energy by reducing the number of data cache accesses. It does so by modifying the load/store queue design to allow "caching" of previously accessed data values on both loads and stores after the corresponding memory access instruction has been committed. It is shown that a 32 entry modified LSQ design allows an average of 38.5% of the loads in the SpecINT95 benchmarks and 18.9% in the SpecFP95 benchmarks to get their data from the LSQ. The reduction in the number of Ll cache accesses results in up to a 40% reduction in the L1 data cache energy consumption and in an up to a 16% improvement in the energy-delay product while requiring almost no additional hardware or complex control logic.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings of the 2003 International Symposium on Low Power Electronics and Design, 2003. ISLPED '03.

自引率

0.00%

发文量