Program Phase Directed Dynamic Cache Way Reconfiguration for Power Efficiency

Subhasish Banerjee, G. Surendra, S. Nandy
{"title":"Program Phase Directed Dynamic Cache Way Reconfiguration for Power Efficiency","authors":"Subhasish Banerjee, G. Surendra, S. Nandy","doi":"10.1109/ASPDAC.2007.358101","DOIUrl":null,"url":null,"abstract":"Aggressive superscalar processor with deep pipeline and sophisticated speculative execution techniques is pushing the power budget to its limit. It is found that a significant portion of this power is wasted during wrong path execution and non power optimal allocation of power hungry resources. Dynamic reconfiguration of micro-architectural resources can be exploited to bring down this waste at runtime. Lack of architectural method to capture the behavior of a program at runtime makes dynamic reconfiguration a challenge. In this paper we propose a method to characterize program behavior at runtime using conflict miss pattern of a data cache, which in turn identifies different program phases in terms of cache utilization. We use this phase information to enable/disable cache ways dynamically depending on the conflict miss pattern of a program. Using a hardware tracking mechanism we ensure that the program performance (throughput in terms of IPC) does not degrade beyond a tolerable limit. Through simulation we establish that an average improvement of 32% (best case 38%) in cache power saving is achieved at the expense of less than 2% degradation in performance for SPEC-CPU and MEDIA benchmarks. The additional hardware that detects and captures the phase information is outside the critical path of the processor and does not contribute to the overall delay.","PeriodicalId":362373,"journal":{"name":"2007 Asia and South Pacific Design Automation Conference","volume":"3 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2007-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2007 Asia and South Pacific Design Automation Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ASPDAC.2007.358101","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 10

Abstract

Aggressive superscalar processor with deep pipeline and sophisticated speculative execution techniques is pushing the power budget to its limit. It is found that a significant portion of this power is wasted during wrong path execution and non power optimal allocation of power hungry resources. Dynamic reconfiguration of micro-architectural resources can be exploited to bring down this waste at runtime. Lack of architectural method to capture the behavior of a program at runtime makes dynamic reconfiguration a challenge. In this paper we propose a method to characterize program behavior at runtime using conflict miss pattern of a data cache, which in turn identifies different program phases in terms of cache utilization. We use this phase information to enable/disable cache ways dynamically depending on the conflict miss pattern of a program. Using a hardware tracking mechanism we ensure that the program performance (throughput in terms of IPC) does not degrade beyond a tolerable limit. Through simulation we establish that an average improvement of 32% (best case 38%) in cache power saving is achieved at the expense of less than 2% degradation in performance for SPEC-CPU and MEDIA benchmarks. The additional hardware that detects and captures the phase information is outside the critical path of the processor and does not contribute to the overall delay.
程序阶段导向的动态缓存方式重新配置的功率效率
具有深度管道和复杂的推测执行技术的激进超标量处理器正在将功率预算推向极限。研究发现,在错误路径执行和耗电资源的非功率最优分配过程中,有相当一部分功率被浪费掉了。可以利用微架构资源的动态重新配置来减少运行时的这种浪费。缺乏在运行时捕获程序行为的体系结构方法使得动态重新配置成为一项挑战。在本文中,我们提出了一种在运行时使用数据缓存的冲突缺失模式来表征程序行为的方法,该方法反过来根据缓存利用率识别不同的程序阶段。我们使用这个阶段信息来根据程序的冲突缺失模式动态地启用/禁用缓存方式。使用硬件跟踪机制,我们确保程序性能(IPC方面的吞吐量)不会超过可容忍的限制。通过模拟,我们确定在SPEC-CPU和MEDIA基准测试中,以不到2%的性能下降为代价,实现了缓存省电32%(最佳情况下为38%)的平均改进。检测和捕获相位信息的附加硬件位于处理器的关键路径之外,不会导致总体延迟。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信