Width-Aware Fine-Grained Dynamic Supply Gating: A Design Methodology for Low-Power Datapath and Memory

2012 25th International Conference on VLSI Design. Pub Date: 2012-01-07. DOI: 10.1109/VLSID.2012.94

L. Wang, Somnath Paul, S. Bhunia


Citations: 2

Abstract

As leakage contributes a growing share of total active power, run-time leakage control techniques are becoming extremely important. Supply gating provides an effective, low-overhead, and technology-scalable approach to active leakage reduction through the well-known "stacking effect". However, conventional supply gating approaches are typically coarse-grained in both space and time: they are applied to large datapath or memory blocks, and only when an entire logic or memory block is idle for a sufficiently long period. This limits their applicability at run time. Fine-grained supply gating, on the other hand, is constrained primarily by large wake-up delay and wake-up power overheads. In this paper, we propose a novel fine-grained width-aware dynamic supply gating (WADSG) approach that reduces both active leakage and redundant switching power in the datapath and embedded memory (e.g., L1/L2 cache). The approach exploits the abundance of narrow-width (NW) operands in general-purpose and embedded applications to "supply-gate" the unused parts of integer execution units and memory blocks while they are in use. We introduce a novel levelized gating strategy that virtually eliminates the wake-up delay overhead. We apply the proposed WADSG approach to a superscalar processor. To reduce the wake-up power, we use a width-aware instruction issue policy. For the L1 and L2 caches, we store width information per way of the associative cache and supply-gate the most significant bits of the NW ways. We also propose a width-aware block allocation and replacement policy to maximize the number of NW ways. Simulation results for 45 nm technology with Spec2k benchmarks show major savings (34.5%) in total processor power (considering both switching and active leakage power) with no performance impact. As a by-product, the proposed scheme also improves the thermal profile of both the datapath and memory.
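The abstract does not state how narrow-width operands are identified, so the following is only a minimal sketch of one common convention: a value counts as narrow when its upper bits are a pure sign extension of its lower bits. The 32-bit word size, the 16-bit narrow threshold, and the function names `is_narrow`, `can_gate_upper`, and `gateable_fraction` are all assumptions made for illustration, not details taken from the paper.

```python
from typing import Iterable, Sequence


def is_narrow(value: int, word_bits: int = 32, narrow_bits: int = 16) -> bool:
    """Return True if `value`, viewed as a word_bits-wide two's-complement
    pattern, is a sign extension of its low `narrow_bits` bits.

    The 32/16 split is an assumed example, not the paper's parameter.
    """
    value &= (1 << word_bits) - 1
    # Keep the sign bit of the low part plus every bit above it.
    upper = value >> (narrow_bits - 1)
    all_ones = (1 << (word_bits - narrow_bits + 1)) - 1
    # Narrow if those bits are all 0 (small positive) or all 1 (small negative).
    return upper == 0 or upper == all_ones


def can_gate_upper(operands: Sequence[int], word_bits: int = 32,
                   narrow_bits: int = 16) -> bool:
    """Hypothetical gating condition: every source operand is narrow.

    This models operand detection only; it ignores results that overflow
    the narrow range, which a real design would have to handle.
    """
    return all(is_narrow(op, word_bits, narrow_bits) for op in operands)


def gateable_fraction(trace: Iterable[Sequence[int]]) -> float:
    """Fraction of operations in a trace whose operands are all narrow."""
    ops = list(trace)
    if not ops:
        return 0.0
    return sum(can_gate_upper(op) for op in ops) / len(ops)


if __name__ == "__main__":
    # Made-up operand pairs, used only to exercise the functions.
    sample = [(3, 4), (0x12345678, 1), (0xFFFFFFFF, 2)]
    print(gateable_fraction(sample))
```

Under these assumptions, the fraction reported by `gateable_fraction` is a rough proxy for how often the upper half of an integer unit could be a candidate for gating. It says nothing about wake-up cost, which the levelized gating strategy and the width-aware issue policy are meant to address.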

