Proceeding of the 13th international symposium on Low power electronics and design (ISLPED '08): latest papers

Reducing leakage power by accounting for temperature inversion dependence in dual-Vt synthesized circuits
A. Calimera, R. I. Bahar, E. Macii, M. Poncino
{"title":"Reducing leakage power by accounting for temperature inversion dependence in dual-Vt synthesized circuits","authors":"A. Calimera, R. I. Bahar, E. Macii, M. Poncino","doi":"10.1145/1393921.1393978","DOIUrl":"https://doi.org/10.1145/1393921.1393978","url":null,"abstract":"The effects of temperature on delay depend on several parameters, such as cell size, load, supply voltage, and threshold voltage. In particular, variations in Vth can yield a temperature inversion effect causing a decrease in cell delay as temperature increases. This phenomenon, besides affecting timing analysis of a design, has important and unforeseeable consequences on power optimization techniques. In this paper, we focus on the impact of such effects on multi-Vt design; in particular, we show how traditional dual-Vt optimization may yield timing errors in circuits by ignoring temperature effects. Moreover, we present a temperature-aware dual-Vt optimization technique that reduces leakage power and can guarantee that the circuit is timing feasible at the boundary temperatures provided by the technology library. Our experiments show an average 27% leakage reduction with respect to a non-temperature-aware design flow.","PeriodicalId":166672,"journal":{"name":"Proceeding of the 13th international symposium on Low power electronics and design (ISLPED '08)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125136896","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 17
SRAM methodology for yield and power efficiency: per-element selectable supplies and memory reconfiguration schemes
R. Kanj, R. Joshi, Zhuo Li, J. B. Kuang, H. Ngo, Nancy Y. Zhou, Weiping Shi, S. Nassif
{"title":"SRAM methodology for yield and power efficiency: per-element selectable supplies and memory reconfiguration schemes","authors":"R. Kanj, R. Joshi, Zhuo Li, J. B. Kuang, H. Ngo, Nancy Y. Zhou, Weiping Shi, S. Nassif","doi":"10.1145/1393921.1393946","DOIUrl":"https://doi.org/10.1145/1393921.1393946","url":null,"abstract":"We present a novel power-aware yield enhancement design methodology and reconfiguration scheme for deep submicron SRAM designs. We show that with the continued trend of raising array supply to counter process variations, it is more effective to use a per-element selectable virtual power-supply scenario as opposed to single array supply with traditional redundancy schemes. The element can be a bank, a sub-array, or an independent row/column, and the element's virtual supply value is determined based on fail bitmaps. The technique can also be used in conjunction with traditional redundancy schemes to further improve the efficiency. The supply and redundancy assignments can be obtained by relying on memory reconfiguration algorithms. For this, we propose a greedy yet accurate algorithm that runs in O(n log n), as opposed to traditional algorithms that run in O(n^2) in the average case. The methodology leads to significant power savings ranging from 20% to 50% for 65 nm technology. We expect the savings to increase in future technologies as leakage powers dominate. To the best of our knowledge, this is the first time such a methodology has been applied to SRAM designs.","PeriodicalId":166672,"journal":{"name":"Proceeding of the 13th international symposium on Low power electronics and design (ISLPED '08)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127620177","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
System implications of integrated photonics
N. Jouppi
{"title":"System implications of integrated photonics","authors":"N. Jouppi","doi":"10.1145/1393921.1393923","DOIUrl":"https://doi.org/10.1145/1393921.1393923","url":null,"abstract":"Micron-scale photonic devices integrated with standard CMOS processes have the potential to dramatically increase system bandwidths, performance, and configuration flexibility while reducing system power. I first describe some recent developments in silicon nanophotonic technology, such as microring resonators. Small devices have many advantages: reduced power, increased density, and increased speed. By integrating many thousands of these devices on a chip, photonics could potentially be used for most high-speed off-chip and global on-chip communication. Integrated photonics has many advantages at the board and rack scale as well. Recent high-speed board-level electrical signaling (>2.5 GHz) precludes the use of multi-drop busses or communication over long distances on ordinary inexpensive PC board materials. By using photonics, high-fan-out and high-fan-in bus structures can be built. Due to the low loss of optical signals versus distance, these structures can even be distributed over rack-scale distances. This dramatically increases system flexibility while reducing interconnect power. As an example of the potential impact of photonics, I describe a system architecture for the 2017 time frame we call Corona. Corona is a 3D many-core architecture that uses nanophotonic communication for both inter-core communication and off-stack communication to memory or I/O devices. Dense wavelength division multiplexed optically connected memory modules provide 10 terabytes per second memory bandwidth. A photonic crossbar fully interconnects its 256 low-power multithreaded cores at 20 terabytes per second bandwidth. We believe that in comparison with an electrically-connected many-core alternative, Corona can provide 2 to 6 times more performance on many memory-intensive workloads, while simultaneously significantly reducing power.","PeriodicalId":166672,"journal":{"name":"Proceeding of the 13th international symposium on Low power electronics and design (ISLPED '08)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133417866","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Multiple power-gating domain (multi-VGND) architecture for improved leakage power reduction
A. Sathanur, L. Benini, A. Macii, E. Macii, M. Poncino
{"title":"Multiple power-gating domain (multi-VGND) architecture for improved leakage power reduction","authors":"A. Sathanur, L. Benini, A. Macii, E. Macii, M. Poncino","doi":"10.1145/1393921.1393938","DOIUrl":"https://doi.org/10.1145/1393921.1393938","url":null,"abstract":"Row-based power-gating has recently emerged as a meet-in-the-middle sleep transistor insertion paradigm between cell-level and block-level granularity, in which each layout row defines the unit of gating, and different rows can be clustered and share the same sleep transistor. Previous works, however, assume the availability of a single virtual ground voltage, thus making the decision of whether or not to gate a given cluster a binary choice: a cluster is either gated or not. In this work, we consider a limited set of virtual ground voltages, which allows us to assign to a cluster the virtual ground voltage that offers the best leakage-performance tradeoff for that cluster. We propose two algorithms for solving two power-gating variants: one in which the entire design is gated (given an allowable delay degradation), and another one in which only a subset of the rows is gated (given an allowable delay degradation and sleep transistor area). Our algorithm automatically finds the set of clusters with optimal virtual ground voltages so as to minimize leakage while respecting timing and area constraints. The number of power-gating domains can be user-bounded, in accordance with power grid or library characterization limitations. Results show that multiple virtual grounds yield an improvement of more than 34% over existing solutions that gate the entire design, and also provide sizable savings in the case of partial power-gating.","PeriodicalId":166672,"journal":{"name":"Proceeding of the 13th international symposium on Low power electronics and design (ISLPED '08)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115018044","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 15
Thread fusion
José González, Qiong Cai, P. Chaparro, G. Magklis, R. Rakvic, Antonio González
{"title":"Thread fusion","authors":"José González, Qiong Cai, P. Chaparro, G. Magklis, R. Rakvic, Antonio González","doi":"10.1145/1393921.1394018","DOIUrl":"https://doi.org/10.1145/1393921.1394018","url":null,"abstract":"This work proposes Thread Fusion as an effective way of reducing power consumption when a Simultaneous Multi-Threaded (SMT) core is executing two threads from a homogeneous parallel application. Two dynamic instances of the same static instruction, each from a different thread, are merged (fused) into a single instruction, consuming half of the resources of front-end pipeline stages. When the fused instruction is executed, it is cloned and it proceeds at full bandwidth. Our simulation results show an average energy reduction of 10% with less than 1% impact on performance.","PeriodicalId":166672,"journal":{"name":"Proceeding of the 13th international symposium on Low power electronics and design (ISLPED '08)","volume":"74 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124656134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11
A physical level study and optimization of CAM-based checkpointed register alias table
Elham Safi, Andreas Moshovos, A. Veneris
{"title":"A physical level study and optimization of CAM-based checkpointed register alias table","authors":"Elham Safi, Andreas Moshovos, A. Veneris","doi":"10.1145/1393921.1393982","DOIUrl":"https://doi.org/10.1145/1393921.1393982","url":null,"abstract":"Using full-custom layouts in 130 nm technology, this work studies how the latency and energy of a checkpointed, CAM-based Register Alias Table (cRAT) vary as a function of the window size, the issue width, and the number of embedded global checkpoints (GCs). These results are compared to those of the SRAM-based RAT (sRAT). Understanding these variations is useful during the early stages of architectural exploration where physical level information is not yet available. It is found that compared to sRAT, cRAT is more sensitive to the number of physical registers and issue width; however, it is less sensitive to the number of GCs. In addition, beyond a certain number of GCs, cRAT becomes faster than its equivalent sRAT. For instance, this is true when a RAT for 64 architectural and 128 physical registers has at least 20 GCs. This work also proposes an energy optimization for the cRAT; this optimization selectively disables cRAT entries that do not result in a match during lookup. The energy savings are, for the most part, a function of the number of physical registers. For instance, for a cRAT with 128 entries, energy is reduced by 40%.","PeriodicalId":166672,"journal":{"name":"Proceeding of the 13th international symposium on Low power electronics and design (ISLPED '08)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125918364","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
Low power high bandwidth amplifier with RC Miller and gain enhanced feedforward compensation
Shagun Bajoria, V. Singh, Raju Kunde, C. Parikh
{"title":"Low power high bandwidth amplifier with RC Miller and gain enhanced feedforward compensation","authors":"Shagun Bajoria, V. Singh, Raju Kunde, C. Parikh","doi":"10.1145/1393921.1393972","DOIUrl":"https://doi.org/10.1145/1393921.1393972","url":null,"abstract":"An improved frequency compensation technique is presented for low-power low-voltage three-stage operational amplifiers with high capacitive loads. The technique uses single RC Miller compensation and a direct gain enhanced feedforward path from the input to the output. With a load capacitance of 300 pF, the amplifier nominally achieves a dc gain of 74 dB, a 3-dB bandwidth of 2.9 kHz, a 52 degrees phase margin, and a slew rate of 0.22 V/μs, while consuming 0.24 mW of power with a 1.2-V supply voltage, in a 180 nm CMOS technology. The 3-dB bandwidth is one of the highest reported for a high-gain three-stage CMOS amplifier.","PeriodicalId":166672,"journal":{"name":"Proceeding of the 13th international symposium on Low power electronics and design (ISLPED '08)","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126517611","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Entry control in network-on-chip for memory power reduction
Dongwook Lee, S. Yoo, Kiyoung Choi
{"title":"Entry control in network-on-chip for memory power reduction","authors":"Dongwook Lee, S. Yoo, Kiyoung Choi","doi":"10.1145/1393921.1393967","DOIUrl":"https://doi.org/10.1145/1393921.1393967","url":null,"abstract":"As high-end mobile embedded systems become data-intensive, the off-chip memory is becoming a major contributor to the total energy consumption. In particular, high-end mobile chips accommodate dedicated hardware blocks, e.g., codec and 3D graphics IPs, required for both performance and power consumption reasons. Those IPs usually do not have a large shared memory on chip. Thus, they communicate with each other via the off-chip DDR memory, increasing off-chip memory accesses, which increases memory energy consumption during read/write operations. In this paper, we present a method of reducing memory energy consumption during read/write operations. It aims at minimizing the number of row opens and closes, which are the major source of energy consumption during read/write operations. The basic idea is to apply network entry control to prioritize consecutive open row memory accesses. The experimental results show up to 35% reduction in memory energy consumption with an industrial strength multimedia mobile SoC.","PeriodicalId":166672,"journal":{"name":"Proceeding of the 13th international symposium on Low power electronics and design (ISLPED '08)","volume":"115 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132593630","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
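The abstract of this entry describes prioritizing consecutive accesses to an already-open DRAM row so that fewer rows are opened and closed. The sketch below is a toy illustration of that idea only, not the authors' network entry-control mechanism: the `(bank, row)` request format, the `window` parameter, and both function names are assumptions made for the example.

```python
def count_row_opens(requests):
    """Count row activations when (bank, row) requests are served in order.

    Assumes one open row per bank; serving a different row in the same
    bank closes the old row and opens the new one.
    """
    open_row = {}
    opens = 0
    for bank, row in requests:
        if open_row.get(bank) != row:
            opens += 1
            open_row[bank] = row
    return opens


def prioritize_open_rows(requests, window=4):
    """Hypothetical admission policy: among the oldest `window` pending
    requests, admit one that hits an open row if there is one; otherwise
    admit the oldest request."""
    pending = list(requests)
    open_row = {}
    order = []
    while pending:
        candidates = pending[:window]
        pick = next(
            (r for r in candidates if open_row.get(r[0]) == r[1]),
            candidates[0],
        )
        pending.remove(pick)          # removes the oldest equal request
        open_row[pick[0]] = pick[1]
        order.append(pick)
    return order


# Two hardware blocks alternating between rows 1 and 2 of bank 0.
requests = [(0, 1), (0, 2), (0, 1), (0, 2)]
reordered = prioritize_open_rows(requests)
print(count_row_opens(requests), count_row_opens(reordered))
```

A bounded window keeps the reordering local, so a request that never hits an open row is still admitted once it becomes the oldest candidate with no competing row hit.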
Row/column redundancy to reduce SRAM leakage in presence of random within-die delay variation
M. Goudarzi, T. Ishihara
{"title":"Row/column redundancy to reduce SRAM leakage in presence of random within-die delay variation","authors":"M. Goudarzi, T. Ishihara","doi":"10.1145/1393921.1393947","DOIUrl":"https://doi.org/10.1145/1393921.1393947","url":null,"abstract":"Traditionally, spare rows/columns have been used in two ways: either to replace too leaky cells to reduce leakage, or to substitute faulty cells to improve yield. In contrast, we first choose a higher threshold voltage (Vth) and/or gate-oxide thickness (Tox) for SRAM transistors at design time to reduce leakage, and then substitute the resulting too slow cells by spare rows/columns. We show that due to within-die delay variation of SRAM cells only a few cells violate target timing at higher Vth or Tox; we carefully choose the Vth and Tox values such that the original memory timing-yield remains intact for a negligible extra delay. On a commercial 90 nm process assuming 3% variation in SRAM cell delay, we obtained 47% leakage reduction by adding only 5 redundant columns at negligible area, dynamic power and delay costs.","PeriodicalId":166672,"journal":{"name":"Proceeding of the 13th international symposium on Low power electronics and design (ISLPED '08)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132976236","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Extending the lifetime of media recorders constrained by battery and flash memory size
Younghyun Kim, Youngjin Cho, N. Chang, C. Chakrabarti, N. Cho
{"title":"Extending the lifetime of media recorders constrained by battery and flash memory size","authors":"Younghyun Kim, Youngjin Cho, N. Chang, C. Chakrabarti, N. Cho","doi":"10.1145/1393921.1393964","DOIUrl":"https://doi.org/10.1145/1393921.1393964","url":null,"abstract":"The lifetime of a stand-alone media recorder is a function of both the battery size and flash memory size. In this paper, we present a power management framework for media recorders that significantly enhances their lifetime while minimizing the flash memory usage and maintaining the same level of recording quality. This is achieved by implementing a mixture of encoding algorithms of different complexities that generate data with different compression ratios, and in turn balancing the energy consumption and the flash memory usage. The proposed method can be effectively employed on a direct battery drive system which does not use a DC-DC converter. The gradual drop in the battery voltage of such a system is compensated for by running the lower-complexity algorithms increasingly often. For a speech encoding application where a mixture of ADPCM (low complexity) and MP3 (high complexity) is used, the proposed algorithm achieves 70% more lifetime than a DC-DC converter with the highest clock frequency, and 20% more lifetime than even a DC-DC converter with the optimal clock frequency.","PeriodicalId":166672,"journal":{"name":"Proceeding of the 13th international symposium on Low power electronics and design (ISLPED '08)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133924458","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
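The abstract of this entry describes balancing energy consumption against flash usage by mixing a low-complexity encoder (less computation, more data) with a high-complexity one (more computation, less data). The sketch below is a simplified constant-power model of that trade-off, not the paper's framework: it ignores the battery voltage drop the abstract mentions, and every number, unit, and function name is an assumption chosen for illustration.

```python
def lifetime(x, energy, flash, p_low, p_high, r_low, r_high):
    """Recording time when a fraction x of the time uses the low-complexity
    encoder and the rest uses the high-complexity encoder.

    Recording stops when either the battery or the flash runs out.
    """
    power = x * p_low + (1 - x) * p_high   # average power draw
    rate = x * r_low + (1 - x) * r_high    # average flash fill rate
    return min(energy / power, flash / rate)


def best_mix(energy, flash, p_low, p_high, r_low, r_high, steps=1000):
    """Grid-search the fraction x in [0, 1] that maximizes recording time."""
    return max(
        (i / steps for i in range(steps + 1)),
        key=lambda x: lifetime(x, energy, flash, p_low, p_high, r_low, r_high),
    )


# Hypothetical device: 7200 J battery, 9000 KB flash.
# Low-complexity encoder: 1 W, 4 KB/s. High-complexity encoder: 3 W, 1 KB/s.
params = dict(energy=7200, flash=9000, p_low=1, p_high=3, r_low=4, r_high=1)
x = best_mix(**params)
print(x, lifetime(x, **params))  # 0.5 3600.0
```

With these made-up numbers, using only the high-complexity encoder drains the battery after 2400 s, using only the low-complexity encoder fills the flash after 2250 s, and the even mix exhausts both resources together at 3600 s.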