{"title":"Modeling and Analysis of Leakage Induced Damping Effect in Low Voltage LSIs","authors":"Jie Gu, J. Keane, C. Kim","doi":"10.1145/1165573.1165668","DOIUrl":"https://doi.org/10.1145/1165573.1165668","url":null,"abstract":"Although there has been extensive research on controlling leakage power, the fact that leaky transistors can act as a damping element for supply noise has been long ignored or unnoticed in the design community. This paper investigates the leakage induced damping effect that helps suppress the supply noise. By developing physics-based impedance models for active and leakage currents, we show that leakage, particularly gate tunneling leakage, provides more damping than strong-inversion current. Simulations were performed in a 32nm CMOS technology to validate our models under PVT variations and to explore the voltage dependent behavior of this phenomenon. A design example utilizing leakage induced damping, such as decap assignment, is discussed, with results showing a 15.6% saving in decap area","PeriodicalId":119229,"journal":{"name":"ISLPED'06 Proceedings of the 2006 International Symposium on Low Power Electronics and Design","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129261622","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Substituting Associative Load Queue with Simple Hash Tables in Out-of-Order Microprocessors","authors":"Alok Garg, Fernando Castro, Michael C. Huang, D. Chaver, L. Piñuel, M. Prieto","doi":"10.1145/1165573.1165637","DOIUrl":"https://doi.org/10.1145/1165573.1165637","url":null,"abstract":"Buffering more in-flight instructions in an out-of-order microprocessor is a straightforward and effective method to help tolerate the long latencies generally associated with off-chip memory accesses. One of the main challenges of buffering a large number of instructions, however, is the implementation of a scalable and efficient mechanism to detect memory access order violations as a result of out-of-order scheduling of load and store instructions. Traditional CAM-based associative queues can be very slow and energy consuming. In this paper, instead of using the traditional age-based load queue to record load addresses, we explicitly record age information in address-indexed hash tables to achieve the same functionality of detecting premature loads. This alternative design eliminates associative searches and significantly reduces the energy consumption of the load queue. With simple techniques to reduce the number of false positives, performance degradation is kept at a minimum","PeriodicalId":119229,"journal":{"name":"ISLPED'06 Proceedings of the 2006 International Symposium on Low Power Electronics and Design","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122886031","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Energy Optimality and Variability in Subthreshold Design","authors":"S. Hanson, Bo Zhai, D. Blaauw, D. Sylvester, A. Bryant, Xinlin Wang","doi":"10.1145/1165573.1165660","DOIUrl":"https://doi.org/10.1145/1165573.1165660","url":null,"abstract":"Recent progress in the development of subthreshold circuit design techniques has created the opportunity for dramatic energy reductions in many applications. However, energy efficiency comes at the price of timing and energy variability due to process variations. We explore energy optimality in the subthreshold regime, discuss variability in this region, and highlight the energy and variability characteristics of a real subthreshold design","PeriodicalId":119229,"journal":{"name":"ISLPED'06 Proceedings of the 2006 International Symposium on Low Power Electronics and Design","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131300087","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A New Mismatch-Dependent Low Power Technique with Shadow Match-Line Voltage-Detecting Scheme for CAMs","authors":"Jianwei Zhang, Y. Ye, Bin-Da Liu","doi":"10.1145/1165573.1165605","DOIUrl":"https://doi.org/10.1145/1165573.1165605","url":null,"abstract":"A new mismatch-dependent low-power technique is presented for content-addressable memories (CAMs). With a novel shadow match-line voltage-detecting scheme, the word circuits realize fast self-disable of the charging paths in case of mismatches. Since the majority of CAM words are mismatched, power is significantly reduced while maintaining a high search speed. Simulation results show the proposed 256-word × 144-bit ternary CAM, using a 0.13-μm 1.2-V CMOS process, achieves 0.51 fJ/bit/search for the word circuit with less than 900 ps search time. The achievement illustrates a 77% energy-delay-product (EDP) reduction as compared to the speed-optimized current-saving scheme","PeriodicalId":119229,"journal":{"name":"ISLPED'06 Proceedings of the 2006 International Symposium on Low Power Electronics and Design","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130223543","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Low Power Light-weight Embedded Systems","authors":"M. Sarrafzadeh, F. Dabiri, R. Jafari, T. Massey, A. Nahapetian","doi":"10.1145/1165573.1165623","DOIUrl":"https://doi.org/10.1145/1165573.1165623","url":null,"abstract":"Light-weight embedded systems are now gaining more popularity due to the recent technological advances in fabrication that have resulted in more powerful tiny processors with greater communication capabilities that pose various scientific challenges for researchers. Perhaps the most significant challenge is the energy consumption concern and reliability, mainly due to the small size of batteries. In this tutorial, we portray a brief description of low-power, light-weight embedded systems, depict several power profiling studies previously conducted, and present several research challenges that require low-power consumption in embedded systems. For each challenge, we highlight how low-power designs may enhance the overall performance of the system. Finally, we present several techniques that minimize the power consumption in such systems","PeriodicalId":119229,"journal":{"name":"ISLPED'06 Proceedings of the 2006 International Symposium on Low Power Electronics and Design","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124436715","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Process Variation Aware Cache Leakage Management","authors":"Ke Meng, R. Joseph","doi":"10.1145/1165573.1165636","DOIUrl":"https://doi.org/10.1145/1165573.1165636","url":null,"abstract":"In a few technology generations, limitations of fabrication processes have made accurate design time power estimates a daunting challenge. Static leakage current which comprises a significant fraction of total power due to large on-chip caches, is exponentially dependent on widely varying physical parameters such as gate length, gate oxide thickness, and dopant ion concentration. In large structures like on-chip caches, this may mean that one portion of a cache may consume an order of magnitude larger static power than equivalently sized regions. Under this climate, egalitarian management of physical resources is clearly untenable. In this paper, we analyze the effects of within-die and die-to-die leakage variation for on-chip caches. We then propose way prioritization, a manufacturing variation aware scheme that minimizes cache leakage energy. Our results show that significant average power reductions are possible without undue hardware complexity or performance compromise","PeriodicalId":119229,"journal":{"name":"ISLPED'06 Proceedings of the 2006 International Symposium on Low Power Electronics and Design","volume":"20 9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124544651","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Power Reduction in an H.264 Encoder Through Algorithmic and Logic Transformations","authors":"M. Koziri, G. Stamoulis, I. Katsavounidis","doi":"10.1145/1165573.1165598","DOIUrl":"https://doi.org/10.1145/1165573.1165598","url":null,"abstract":"The H.264 video coding standard can achieve considerably higher coding efficiency than previous video coding standards. The keys to this high coding efficiency are the two prediction modes (intra & inter) provided by H.264. Unfortunately, these result in a considerably higher encoder complexity that adversely affects speed and power, which are both significant for the mobile multimedia applications targeted by the standard. Therefore, it is of high importance to design architectures that minimize the speed and power overhead of the prediction modes. In this paper we present a new algorithm, and the logic transformations that enable it, that can replace the standard sum of absolute differences (SAD) approach in the two main prediction modes, and provide a power efficient hardware implementation without perceivable degradation in coding efficiency or video quality","PeriodicalId":119229,"journal":{"name":"ISLPED'06 Proceedings of the 2006 International Symposium on Low Power Electronics and Design","volume":"89 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114728468","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Selective Writeback: Exploiting Transient Values for Energy-Efficiency and Performance","authors":"D. Balkan, J. Sharkey, D. Ponomarev, K. Ghose","doi":"10.1145/1165573.1165584","DOIUrl":"https://doi.org/10.1145/1165573.1165584","url":null,"abstract":"Today's superscalar microprocessors use large, heavily-ported physical register files (RFs) to increase the instruction throughput. The high complexity and power dissipation of such RFs mainly stem from the need to maintain each and every result for a large number of cycles after the result generation. We observed that a significant fraction (about 45%) of the result values are delivered to their consumers via the bypass network (consumed \"on-the-fly\") and are never read out from the destination registers. In this paper, we first formulate conditions for identifying such transient values and describe their microarchitectural implementation; then we propose a technique to avoid the writeback of such transient values into the RF. With 64-entry integer and floating point register files, our technique achieves an 11% performance improvement and 29% reduction in the RF energy consumption compared to the baseline machine with the same number of registers. Furthermore, for the same performance target, the selective writeback scheme results in a 38% reduction in the energy consumption of the RF compared to the baseline machine","PeriodicalId":119229,"journal":{"name":"ISLPED'06 Proceedings of the 2006 International Symposium on Low Power Electronics and Design","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126133933","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Pulsed Low-Voltage Swing Latch for Reduced Power Dissipation in High-Frequency Microprocessors","authors":"P. Lu, N. Cao, L. Sigal, P. Woltgens, R. Robertazzi, D. Heidel","doi":"10.1145/1165573.1165593","DOIUrl":"https://doi.org/10.1145/1165573.1165593","url":null,"abstract":"We have reported previously (Pong-Fei Lu et al., 2004) a low-swing latch (LSL) with superior performance-power tradeoff compared to the conventional pass-gate master-slave latch. In this paper, hardware results are presented for the proposed LSL with pulsed clock waveforms. The motivation is to combine low-voltage swing with pulsed signals to further reduce overall system power in high-frequency microprocessors. We have designed a 65-bit accumulator loop experiment to mimic a microprocessor pipeline stage. The local clock buffer design features a mode switch to toggle between two-phase (c1/c2) master-slave clocking and one-phase pulsed (c2 only) clocking. Our data show that 15-25% system power saving can be achieved in pulsed mode compared to non-pulsed mode. Power contribution from individual components is also presented","PeriodicalId":119229,"journal":{"name":"ISLPED'06 Proceedings of the 2006 International Symposium on Low Power Electronics and Design","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125173432","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Energy/Power Breakdown of Pipelined Nanometer Caches (90nm/65nm/45nm/32nm)","authors":"Samuel Rodríguez, B. Jacob","doi":"10.1145/1165573.1165581","DOIUrl":"https://doi.org/10.1145/1165573.1165581","url":null,"abstract":"As transistors continue to scale down into the nanometer regime, device leakage currents are becoming the dominant cause of power dissipation in nanometer caches, making it essential to model these leakage effects properly. Moreover, typical microprocessor caches are pipelined to keep up with the speed of the processor, and the effects of pipelining overhead need to be properly accounted for. In this paper, we present a detailed study of pipelined nanometer caches with detailed energy/power dissipation breakdowns showing where and how the power is dissipated within a nanometer cache. We explore a three-dimensional pipelined cache design space that includes cache size (16kB to 512kB), cache associativity (direct-mapped to 16-way) and process technology (90nm, 65nm, 45nm and 32nm). Among our findings, we show that cache bitline leakage is increasingly becoming the dominant cause of power dissipation in nanometer technology nodes. We show that subthreshold leakage is the main cause of static power dissipation, and that gate leakage is, surprisingly, not a significant contributor to total cache power, even for 32nm caches. We also show that accounting for cache pipelining overhead is necessary, as power dissipated by the pipeline elements is a significant part of cache power","PeriodicalId":119229,"journal":{"name":"ISLPED'06 Proceedings of the 2006 International Symposium on Low Power Electronics and Design","volume":"150 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124666362","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}