{"title":"Frequency planning for multi-core processors under thermal constraints","authors":"M. Kadin, S. Reda","doi":"10.1145/1393921.1393977","DOIUrl":"https://doi.org/10.1145/1393921.1393977","url":null,"abstract":"The objectives of this paper are (1) to develop a frequency planning methodology that maximizes the total performance of multi-core processors and that limits their maximum temperature as specified by the design constraints; and (2) to establish the implications of technology scaling on the performance limits of multi-core processors. Given the intricate designs and workloads of multi or many-core processors, it is computationally exhaustive to develop models that accurately calculate the temperature and performance of a given processor under various operating conditions. To abstract the underlying design complexity, we propose the use of supervised machine learning techniques to develop versatile models that capture the thermal characterization of multi-core processors under various input conditions and workloads. We then use the developed models to create a framework where various design constraints and objectives are expressed and solved using combinatorial optimization techniques. Using established power modeling and thermal simulation tools, we show that it is possible to boost the performance of multi-core processors by up to 11.4% at no impact to the maximum temperature.","PeriodicalId":166672,"journal":{"name":"Proceeding of the 13th international symposium on Low power electronics and design (ISLPED '08)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128302100","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A probabilistic technique for full-chip leakage estimation","authors":"Shaobo Liu, Qinru Qiu, Qing Wu","doi":"10.1145/1393921.1393975","DOIUrl":"https://doi.org/10.1145/1393921.1393975","url":null,"abstract":"In this paper, we propose a probability-based algorithm to estimate full-chip leakage without knowing layout information, under intra-die and inter-die process variations. Through modeling process variations into a random vector, we show that the standard cell leakage can be modeled as an inverse Gaussian random variable and further demonstrate that full-chip leakage can also be approximated to be an inverse Gaussian random variable. Hence, the leakage estimation problem is reduced to the estimation of the mean value and variance of the full-chip leakage. Experimental results show that the proposed algorithm is over 1000X faster than Monte Carlo simulation while the maximum estimation error is less than 6%.","PeriodicalId":166672,"journal":{"name":"Proceeding of the 13th international symposium on Low power electronics and design (ISLPED '08)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127870554","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An expected-utility based approach to variation aware VLSI optimization under scarce information","authors":"Upavan Gupta, N. Ranganathan","doi":"10.1145/1393921.1393945","DOIUrl":"https://doi.org/10.1145/1393921.1393945","url":null,"abstract":"In this research, we propose a novel approach for simultaneous optimization of power, crosstalk noise and delay via gate sizing, in the presence of scarce information about the distribution of the variations. The methodology uses the concepts of utility theory and risk minimization to identify a deterministic equivalent model of the stochastic problem, ensuring high levels of expected utilities of constraints, and significant speedup in the optimization process for large circuits. A comparative study with an existing gate sizing methodology shows that our method is multi-fold faster as well as comparable in terms of the optimization.","PeriodicalId":166672,"journal":{"name":"Proceeding of the 13th international symposium on Low power electronics and design (ISLPED '08)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130746014","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Error-resilient low-power Viterbi decoders","authors":"R. Abdallah, Naresh R Shanbhag","doi":"10.1145/1393921.1393951","DOIUrl":"https://doi.org/10.1145/1393921.1393951","url":null,"abstract":"Two low-power Viterbi decoder (VD) architectures are presented in this paper. In the first, limited decision errors are introduced in the add-compare-select units (ACSUs) of a VD to reduce their critical path delays so that they can be operated at lower supply voltages in the absence of timing errors. In the second one, we allow data-dependent timing errors which occur whenever a critical path in the ACSU is excited. Algorithmic noise-tolerance (ANT) is then applied at the level of the ACSU to correct for these errors. Power reduction in this design is achieved by either overscaling the supply voltage (voltage overscaling (VOS)) or designing at the nominal process corner and supply voltage (average-case design). Power savings in the first and second design are 58% and 40% at a coding loss of 0.15 dB and 1.1 dB respectively in an IBM 130 nm CMOS process.","PeriodicalId":166672,"journal":{"name":"Proceeding of the 13th international symposium on Low power electronics and design (ISLPED '08)","volume":"422 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116086796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Post-silicon programmed body-biasing platform suppressing device variability in 45 nm CMOS technology","authors":"Hiroaki Suzuki, M. Kurimoto, Tadao Yamanaka, H. Takata, H. Makino, H. Shinohara","doi":"10.1145/1393921.1393931","DOIUrl":"https://doi.org/10.1145/1393921.1393931","url":null,"abstract":"The Post-Silicon Programmed Body-Biasing Platform is proposed to suppress device variability in the 45-nm CMOS technology era. The proposed platform measures device speed during post-fabrication testing. Then the fast die is marked so that the body-bias circuit turns on and reduces leakage current of the die that is selected and marked in a user application. Because the slow die around the speed specifications of a product is not body-biased, the product runs as fast as a normal non-body-biasing product. Although the leakage power of a fast die is reduced, the speed specification does not change. The proposed platform improves the worst corner specification comprising the two worst cases of speed and leakage power. The test chip, fabricated using 45-nm technology, improves the worst corner of stand-by leakage power vs. speed by 70%.","PeriodicalId":166672,"journal":{"name":"Proceeding of the 13th international symposium on Low power electronics and design (ISLPED '08)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123967515","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bus encoding for simultaneous delay and energy optimization","authors":"Jingyi Zhang, Qing Wu, Qinru Qiu","doi":"10.1145/1393921.1393976","DOIUrl":"https://doi.org/10.1145/1393921.1393976","url":null,"abstract":"In this paper we propose two bus encoding algorithms that optimize both bus delay and energy dissipation based on the probabilistic characteristics of data on data buses. The first algorithm minimizes the crosstalk transitions by inserting temporal redundancy and achieves optimal energy. The second algorithm reduces crosstalk more aggressively to achieve optimal bus delay by mapping the original data to low-energy opposite-transition-forbidden codes. Experimental results show that they outperform the existing heuristic bus encoding algorithms by 15.7% to 58.8% in average energy dissipation and 11.4% to 58.4% in average delay.","PeriodicalId":166672,"journal":{"name":"Proceeding of the 13th international symposium on Low power electronics and design (ISLPED '08)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125037563","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Low power current mode receiver with inductive input impedance","authors":"M. Dave, M. Baghini, D. Sharma","doi":"10.1145/1393921.1393980","DOIUrl":"https://doi.org/10.1145/1393921.1393980","url":null,"abstract":"In this paper we show that a current mode signaling system with receivers using inductive input impedance can provide a low power solution to high speed data transmission over long lines. We show that beta multiplier circuits can be designed such that they exhibit inductive input impedance and their use as current mode receivers provides significant enhancement in data rates. Simulation results show that it is possible to transmit data at eight times higher data rates than voltage mode with one order of magnitude lower power consumption. Even compared to other current mode signaling systems, those using a receiver with inductive input impedance show around 50% improvement in data rate at marginally lower power consumption.","PeriodicalId":166672,"journal":{"name":"Proceeding of the 13th international symposium on Low power electronics and design (ISLPED '08)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127672340","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analyzing static and dynamic write margin for nanometer SRAMs","authors":"Jiajing Wang, Satyanand Nalam, B. Calhoun","doi":"10.1145/1393921.1393954","DOIUrl":"https://doi.org/10.1145/1393921.1393954","url":null,"abstract":"This paper analyzes write ability for SRAM cells in deeply scaled technologies, focusing on the relationship between static and dynamic write margin metrics. Reliability has become a major concern for SRAM designs in modern technologies. Both local mismatch and scaled VDD degrade read stability and write ability. Several static approaches, including traditional SNM, BL margin, and the N-curve method, can be used to measure static write margin. However, static approaches cannot indicate the impact of dynamic dependencies on cell stability. We propose to analyze dynamic write ability by considering the write operation as a noise event that we analyze using dynamic stability criteria. We also define dynamic write ability as the critical pulse width for a write. By using this dynamic criterion, we evaluate the existing static write margin metrics at normal and scaled supply voltages and assess their limitations. The dynamic write time metric can also be used to improve the accuracy of VCCmin estimation for active VDD scaling designs.","PeriodicalId":166672,"journal":{"name":"Proceeding of the 13th international symposium on Low power electronics and design (ISLPED '08)","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128329285","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"3-tier dynamically adaptive power-aware motion estimator for h.264/AVC video encoding","authors":"M. Shafique, L. Bauer, J. Henkel","doi":"10.1145/1393921.1393962","DOIUrl":"https://doi.org/10.1145/1393921.1393962","url":null,"abstract":"The limitation of energy in portable communication/entertainment devices necessitates the reduction of video encoding complexity. The H.264/AVC video coding standard is one of the latest video codecs and features a complex Motion Estimation scheme that accounts for a major part of the encoder energy. We therefore present a power-aware Motion Estimator for H.264 that adapts at run time according to the available energy level. We perform a set of adaptations at different Processing Stages of Motion Estimation. Our results show that in case of CIF videos (typically used in portable devices; but our approach is equally applicable to other video resolutions too), we achieve an average energy reduction of 52 times and 27 times as compared to UMHexagonS and EPZS respectively. This energy saving comes at the cost of an average loss of only 0.39 dB in Peak Signal to Noise Ratio (PSNR: an objective quality measure) and 23% increase in area (synthesized for 90 nm technology).","PeriodicalId":166672,"journal":{"name":"Proceeding of the 13th international symposium on Low power electronics and design (ISLPED '08)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129975429","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Energy-efficient MESI cache coherence with pro-active snoop filtering for multicore microprocessors","authors":"Avadh Patel, K. Ghose","doi":"10.1145/1393921.1393988","DOIUrl":"https://doi.org/10.1145/1393921.1393988","url":null,"abstract":"We present a snoop filtering mechanism for multicore microprocessors that implement coherent caches using the MESI protocol. The relatively small filter structure at each core maintains coarse-grain sharing information about regions within a page to filter out snoops. On broadcast, the sharing status of all regions within the page is collected proactively and up to 90% of unnecessary snoops are eliminated. The energy savings resulting from snoop filtering in our scheme average about 30% across the benchmarks studied for both a quad core design in 65 nm and 8-core design in 45 nm CMOS.","PeriodicalId":166672,"journal":{"name":"Proceeding of the 13th international symposium on Low power electronics and design (ISLPED '08)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131280896","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}