Proceedings of the 2007 international symposium on Low power electronics and design (ISLPED '07)最新文献

筛选
英文 中文
An energy efficient GPGPU memory hierarchy with tiny incoherent caches 具有微小非相干缓存的高能效GPGPU内存层次结构
Alamelu Sankaranarayanan, E. K. Ardestani, J. L. Briz, Jose Renau
{"title":"An energy efficient GPGPU memory hierarchy with tiny incoherent caches","authors":"Alamelu Sankaranarayanan, E. K. Ardestani, J. L. Briz, Jose Renau","doi":"10.1109/ISLPED.2013.6629259","DOIUrl":"https://doi.org/10.1109/ISLPED.2013.6629259","url":null,"abstract":"With progressive generations and the ever-increasing promise of computing power, GPGPUs have been quickly growing in size, and at the same time, energy consumption has become a major bottleneck for them. The first level data cache and the scratchpad memory are critical to the performance of a GPGPU, but they are extremely energy inefficient due to the large number of cores they need to serve. This problem could be mitigated by introducing a cache higher up in hierarchy that services fewer cores, but this introduces cache coherency issues that may become very significant, especially for a GPGPU with hundreds of thousands of in-flight threads. In this paper, we propose adding incoherent tinyCaches between each lane in an SM, and the first level data cache that is currently shared by all the lanes in an SM. In a normal multiprocessor, this would require hardware cache coherence between all the SM lanes capable of handling hundreds of thousands of threads. Our incoherent tinyCache architecture exploits certain unique features of the CUDA/OpenCL programming model to avoid complex coherence schemes. This tinyCache is able to filter out 62% of memory requests that would otherwise need to be serviced by the DL1G, and almost 81% of scratchpad memory requests, allowing us to achieve a 37% energy reduction in the on-chip memory hierarchy. We evaluate the tinyCache for different memory patterns and show that it is beneficial in most cases.","PeriodicalId":20456,"journal":{"name":"Proceedings of the 2007 international symposium on Low power electronics and design (ISLPED '07)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2013-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79553275","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 11
Power mapping and modeling of multi-core processors 多核处理器的功率映射和建模
K. Dev, Abdullah Nazma Nowroz, S. Reda
{"title":"Power mapping and modeling of multi-core processors","authors":"K. Dev, Abdullah Nazma Nowroz, S. Reda","doi":"10.5555/2648668.2648680","DOIUrl":"https://doi.org/10.5555/2648668.2648680","url":null,"abstract":"We propose new techniques for post-silicon power mapping and modeling of multi-core processors using infrared imaging and performance counter measurements. An accurate finite-element modeling framework is used to capture the relationship between temperature and power, while compensating for the artifacts introduced from substituting traditional heat removal mechanisms with oil-based infrared-transparent cooling mechanisms. We use thermal conditioning techniques to build leakage power models for the die. Utilizing the power maps identified from infrared mapping, we develop empirical power models for different processor blocks based on the measurements from the performance monitoring counters (PMCs), and utilize the PMC-based models to analyze the transient power consumption. In our experiments, we capture thermal images from a quad-core processor under different workload conditions, and then we reconstruct the dynamic and leakage power maps for different blocks. Our results show good accuracy in mapping and modeling, revealing good insights into the trends of power consumption in multi-core processors.","PeriodicalId":20456,"journal":{"name":"Proceedings of the 2007 international symposium on Low power electronics and design (ISLPED '07)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2013-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73933699","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 33
An analytical solution for multi-core energy calculation with consideration of leakage and temperature dependency 考虑泄漏和温度依赖的多核能量计算的解析解
Ming Fan, Vivek Chaturvedi, Shi Sha, Gang Quan
{"title":"An analytical solution for multi-core energy calculation with consideration of leakage and temperature dependency","authors":"Ming Fan, Vivek Chaturvedi, Shi Sha, Gang Quan","doi":"10.1109/ISLPED.2013.6629322","DOIUrl":"https://doi.org/10.1109/ISLPED.2013.6629322","url":null,"abstract":"Energy minimization is a critical issue and challenge when considering the cyclic dependency of leakage power and temperature as IC technology reaches deep sub-micron level. In this paper, we present an analytical method to calculate the energy consumption efficiently and effectively for a given voltage schedule on a multi-core platform, with the leakage/temperature dependency taken into consideration. Our experiments show that the proposed method can achieve a speedup of 15 times compared with the numerical method, with a relative error of no more than 1.5%.","PeriodicalId":20456,"journal":{"name":"Proceedings of the 2007 international symposium on Low power electronics and design (ISLPED '07)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2013-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75772833","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 8
REEL: Reducing effective execution latency of floating point operations REEL:减少浮点操作的有效执行延迟
Vignyan Reddy Kothinti Naresh, S. Gilani, Erika Gunadi, N. Kim, M. Schulte, Mikko H. Lipasti
{"title":"REEL: Reducing effective execution latency of floating point operations","authors":"Vignyan Reddy Kothinti Naresh, S. Gilani, Erika Gunadi, N. Kim, M. Schulte, Mikko H. Lipasti","doi":"10.1109/ISLPED.2013.6629292","DOIUrl":"https://doi.org/10.1109/ISLPED.2013.6629292","url":null,"abstract":"The height of the dynamic dependence graph of a program, as executed by a processor, determines the minimum bound on the execution time. This height can be decreased by reducing the effective execution latency of operations that form dependence chains in the graph. In this paper, we propose a technique called REEL to reduce overall latency of chains of dependent floating point (FP) operations by increasing the throughput of computation. REEL comprises of a high-throughput floating point unit (HFP) that allows early issue of an FP Add that is dependent on another FP Add or FP Multiply. This is complemented by instruction scheduler modifications that allow early issue of dependent FP Adds, and a novel checker logic that corrects any precision errors. Unlike conventional static operation fusion, like fused Multiply-Add (FMA), there are no changes to the instruction set to enable utilization of the new hardware, and no recompilation is necessary. Furthermore, unlike ISA-level FMA, our technique produces results that are bit compatible while boosting performance of Add-Add dependence pairs in addition to Multiply-Add pairs. Our evaluation of REEL using CFP2006 benchmarks shows an average performance gain of 7.6% and maximum performance gain of 17% while consuming 1.2% lower energy.","PeriodicalId":20456,"journal":{"name":"Proceedings of the 2007 international symposium on Low power electronics and design (ISLPED '07)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2013-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72718154","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
Coordinated refresh: Energy efficient techniques for DRAM refresh scheduling 协调刷新:DRAM刷新调度的节能技术
Ishwar Bhati, Zeshan A. Chishti, B. Jacob
{"title":"Coordinated refresh: Energy efficient techniques for DRAM refresh scheduling","authors":"Ishwar Bhati, Zeshan A. Chishti, B. Jacob","doi":"10.1109/islped.2013.6629295","DOIUrl":"https://doi.org/10.1109/islped.2013.6629295","url":null,"abstract":"As the size and speed of DRAM devices increase, the performance and energy overheads due to refresh become more significant. To reduce refresh penalty we propose techniques referred collectively as “Coordinated Refresh”, in which scheduling of low power modes and refresh commands are coordinated so that most of the required refreshes are issued when the DRAM device is in the deepest low power Self Refresh (SR) mode. Our approach saves DRAM background power because the peripheral circuitry and clocks are turned off in the SR mode. Our proposed solutions improve DRAM energy efficiency by 10% as compared to baseline, averaged across all the SPEC CPU 2006 benchmarks.","PeriodicalId":20456,"journal":{"name":"Proceedings of the 2007 international symposium on Low power electronics and design (ISLPED '07)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2013-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84537169","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 26
Write intensity prediction for energy-efficient non-volatile caches 高能效非易失性缓存的写入强度预测
Junwhan Ahn, S. Yoo, Kiyoung Choi
{"title":"Write intensity prediction for energy-efficient non-volatile caches","authors":"Junwhan Ahn, S. Yoo, Kiyoung Choi","doi":"10.1109/ISLPED.2013.6629298","DOIUrl":"https://doi.org/10.1109/ISLPED.2013.6629298","url":null,"abstract":"This paper presents a novel concept called write intensity prediction for energy-efficient non-volatile caches as well as the architecture that implements the concept. The key idea is to correlate write intensity of cache blocks with addresses of memory access instructions that incur cache misses of those blocks. The predictor keeps track of instructions that tend to load write-intensive blocks and utilizes that information to predict write intensity of blocks. Based on this concept, we propose a block placement strategy driven by write intensity prediction for SRAM/STT-RAM hybrid caches. Experimental results show that the proposed approach reduces write energy consumption by 55% on average compared to the existing hybrid cache architecture.","PeriodicalId":20456,"journal":{"name":"Proceedings of the 2007 international symposium on Low power electronics and design (ISLPED '07)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2013-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90362550","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 33
Single-cycle, pulse-shaped critical path monitor in the POWER7+ microprocessor POWER7+微处理器中的单周期脉冲形关键路径监视器
A. Drake, M. Floyd, Richard L. Willaman, Derek J. Hathaway, J. Hernández, Crystal Soja, Marshall D. Tiner, G. Carpenter, R. Senger
{"title":"Single-cycle, pulse-shaped critical path monitor in the POWER7+ microprocessor","authors":"A. Drake, M. Floyd, Richard L. Willaman, Derek J. Hathaway, J. Hernández, Crystal Soja, Marshall D. Tiner, G. Carpenter, R. Senger","doi":"10.1109/ISLPED.2013.6629293","DOIUrl":"https://doi.org/10.1109/ISLPED.2013.6629293","url":null,"abstract":"A 32nm SOI critical path monitor (CPM) that can provide timing measurements to a Digital PLL for dynamic frequency adjustments in the 8-core POWER7+™ microprocessor is described. The CPM calibrates to within 2% of cycle time from nominal to turbo voltages. Its voltage sensitivity is 10mV/bit. It tracks processor temperature sensitivity to within 1.5% of nominal frequency, and has a sample jitter less than 1.5% of nominal frequency. The ability to detect noise dynamically allows the system to operate the processor closer to its optimal frequency for any given voltage, resulting in lower voltage for power savings or higher frequency for performance improvements.","PeriodicalId":20456,"journal":{"name":"Proceedings of the 2007 international symposium on Low power electronics and design (ISLPED '07)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2013-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90511609","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 21
Rethinking DC-DC converter design constraints for adaptable systems that target the minimum-energy point 以最小能量点为目标的自适应系统DC-DC变换器设计约束的再思考
M. Turnquist, Jani Mäkipää, M. Hiienkari, Hanh-Phuc Le, L. Koskinen
{"title":"Rethinking DC-DC converter design constraints for adaptable systems that target the minimum-energy point","authors":"M. Turnquist, Jani Mäkipää, M. Hiienkari, Hanh-Phuc Le, L. Koskinen","doi":"10.1109/ISLPED.2013.6629327","DOIUrl":"https://doi.org/10.1109/ISLPED.2013.6629327","url":null,"abstract":"This paper explores a new DC-DC converter design constraint for adaptable systems that target the minimum-energy point (MEP). Traditionally, DC-DC converters have regulated to a fixed output voltage over a wide range of input voltages. For energy-constrained systems that target the MEP, regulating them to a fixed voltage is unnecessary since changes in the output voltage near the MEP have little impact on the energy per cycle. This paper applies a new and traditional design constraint to a 3:1 series-parallel switched-capacitor (SC) DC-DC converter in 28 nm CMOS. The new design constraint allows for decreased design time, less area, and less system-level energy per cycle compared to traditional constraints.","PeriodicalId":20456,"journal":{"name":"Proceedings of the 2007 international symposium on Low power electronics and design (ISLPED '07)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2013-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86781943","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 6
Robustness-driven energy-efficient ultra-low voltage standard cell design with intra-cell mixed-Vt methodology 基于胞内混合vt方法的鲁棒驱动节能超低电压标准胞设计
Wenfeng Zhao, Yajun Ha, Chin Hau Hoo, A. Alvarez
{"title":"Robustness-driven energy-efficient ultra-low voltage standard cell design with intra-cell mixed-Vt methodology","authors":"Wenfeng Zhao, Yajun Ha, Chin Hau Hoo, A. Alvarez","doi":"10.1109/ISLPED.2013.6629317","DOIUrl":"https://doi.org/10.1109/ISLPED.2013.6629317","url":null,"abstract":"High functional yield is one of the key challenges for subthreshold standard cell designs. Device upsizing is a commonly used but suboptimal method due to its overheads in energy and area. In this paper, we propose a robustness-driven intra-cell mixed-Vt design methodology (MVT-ULV) for the robust ultra-low voltage operation. It uses low threshold voltage transistors in the weak pulling network of logic gates to enhance the robustness. It guarantees the high functional yield with the minimum energy/area overheads. We demonstrate on a commercial 65nm CMOS process that, our proposed design methodology shows up to 60mV and 110mV robustness improvement at 300mV power supply voltage over the commercial library cells and the cells built with previous Leakage-Minimization mixed-Vt methods (MVT-LM) under the same cell area constraints, respectively. In addition, the proposed MVT-ULV library enables ITC'99 benchmark circuits to show on average 30.1% and 78.1% energy-efficiency improvement when compared to the libraries built with the device-upsizing methods and the previous MVT-LM methods under the same yield constraints, respectively.","PeriodicalId":20456,"journal":{"name":"Proceedings of the 2007 international symposium on Low power electronics and design (ISLPED '07)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2013-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81655411","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
A hybrid display frame buffer architecture for energy efficient display subsystems 一种用于节能显示子系统的混合显示帧缓冲结构
Kyungtae Han, Alexander W. Min, Nithyananda S. Jeganathan, Paul Diefenbaugh
{"title":"A hybrid display frame buffer architecture for energy efficient display subsystems","authors":"Kyungtae Han, Alexander W. Min, Nithyananda S. Jeganathan, Paul Diefenbaugh","doi":"10.1109/ISLPED.2013.6629321","DOIUrl":"https://doi.org/10.1109/ISLPED.2013.6629321","url":null,"abstract":"Our principal motivation is to reduce the energy consumption of display subsystems in mobile devices by introducing a hybrid frame buffer architecture into the platform. We observed that display contents on a screen are quite static for certain mobile workloads, such as web browsing. As a result, data reading from the display frame is much more frequent than the writing of new data onto the frame buffer, a state we refer to as read dominance. Based on this observation, we propose a hybrid frame buffer architecture that exploits the display contents' read-dominant property to improve the energy efficiency of display subsystems. Specifically, we employ two memory types: DRAM and Phase-Change Memory (PCM), in the display frame buffer to exploit their different read/write energy characteristics. We also present an analysis of the energy efficiency of the hybrid frame buffer based on our display content and energy consumption models. Our evaluation results show that the proposed hybrid frame buffer reduces frame buffer energy consumption by up to 43%, compared to the conventional DRAM-only frame buffer.","PeriodicalId":20456,"journal":{"name":"Proceedings of the 2007 international symposium on Low power electronics and design (ISLPED '07)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2013-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85377475","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 11
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信