2014 IEEE COOL Chips XVII: Latest Publications

A low power DRAM refresh control scheme for 3D memory cube
2014 IEEE COOL Chips XVII Pub Date : 2014-04-14 DOI: 10.1109/CoolChips.2014.6842950
Ying Wang, Yinhe Han, Huawei Li
Abstract: We propose a low power refresh control scheme for 3D stacked DRAM memory, which leverages the data-pattern dependence characteristics of the cells' Retention-Time to squeeze the margin of refresh interval. It is a systematic approach that uses our proposed Retention-Time (RT) detection mechanism to capture the bottleneck that contributes to over-frequent refresh operations: “weak” cells with relatively shorter Retention-Time than others. With the help of memory scrubbers and Error Correction Pointer (ECP) table integrated on logic base of 3D memory cube, we can avoid the worst-case operation by locating the true “weak” cells sensitized by application and adapting the refresh rate to the data layout under our loop-based control algorithm. As shown in experiments, the method dramatically saves memory energy and bandwidth consumption.
Citations: 2
Aggressive use of Deep Sleep mode in low power embedded systems
2014 IEEE COOL Chips XVII Pub Date : 2014-04-14 DOI: 10.1109/CoolChips.2014.6842956
Jun'ichi Segawa, Yusuke Shirota, K. Fujisaki, Tetsuro Kimura, Tatsunori Kanai
Abstract: Since idle-state is the dominant state for embedded systems, disabling unused devices in idle-states can lead to significant power reduction. Among the various sleep modes provided by application processors, Deep Sleep mode offers maximum power savings. Since Deep Sleep mode requires to stop I/O devices and clocks, it is usually used in suspend-state. However, with the emergence of non-volatile or low power compute state retainable devices, we can now explore exploiting Deep Sleep mode in non-suspend states. We propose a new scheme to aggressively use Deep Sleep mode under normal operations. An experimental result of 48-80% power reduction on our prototype board indicates possibilities for near-future mobile platform running solely on photovoltaic-power.
Citations: 4
An energy optimization method for vector processing mechanisms
2014 IEEE COOL Chips XVII Pub Date : 2014-04-14 DOI: 10.1109/CoolChips.2014.6842957
Ye Gao, Masayuki Sato, Ryusuke Egawa, H. Takizawa, Hiroaki Kobayashi
Abstract: In order to achieve a low energy execution for any multimedia applications (MMAs) on a vector processing mechanism (VPM), the number of parallel arithmetic pipelines and the number of cache ports of VPM must be properly configured for each MMAs. Therefore, this paper proposes an energy optimization method for VPMs (EOM-VP), which finds the lowest energy configuration by using the greedy searching method and an analytical model. As the evaluation results suggest, EOM-VP could find the lowest or the second lowest energy configuration for all the benchmark programs in the evaluation.
Citations: 0
A fine grained power management supported by just-in-time compiler
2014 IEEE COOL Chips XVII Pub Date : 2014-04-14 DOI: 10.1109/CoolChips.2014.6842958
Motoki Wada, Mikiko Sato, M. Namiki
Abstract: A low-power computing is now on high demand for both high performance computing and mobile computing. This research suggests the framework for controlling finely grained power saving hardware such as power gating, based on on-time analysis supported by JIT Compiler. By adapting the framework to control over fine-grained power gating control, the authors have succeeded to reduce the leakage power of processor by the maximum of 22%, and the average of 6%.
Citations: 0
Establishing a standard interface between multi-manycore and software tools - SHIM
2014 IEEE COOL Chips XVII Pub Date : 2014-04-14 DOI: 10.1109/CoolChips.2014.6842946
Masaki Kondo, F. Arakawa, M. Edahiro
Abstract: The multicore processors are becoming norm and a processor with even more than a hundred of cores are emerging. These inherently require wide range of software tools to help software developers. However, supporting these complex hardware by the tools require significant effort by the tool vendors, and each invest in adapting the new hardware by modifying their tools or creating proprietary configuration files, while often the similar set of hardware architectural information are needed. The SHIM, Software-Hardware Interface for Multi-many-core, is a joint industrial and academic effort to standardize the interface between the multicore hardware and the software tools. This extended abstract introduces SHIM, the overall architecture, the schema used, the use-cases, and a prototype tool to foster the adaption of the interface.
Citations: 8
A flexibly fault-tolerant FU array processor and its self-tuning scheme to locate permanently defective unit
2014 IEEE COOL Chips XVII Pub Date : 2014-04-14 DOI: 10.1109/CoolChips.2014.6842951
Jun Yao, Y. Nakashima, Mitsutoshi Saito, Yohei Hazama, Ryosuke Yamanaka
Abstract: In this work, we propose the Explicit Redundancy Linear Array (EReLA) architecture to provide a highly flexible fault-toleration, which effectively utilizes its rich resources in a functional unit (FU) array for both the error detection and the fail-safe hot-swap after taking a permanent fault. For the preparation of the hot-swap, a self-tuning scheme is proposed specifically to fast locate the precise position of the permanently defective units, which can be either the computational, LD/ST FUs, or the connecting network as well. EReLA can thereby isolates the permanently defective unit at the smallest granularity, which allows more hot-swaps and extends accordingly the lifespan of the whole processor. Given these schemes, EReLA is functionally same to a traditional TMR processor in terms of fault toleration, while the power data of a 180nm prototype EReLA chip has indicated that it incurs far less power consumption than the TMR implementation.
Citations: 1
Language runtime support for NVM/DRAM hybrid main memory
2014 IEEE COOL Chips XVII Pub Date : 2014-04-14 DOI: 10.1109/CoolChips.2014.6842949
Gaku Nakagawa, S. Oikawa
Abstract: Replacing of DRAM in main memory with non-volatile memory (NVM) has several merits. However, NVM under development has some limitations in write operation. To overcome it, some previous researches proposed NVM/DRAM hybrid memory architecture. In the architecture, it needs to determine data placements between NVM and DRAM. In this paper, we advocate that programming language runtimes are useful for management of NVM/DRAM hybrid main memory. In addition, we will propose a method to manage NVM/DRAM hybrid main memory with language runtime support.
Citations: 5
Kernel data race detection using debug register in Linux
2014 IEEE COOL Chips XVII Pub Date : 2014-04-14 DOI: 10.1109/CoolChips.2014.6842953
Yunyun Jiang, Yi Yang, Tian Xiao, Tianwei Sheng, Wenguang Chen
Abstract: Data races in parallel programs are notoriously difficult to detect and resolve. Existing research has mostly focused on data race detection at the user level and significant progress has been made in this regard. It is difficult to apply detection methods designed for user-level applications to identify OS kernel level races. In this paper, we present a new detection tool that is able to effectively detect race conditions in the Linux kernel environment. We use a dynamic detection approach, employing hardware debug registers available on commodity processors, to catch races on the fly during runtime. Preliminary experimental results show that our tool can effectively identify real data race instances.
Citations: 3
A Perpetuum Mobile 32bit CPU with 13.4pJ/cycle, 0.14µA sleep current using Reverse Body Bias Assisted 65nm SOTB CMOS technology
2014 IEEE COOL Chips XVII Pub Date : 2014-04-14 DOI: 10.1109/CoolChips.2014.6842954
K. Ishibashi, N. Sugii, K. Usami, H. Amano, Kazutoshi Kobayashi, C. Pham, H. Makiyama, Yoshiki Yamamoto, H. Shinohara, T. Iwamatsu, Y. Yamaguchi, H. Oda, T. Hasegawa, S. Okanishi, H. Yanagita, S. Kamohara, M. Kadoshima, K. Maekawa, T. Yamashita, Duc-Hung Le, T. Yomogita, M. Kudo, K. Kitamori, Shuya Kondo, Yuuki Manzawa
Abstract: A 32-bit CPU which operates with the lowest energy of 13.4 pJ/cycle at 0.35V and 14MHz, operates at 0.22V to 1.2V and with 0.14μA sleep current is demonstrated. The low power performance is attained by Reverse-Body-Bias-Assisted 65nm SOTB CMOS (Silicon On Thin Buried oxide) technology. The CPU can operate more than 100 years with 610mAH Li battery.
Citations: 14
A globally asynchronous locally synchronous DMR architecture for aggressive low-power fault toleration
2014 IEEE COOL Chips XVII Pub Date : 2014-04-14 DOI: 10.1109/CoolChips.2014.6842952
Yuttakon Yuttakonkit, Jun Yao, Y. Nakashima
Abstract: Recently, dual or triple modular redundancy (DMR/TMR) has been commonly used in high-end server or special environment targeted microprocessors to mitigate single event effects (SEEs), as the miniaturized transistors tend to be more vulnerable to SEEs. However, facing the issue that DMR and TMR usually add remarkable pressures to the power consumption due to the highly redundant executions, this work specially provides an architectural solution to introduce aggressive dynamic voltage scaling (DVS) and Razor-FF on DMR architecture to moderate the total energy. As the traditional DMR architecture with a globally synchronous clock will have visible performance down-gradation when DVS and Razor-FF are used, in this work, we propose a DMR processor architecture that uses dedicated clocks on each DMR module, following a globally asynchronous locally synchronous (GALS) execution fashion. In the execution, due to the possible timing faults from the aggressively lowered voltage, the two modules may experience a dynamically phase-shift clock frequency. Our GALS DMR approach is assembled with FIFOs and delay buffers to conceal the effect from this phase-shift and thereby the performance impact is largely alleviated. Compared to the traditional synchronous DMR system, we can have around 10% performance improvement by this asynchronous scheme when a same power reduction ratio is assumed. Also, we have aggressively turned down the voltage and achieved a 12% better MIPS/W than the previous DMR without major performance influence.
Citations: 0