2013 IEEE 31st International Conference on Computer Design (ICCD)最新文献

筛选
英文 中文
Dynamic thread mapping for high-performance, power-efficient heterogeneous many-core systems 用于高性能、高能效异构多核系统的动态线程映射
2013 IEEE 31st International Conference on Computer Design (ICCD) Pub Date : 2013-11-07 DOI: 10.1109/ICCD.2013.6657025
Guangshuo Liu, Jinpyo Park, Diana Marculescu
{"title":"Dynamic thread mapping for high-performance, power-efficient heterogeneous many-core systems","authors":"Guangshuo Liu, Jinpyo Park, Diana Marculescu","doi":"10.1109/ICCD.2013.6657025","DOIUrl":"https://doi.org/10.1109/ICCD.2013.6657025","url":null,"abstract":"This paper addresses the problem of dynamic thread mapping in heterogeneous many-core systems via an efficient algorithm that maximizes performance under power constraints. Heterogeneous many-core systems are composed of multiple core types with different power-performance characteristics. As well documented in the literature, the generic mapping problem is an NP-complete problem which can be formulated as a 0-1 integer linear program, therefore, prohibitively expensive to solve optimally in an online scenario. However, in real applications, thread mapping decisions need to be responsive to workload phase changes. This paper proposes an iterative approach bounding the runtime as O(n2/m), for mapping multi-threaded applications on n cores comprising of m core types. Compared with an optimal solution, the proposed algorithm produces results less than 0.6% away from optimum on average, with two orders of magnitude improvement in runtime. Results show that performance improvement can reach 16% under iso-power constraints compared to a random mapping. The algorithm can be brought online for hundred-core heterogeneous systems as it scales to systems comprised of 256 cores with less than one millisecond in overhead.","PeriodicalId":398811,"journal":{"name":"2013 IEEE 31st International Conference on Computer Design (ICCD)","volume":"477 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132652830","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 62
Low power multi-level-cell resistive memory design with incomplete data mapping 具有不完全数据映射的低功耗多电平单元电阻存储器设计
2013 IEEE 31st International Conference on Computer Design (ICCD) Pub Date : 2013-11-07 DOI: 10.1109/ICCD.2013.6657035
Dimin Niu, Qiaosha Zou, Cong Xu, Yuan Xie
{"title":"Low power multi-level-cell resistive memory design with incomplete data mapping","authors":"Dimin Niu, Qiaosha Zou, Cong Xu, Yuan Xie","doi":"10.1109/ICCD.2013.6657035","DOIUrl":"https://doi.org/10.1109/ICCD.2013.6657035","url":null,"abstract":"Phase change memory (PCM) has been widely studied as a potential DRAM alternative. The multi-level cell (MLC) can further increase the memory density and reduce the fabrication cost by storing multiple bits in a single cell. Nevertheless, large write power, high write latency, as well as reliability issue resulted from the resistance drift, bring in challenges for MLC PCM based memory design. In contrast, the emerging Resistive Random Access Memory (ReRAM), which has similar MLC property as PCM, demonstrates better performance and energy efficiency compared to PCM. In addition, due to the physical switching behaviors of ReRAM cell, the resistance drift phenomenon does not exist. In this paper, we propose a low power MLC ReRAM design. We first study the programming method of MLC ReRAM and identify that programming latency and energy are highly dependent on the data pattern written to the cell. Based on this observation, we propose incomplete data mapping (IDM), which maps an eight-level-cell into six states to prevent the time/energy consuming data patterns from appearing in the cell. Furthermore, in order to improve endurance of MLC RAM, which is much smaller than single-level cell (SLC) ReRM due to the complex programming method, we propose Dynamic Data ReMapping (DDRM) to selectively regulate memory blocks from IDM state back to complete data mapping (CDM) state. We demonstrate that the proposed design can work effectively with existing error-correction schemes but requires much smaller space overhead. Experimental results show that, IDM can reduce the energy performance by at most 15% with negligible performance overhead. By combining the DDRM with existing error-correction scheme, DDRM can improve the memory lifetime by 2.75× compared with conventional memory architectures.","PeriodicalId":398811,"journal":{"name":"2013 IEEE 31st International Conference on Computer Design (ICCD)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115395488","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 42
Selected inversion for vectorless power grid verification by exploiting locality 利用局部性对无矢量电网验证进行选择反演
2013 IEEE 31st International Conference on Computer Design (ICCD) Pub Date : 2013-11-07 DOI: 10.1109/ICCD.2013.6657051
Jianlei Yang, Yici Cai, Qiang Zhou, Wei Zhao
{"title":"Selected inversion for vectorless power grid verification by exploiting locality","authors":"Jianlei Yang, Yici Cai, Qiang Zhou, Wei Zhao","doi":"10.1109/ICCD.2013.6657051","DOIUrl":"https://doi.org/10.1109/ICCD.2013.6657051","url":null,"abstract":"Vectorless power grid verification is a practical approach for early stage safety check without input current patterns. The power grid is usually formulated as a linear system and requires intensive matrix inversion and numerous linear programming, which is extremely time-consuming for large scale power grid verification. In this paper, the power grid is represented in the manner of domain-decomposition approach, and we propose a selected inversion technique to reduce the computation cost of matrix inversion for vectorless verification. The locality existence among power grids is exploited to decide which blocks of matrix inversion should be computed while remaining blocks are not necessary. The vectorless verification could be purposefully performed by this manner of selected inversion while previous direct approaches are required to perform full matrix inversion and then discard small entries to reduce the complexity of linear programming. Meanwhile, constraint locality is proposed to improve the verification accuracy. Experimental results show that the proposed approach could achieve significant speedups compared to previous approaches while still guaranteeing the quality of solution accuracy.","PeriodicalId":398811,"journal":{"name":"2013 IEEE 31st International Conference on Computer Design (ICCD)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121185420","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
Semi-analytical current source modeling of near-threshold operating logic cells considering process variations 考虑工艺变化的近阈值操作逻辑单元的半解析电流源建模
2013 IEEE 31st International Conference on Computer Design (ICCD) Pub Date : 2013-11-07 DOI: 10.1109/ICCD.2013.6657079
Q. Xie, Tiansong Cui, Yanzhi Wang, Shahin Nazarian, Massoud Pedram
{"title":"Semi-analytical current source modeling of near-threshold operating logic cells considering process variations","authors":"Q. Xie, Tiansong Cui, Yanzhi Wang, Shahin Nazarian, Massoud Pedram","doi":"10.1109/ICCD.2013.6657079","DOIUrl":"https://doi.org/10.1109/ICCD.2013.6657079","url":null,"abstract":"Operating circuits in the ultra-low voltage regime results in significantly lower power consumption but can also degrade the circuit performance. In addition, it leads to higher sensitivity to various sources of variability in VLSI circuits. This paper extends the current source modeling (CSM) technique, which has successfully been applied to VLSI circuits to achieve very high accuracy in timing analysis, to the near-threshold voltage regime. In particular, it shows how to combine non-linear analytical models and low-dimensionality CSM lookup tables to simultaneously achieve modeling accuracy, space and time efficiency, when performing CSM-based timing analysis of VLSI circuits operating in near-threshold regime and subject to process variability effects.","PeriodicalId":398811,"journal":{"name":"2013 IEEE 31st International Conference on Computer Design (ICCD)","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129084148","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 4
Exploring the energy efficiency of Multispeculative Adders 探索多投机加法器的能源效率
2013 IEEE 31st International Conference on Computer Design (ICCD) Pub Date : 2013-11-07 DOI: 10.1109/ICCD.2013.6657058
Alberto A. Del Barrio, R. Hermida, S. Memik
{"title":"Exploring the energy efficiency of Multispeculative Adders","authors":"Alberto A. Del Barrio, R. Hermida, S. Memik","doi":"10.1109/ICCD.2013.6657058","DOIUrl":"https://doi.org/10.1109/ICCD.2013.6657058","url":null,"abstract":"Variable Latency Adders are attracting strong interest for increasing performance at a low cost. However, most of the literature is focused on achieving a good area-delay tradeoff. In this paper we consider multispeculation as an alternative for designing adders with low energy consumption, while offering better performance than the corresponding non-speculative ones. Instead of introducing more logic to accelerate the computation, the adder is split into several fragments which operate in parallel, and whose carry-in signals are provided by predictor units. On the one hand, the critical path of the module is shortened, and on the other hand the frequent useless glitches produced in the carry propagation structure are diminished. Hence, this will be translated into an overall energy reduction. Several experiments have been performed with linear and logarithmic adders, and results show energy savings by up to 90% and 70%, respectively, while achieving an additional execution time decrease. Furthermore, when utilized in whole datapaths with current control techniques, it is possible to reduce execution time by 24.5% (34% best case) and energy by 32% (48% best case) on average.","PeriodicalId":398811,"journal":{"name":"2013 IEEE 31st International Conference on Computer Design (ICCD)","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126551516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 14
Register allocation and VDD-gating algorithms for out-of-order architectures 无序体系结构的寄存器分配和vdd门控算法
2013 IEEE 31st International Conference on Computer Design (ICCD) Pub Date : 2013-11-07 DOI: 10.1109/ICCD.2013.6657032
Steven J. Battle, Mark Hempstead
{"title":"Register allocation and VDD-gating algorithms for out-of-order architectures","authors":"Steven J. Battle, Mark Hempstead","doi":"10.1109/ICCD.2013.6657032","DOIUrl":"https://doi.org/10.1109/ICCD.2013.6657032","url":null,"abstract":"Register Files (RF) in modern out-of-order microprocessors can account for up to 30% of total power consumed by the core. The complexity and size of the RF has increased due to the transition from ROB-based to MIPSR10K-style physical register renaming. Because physical registers are dynamically allocated, the RF is not fully occupied during every phase of the application. In this paper, we propose a comprehensive power management strategy of the RF through algorithms for register allocation and register-bank power-gating that are informed by both microarchitecture details and circuit costs. We investigate algorithms to control where to place registers in the RF, when to disable banks in the RF, and when to re-enable these banks. We include detailed circuit models to estimate the cost for banking and power-gating the RF. We are able to save up to 50% of the leakage energy vs. a baseline monolithic RF, and save 11% more leakage energy than fine-grained VDD-gating schemes.","PeriodicalId":398811,"journal":{"name":"2013 IEEE 31st International Conference on Computer Design (ICCD)","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124513865","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
Phoenix NoC: A distributed fault tolerant architecture Phoenix NoC:分布式容错架构
2013 IEEE 31st International Conference on Computer Design (ICCD) Pub Date : 2013-11-07 DOI: 10.1109/ICCD.2013.6657018
C. Marcon, Alexandre M. Amory, T. Webber, Thomas Volpato, L. Poehls
{"title":"Phoenix NoC: A distributed fault tolerant architecture","authors":"C. Marcon, Alexandre M. Amory, T. Webber, Thomas Volpato, L. Poehls","doi":"10.1109/ICCD.2013.6657018","DOIUrl":"https://doi.org/10.1109/ICCD.2013.6657018","url":null,"abstract":"The advances in deep submicron technology have made the development of large Multiprocessor Systems-on-Chip (MPSoC) possible and Networks-on-Chip (NoCs) have been recognized to provide an efficient communication architecture for such systems. With the positive effects on the device's integration some drawbacks arise, such as the increase of fault susceptibility during the MPSoC manufacturing and operation. This work presents Phoenix, which is a direct mesh NoC that implements fault tolerant mechanisms in order to enable end-to-end communication when some links fail. Phoenix implements a distributed fault tolerant mechanism in software (i.e. in each processor) and in hardware (i.e. in each router). Experimental results show that Phoenix is scalable and allows the MPSoC operation even in the presence of several faulty links.","PeriodicalId":398811,"journal":{"name":"2013 IEEE 31st International Conference on Computer Design (ICCD)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133418935","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 9
Exploiting correlation in stochastic circuit design 利用随机电路设计中的相关性
2013 IEEE 31st International Conference on Computer Design (ICCD) Pub Date : 2013-11-07 DOI: 10.1109/ICCD.2013.6657023
Armin Alaghi, J. Hayes
{"title":"Exploiting correlation in stochastic circuit design","authors":"Armin Alaghi, J. Hayes","doi":"10.1109/ICCD.2013.6657023","DOIUrl":"https://doi.org/10.1109/ICCD.2013.6657023","url":null,"abstract":"Stochastic computing (SC) is a re-emerging computing paradigm which enables ultra-low power and massive parallelism in important applications like real-time image processing. It is characterized by its use of pseudo-random numbers implemented by 0-1 sequences called stochastic numbers (SNs) and interpreted as probabilities. Accuracy is usually assumed to depend on the interacting SNs being highly independent or uncorrelated in a loosely specified way. This paper introduces a new and rigorous SC correlation (SCC) measure for SNs, and shows that, contrary to intuition, correlation can be exploited as a resource in SC design. We propose a general framework for analyzing and designing combinational circuits with correlated inputs, and demonstrate that such circuits can be significantly more efficient and more accurate than traditional SC circuits. We also provide a method of analyzing stochastic sequential circuits, which tend to have inherently correlated state variables and have proven very hard to analyze.","PeriodicalId":398811,"journal":{"name":"2013 IEEE 31st International Conference on Computer Design (ICCD)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133580059","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 135
FlexiWay: A cache energy saving technique using fine-grained cache reconfiguration FlexiWay:一种使用细粒度缓存重构的缓存节能技术
2013 IEEE 31st International Conference on Computer Design (ICCD) Pub Date : 2013-11-07 DOI: 10.1109/ICCD.2013.6657031
Sparsh Mittal, Zhao Zhang, J. Vetter
{"title":"FlexiWay: A cache energy saving technique using fine-grained cache reconfiguration","authors":"Sparsh Mittal, Zhao Zhang, J. Vetter","doi":"10.1109/ICCD.2013.6657031","DOIUrl":"https://doi.org/10.1109/ICCD.2013.6657031","url":null,"abstract":"Recent trends of CMOS scaling and use of large last level caches (LLCs) have led to significant increase in the leakage energy consumption of LLCs and hence, managing their energy consumption has become extremely important in modern processor design. The conventional cache energy saving techniques require offline profiling or provide only coarse granularity of cache allocation. We present FlexiWay, a cache energy saving technique which uses dynamic cache reconfiguration. FlexiWay logically divides the cache sets into multiple (e.g. 16) modules and dynamically turns off suitable and possibly different number of cache ways in each module. FlexiWay has very small implementation overhead and it provides fine-grain cache allocation even with caches of typical associativity, e.g. an 8-way cache. Microarchitectural simulations have been performed using an x86-64 simulator and workloads from SPEC2006 suite. Also, FlexiWay has been compared with two conventional energy saving techniques. The results show that FlexiWay provides largest energy saving and incurs only small loss in performance. For single, dual and quad core systems, the average energy saving using FlexiWay are 26.2%, 25.7% and 22.4%, respectively.","PeriodicalId":398811,"journal":{"name":"2013 IEEE 31st International Conference on Computer Design (ICCD)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129934577","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 40
Performance simulator based on hardware resources constraints for ion trap quantum computer 基于硬件资源约束的离子阱量子计算机性能模拟器
2013 IEEE 31st International Conference on Computer Design (ICCD) Pub Date : 2013-11-07 DOI: 10.1109/ICCD.2013.6657073
Muhammad Ahsan, Byung-Soo Choi, Jungsang Kim
{"title":"Performance simulator based on hardware resources constraints for ion trap quantum computer","authors":"Muhammad Ahsan, Byung-Soo Choi, Jungsang Kim","doi":"10.1109/ICCD.2013.6657073","DOIUrl":"https://doi.org/10.1109/ICCD.2013.6657073","url":null,"abstract":"Efforts to build quantum computers using ion-traps have demonstrated all elementary qubit operations necessary for scalable implementation. Modular architectures have been proposed to construct modest size quantum computers with up to 104 - 106 qubits using technologies that are available today (MUSIQC architecture). Concrete scheduling procedure to execute a given quantum algorithm on such a hardware is a significant task, but existing quantum CAD tools generally do not account for the underlying connectivity of the qubits or the limitation on the hardware resources available for the scheduling. We present a scheduler and performance simulator that fully accounts for these resource constraints, capable of estimating the execution time and error performances of executing a quantum circuit on the hardware. We outline the construction of tool components, and describe the process of mapping the qubits to ions and scheduling the physical gates in the MUSIQC architecture. Using this tool, we quantify the trade-off between hardware resource constraints and performance of the computer and show that at an expense of x fold increase in latency, a minimum of 1.6x resource reduction is possible for executing a three-qubit Bernstein-Vazirani algorithm encoded using Steane code.","PeriodicalId":398811,"journal":{"name":"2013 IEEE 31st International Conference on Computer Design (ICCD)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130293331","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 5
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信