2008 IEEE International Conference on Computer Design: latest papers (page 9)

Analysis and minimization of practical energy in 45nm subthreshold logic circuits

2008 IEEE International Conference on Computer Design Pub Date : 2008-10-01 DOI: 10.1109/ICCD.2008.4751876

D. Bol, R. Ambroise, D. Flandre, J. Legat

Abstract: Over the last decade, the design of ultra-low-power digital circuits in subthreshold regime has been driven by the quest for minimum energy per operation. In this contribution, we observe that operating at minimum-energy point is not straightforward as design constraints from real-life applications have an important impact on energy. Therefore, we introduce the alternative concept of practical energy, taking functional-yield and throughput constraints on minimum Vdd into account. In this context, we demonstrate for the first time the detrimental impact of DIBL on minimum Vdd. Practical energy gives a useful analysis framework of circuit optimization to reach minimum-energy point, while considering the throughput as an input variable dictated by the application. From simulation of a benchmark multiplier in 45 nm technology, we find out that practical energy can be far higher than minimum energy point, in the case of low-throughput applications (approximately 10-100 kOp/s) because of static leakage energy and robustness-limited minimum Vdd. With the proposed framework, we investigate the capability of conventional optimization techniques to make practical energy meet minimum energy point. Amongst these techniques, channel length upsize is shown to be more efficient than MTCMOS power gating, body biasing, Vt selection or device width upsize, as it increases robustness while simultaneously reducing static leakage energy. A small length upsize with low area overhead is shown to reduce practical energy at low throughput to less than 2.1 times the minimum energy level. At medium throughput, it even brings practical energy 30% lower than minimum energy level without optimization techniques.

Citations: 53
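
The leakage argument in the abstract by Bol et al. above can be illustrated with a minimal sketch. It assumes a first-order model that is not taken from the paper: dynamic energy per operation is C_eff * Vdd^2, and the circuit leaks a constant current over the whole operation period 1/throughput (no power gating, no DIBL dependence of leakage on Vdd). The function name and the values of `c_eff` and `i_leak` are hypothetical placeholders, not results from the paper.

```python
def energy_per_op(vdd, throughput, c_eff=1e-12, i_leak=1e-7):
    """Energy per operation in joules under a first-order model.

    vdd        -- supply voltage in volts
    throughput -- operations per second demanded by the application
    c_eff      -- effective switched capacitance in farads (illustrative value)
    i_leak     -- leakage current in amperes, assumed constant (illustrative value)
    """
    dynamic = c_eff * vdd ** 2            # switching energy of one operation
    leakage = i_leak * vdd / throughput   # leakage power integrated over one operation period
    return dynamic + leakage


def leakage_share(vdd, throughput, c_eff=1e-12, i_leak=1e-7):
    """Fraction of the energy per operation that is due to leakage."""
    total = energy_per_op(vdd, throughput, c_eff, i_leak)
    return (i_leak * vdd / throughput) / total


if __name__ == "__main__":
    # Same supply voltage, two throughput targets: the slower one spends
    # more time leaking per operation, so its energy per operation is higher.
    for ops_per_s in (1e4, 1e6):
        print(ops_per_s, energy_per_op(0.3, ops_per_s), leakage_share(0.3, ops_per_s))
```

Under these assumptions, lowering the throughput at a fixed Vdd leaves the dynamic term unchanged and scales the leakage term inversely with throughput, which is why a low-throughput application can sit well above the minimum-energy point.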

Power switch characterization for fine-grained dynamic voltage scaling

2008 IEEE International Conference on Computer Design Pub Date : 2008-10-01 DOI: 10.1109/ICCD.2008.4751923

Liang Di, M. Putic, J. Lach, B. Calhoun

Citations: 15

A high-performance parallel CAVLC encoder on a fine-grained many-core system

2008 IEEE International Conference on Computer Design Pub Date : 2008-10-01 DOI: 10.1109/ICCD.2008.4751869

Zhibin Xiao, B. Baas

Citations: 23

A parallel Steiner tree heuristic for macro cell routing

2008 IEEE International Conference on Computer Design Pub Date : 2008-10-01 DOI: 10.1109/ICCD.2008.4751836

C. Fobel, G. Grewal

Abstract: Global routing of macro cells remains an important but time-consuming step in the VLSI design cycle. Macro cells are large, irregularly sized parameterized circuit modules that typically contain large numbers of terminals that must be interconnected. The interconnection pattern for each set of terminals (net) that must be connected is a Steiner tree, and the primary sub-problem in the global routing of macro cells is to find a set of dissimilar, low-cost Steiner trees for each net that must be routed. In this paper, a two-phase, parallel (multi-processor) algorithm is proposed for quickly constructing a diverse pool of high-quality Steiner trees for routing of multi-terminal nets. In the first phase, a single Steiner tree is constructed using a heuristic, called Shrubbery. Then, in the second phase, a pool of dissimilar, high-quality trees is created from the original tree, by running multiple instances of a local search in parallel. Computational experiments performed on over 800 commonly used benchmarks show that running multiple instances of the local search in parallel results in near-linear speed-up over the serial case. Most importantly, the trees produced are both high-quality and dissimilar, allowing for numerous routing possibilities for each net.

Citations: 3

A novel, highly SEU tolerant digital circuit design approach

2008 IEEE International Conference on Computer Design Pub Date : 2008-10-01 DOI: 10.1109/ICCD.2008.4751834

Rajesh Garg, S. Khatri

Abstract: In this paper, we present a new radiation tolerant CMOS standard cell library, and demonstrate its effectiveness in implementing radiation hardened digital circuits. We exploit the fact that if a gate is implemented using only PMOS (NMOS) transistors then a radiation particle strike can result only in a logic 0 to 1 (1 to 0) flip. Based on this observation, we derive our radiation hardened gates from regular static CMOS gates. In particular, we separate the PMOS and NMOS devices, and split the gate output into two signals. One of these outputs of our radiation tolerant gate is generated using PMOS transistors, and it drives other PMOS transistors (only) in its fanout. Similarly, the other output (generated from NMOS transistors) drives only other NMOS transistors in its fanout. Now, if a radiation particle strikes one of the outputs of the radiation tolerant gate, then the gates in the fanout enter a high-impedance state, and hence preserve their output values. Our radiation hardened gates exhibit an extremely high degree of SEU tolerance, which is validated at the circuit level. Using these gates, we also implement circuit level hardening based on logical masking, to selectively harden those gates in a circuit which contribute most to the soft error failure of the circuit. The gates with a low probability of logical masking are replaced by SEU tolerant gates from our new library, such that the digital design achieves a 90% soft error rate reduction. Experimental results demonstrate that this reduction is achieved with a modest layout area and delay penalty of 62% and 29% respectively, for area mapped designs. In contrast with existing approaches, our approach results in SEU immunity for extremely large critical charge values (>650fC).

Citations: 24

Understanding performance, power and energy behavior in asymmetric multiprocessors

2008 IEEE International Conference on Computer Design Pub Date : 2008-10-01 DOI: 10.1109/ICCD.2008.4751903

Nagesh B. Lakshminarayana, Hyesoon Kim

Citations: 11

Configurable rectilinear Steiner tree construction for SoC and nano technologies

2008 IEEE International Conference on Computer Design Pub Date : 2008-10-01 DOI: 10.1109/ICCD.2008.4751837

I. Jiang, Yen-Ting Yu

Citations: 2

A family of scalable FFT architectures and an implementation of 1024-point radix-2 FFT for real-time communications

2008 IEEE International Conference on Computer Design Pub Date : 2008-10-01 DOI: 10.1109/ICCD.2008.4751880

A. Suleiman, H. Saleh, A. Hussein, D. Akopian

Citations: 16

Timing analysis considering IR drop waveforms in power gating designs

2008 IEEE International Conference on Computer Design Pub Date : 2008-10-01 DOI: 10.1109/ICCD.2008.4751912

Shih-Hung Weng, Yu-Min Kuo, Shih-Chieh Chang, M. Marek-Sadowska

Citations: 5

Optimizing data sharing and address translation for the Cell BE Heterogeneous Chip Multiprocessor

2008 IEEE International Conference on Computer Design Pub Date : 2008-10-01 DOI: 10.1109/ICCD.2008.4751904

M. Gschwind

Abstract: Heterogeneous Chip Multiprocessors (HMPs), such as the Cell Broadband Engine, offer a new design optimization opportunity by allowing designers to provide accelerators for application specific domains. Data sharing between CPUs and accelerators, and memory access mechanisms and protocols are crucial decisions in the design of an HMP. In this article, we analyze the choices between hardware and software managed coherence between CPU and accelerators for DMA-based data sharing, and find that hardware-coherent DMA shows a performance benefit of up to 3x, even for simple workloads. We explore memory address translation architecture choices for DMA-based data sharing. In multiprogramming environments, address translation is commonly used to separate processes. For efficiency, direct access to system memory requires address translation capabilities in the accelerator. We find that hardware managed address translation shows a performance benefit of up to 5x, even for simple workloads, by avoiding the costs of accelerator/CPU communication and supervisor management of the translation context and the introduction of a serial bottleneck on the CPU.

Citations: 10