2007 25th International Conference on Computer Design: Latest Publications

Power efficient register file update approach for embedded processors
2007 25th International Conference on Computer Design Pub Date : 2007-10-01 DOI: 10.1109/ICCD.2007.4601935
R. Ayoub, A. Orailoglu
Abstract: In this paper we present an approach for a low-power register file in the domain of embedded processors. The suggested approach obtains power savings by avoiding unnecessary writes to the register file for short-lived registers. A write to the register file is essentially redundant when an instruction manages to forward its result to all of its dependents through the forwarding hardware. As the percentage of registers that exhibit short liveness is shown to be significant, avoiding unnecessary writes delivers appreciable power savings. In this work we show that this can be attained efficiently through a register-based encoding scheme. The suggested encoding scheme exploits application-specific information and renames all or most of the short-lived registers to a small subset of the registers that are prespecified during hardware design. The renaming is performed at the compiler level. Power savings are obtained by precluding the set of prespecified registers from writing to the register file. We suggest efficient algorithms for renaming: one for cases with no register pressure and another for cases with register pressure. Under register pressure, some of the prespecified registers may need to be turned into normal registers, a process that is managed through reprogrammable hardware support. Although register pressure could reduce power savings, the detailed analysis we outline shows that the prespecified register subset is typically small, which makes register pressure an infrequent event. Experimental analysis on numerical and DSP codes indicates appreciable improvements in power savings.
Pages: 431-437
Citations: 2
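The notion of a redundant register-file write described in this abstract can be illustrated with a small sketch. It is not the authors' algorithm: the `Instr` type, the `window` parameter (how many following instructions the forwarding hardware can reach) and the straight-line basic-block model are all assumptions made for illustration. A write is flagged as redundant when every reader of the value sits within the forwarding window and the value is dead afterwards.

```python
from typing import List, NamedTuple, Optional, Tuple, FrozenSet


class Instr(NamedTuple):
    """Hypothetical three-address instruction: one optional destination, any sources."""
    dest: Optional[str]
    srcs: Tuple[str, ...]


def redundant_writes(block: List[Instr], window: int = 2,
                     live_out: FrozenSet[str] = frozenset()) -> List[int]:
    """Return indices of instructions whose register-file write could be skipped.

    Assumption: a result can be forwarded to any reader at most `window`
    instructions later, and the block is straight-line code.
    """
    skippable = []
    for i, ins in enumerate(block):
        if ins.dest is None:
            continue
        uses = []
        redefined = False
        for j in range(i + 1, len(block)):
            if ins.dest in block[j].srcs:
                uses.append(j)          # reader of the value produced at i
            if block[j].dest == ins.dest:
                redefined = True        # value is dead after this point
                break
        if not redefined and ins.dest in live_out:
            continue                    # value escapes the block, must be written
        if all(j - i <= window for j in uses):
            skippable.append(i)
    return skippable


block = [
    Instr("r1", ("r2", "r3")),  # read only by the next instruction, then redefined
    Instr("r4", ("r1",)),       # read three instructions later: too far to forward
    Instr("r5", ("r2",)),       # live-out
    Instr("r1", ("r3",)),       # read by the next instruction only
    Instr("r6", ("r1", "r4")),  # live-out
]
print(redundant_writes(block, window=2, live_out=frozenset({"r5", "r6"})))
```

In the scheme the abstract describes, a compiler would then rename such destinations to the prespecified registers that never write back; that renaming step is not shown here.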
A 4.6Tbits/s 3.6GHz single-cycle NoC router with a novel switch allocator in 65nm CMOS
2007 25th International Conference on Computer Design Pub Date : 2007-10-01 DOI: 10.1109/ICCD.2007.4601881
A. Kumar, Partha Kundu, A.P. Singh, Li-Shiuan Peh, N. K. Jha
Abstract: As chip multiprocessors (CMPs) become the only viable way to scale up and utilize the abundant transistors made available in current microprocessors, the design of on-chip networks is becoming critically important. These networks face unique design constraints and are required to provide extremely fast and high-bandwidth communication, yet meet tight power and area budgets. In this paper, we present a detailed design of our on-chip network router targeted at a 36-core shared-memory CMP system in 65 nm technology. Our design targets an aggressive clock frequency of 3.6 GHz, thus posing tough design challenges that led to several unique circuit and microarchitectural innovations and design choices, including a novel high-throughput and low-latency switch allocation mechanism, a non-speculative single-cycle router pipeline which uses advanced bundles to remove control setup overhead, a low-complexity virtual channel allocator, and a dynamically managed shared buffer design which uses prefetching to minimize critical path delay. Our router takes up 1.19 mm² of area and expends 551 mW of power at 10% activity, delivering a single-cycle no-load latency at a 3.6 GHz clock frequency while achieving a peak switching data rate in excess of 4.6 Tbits/s per router node.
Pages: 63-70
Citations: 212
Optimized design of a double-precision floating-point multiply-add-fused unit for data dependence
2007 25th International Conference on Computer Design Pub Date : 2007-10-01 DOI: 10.1109/ICCD.2007.4601918
Gongqiong Li, Zhaolin Li
Abstract: This paper presents a novel double-precision floating-point multiply-add-fused unit, which is implemented in three pipeline stages. The main improvement over the conventional design is that data dependence between two consecutive floating-point instructions is considered. In the new design, if two consecutive floating-point instructions are data dependent, the intermediate computation results of the first instruction are pretreated and then fed back to the first stage to be used directly by the second instruction. In this way, floating-point instructions can be executed directly following their preceding floating-point instructions without being stalled due to data dependence. Eleven data dependence cases are accelerated in this paper. The experiments, which are done over four SPEC2000 benchmark programs, show that a 25% performance increase can be attained at the cost of 0.27 ns of delay added to the critical path.
Pages: 311-316
Citations: 0
Placement and routing of RF embedded passive designs in LCP substrate
2007 25th International Conference on Computer Design Pub Date : 2007-10-01 DOI: 10.1109/ICCD.2007.4601913
M. Pathak, S. Mukherjee, M. Swaminathan, E. Engin, S. Lim
Abstract: Physical layout generation of RF embedded passive designs is not an easy task, since the response of a given layout is tightly coupled with the response of the individual components and the effect of interconnect parasitics. In this paper we propose a methodology for automatic layout generation of embedded passive RF circuits. We make use of circuit models to represent and optimize a given layout and use non-linear optimization at various stages of the methodology to obtain the desired goals. Full-wave EM simulation is kept completely out of the design loop, so our methodology significantly reduces the design time for RF embedded passive circuits. The proposed approach has been used successfully to generate layouts for band-pass filters of varying sizes.
Pages: 273-279
Citations: 0
Memory based computation using embedded cache for processor yield and reliability improvement
2007 25th International Conference on Computer Design Pub Date : 2007-10-01 DOI: 10.1109/ICCD.2007.4601922
Somnath Paul, S. Bhunia
Abstract: VLSI systems in the nanometer regime suffer from high defect rates and large parametric variations that lead to yield loss as well as reduced reliability of operation. In this paper, we propose a novel memory-based computation framework that exploits on-chip memory for reliable operation by transferring activity from a defective or unreliable functional unit to the embedded memory. This allows the die to run at a reduced performance level instead of being completely discarded or being throttled (in case of variations). We show that the proposed method improves yield and reliability in a superscalar out-of-order processor by tolerating defective functional units and allowing dynamic thermal management. The simulation results show that it entails only a small loss in performance (1.8% on average) at the cost of 9.5% of the area overhead required with hardware duplication.
Pages: 341-346
Citations: 1
Prioritizing verification via value-based correctness criticality
2007 25th International Conference on Computer Design Pub Date : 2007-10-01 DOI: 10.1109/ICCD.2007.4601921
Joonhyuk Yoo, M. Franklin
Abstract: Microprocessors are becoming increasingly susceptible to soft errors due to the current trends of semiconductor technology scaling. Traditional redundant multi-threading architectures provide good fault tolerance by re-executing all the computations. However, such full re-execution significantly increases the demand on processor resources, resulting in severe performance degradation. To address this problem, this paper introduces a correctness-criticality-based filter checker, which prioritizes the verification candidates so as to verify selectively. Binary Correctness Criticality (BCC) and Likelihood of Correctness Criticality (LoCC) are metrics that quantify, respectively, whether an instruction is important for reliability and how likely an instruction is to be correctness-critical. The likelihood of correctness criticality is computed from a value vulnerability factor, which is defined by the numerically significant bit-width used to compute a result. The proposed technique is accomplished by exploiting the information redundancy of compressing computationally useful data bits. Based on the likelihood of correctness criticality test, the filter checker mitigates the verification workload by bypassing instructions that are unimportant for correct execution. Extensive measurements show that the LoCC metric yields quite a wide distribution of values, indicating that it has the potential to differentiate diverse degrees of correctness criticality. Experimental results show that the proposed scheme accelerates a traditional fully fault-tolerant processor by 1.7 times, while it reduces the soft error rate to 18% of that of a non-fault-tolerant processor.
Pages: 333-340
Citations: 1
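The idea of a "numerically significant bit-width" mentioned in this abstract can be made concrete with a short sketch. The abstract does not give the formula for the value vulnerability factor, so the ratio computed by `exposed_fraction` below is a hypothetical stand-in chosen for illustration, as is the 32-bit default word size.

```python
def significant_bits(value: int) -> int:
    """Smallest two's-complement width that can represent `value`.

    Bits above this width are pure sign extension and carry no
    additional numeric information.
    """
    if value >= 0:
        return value.bit_length() + 1   # magnitude bits plus a 0 sign bit
    return (~value).bit_length() + 1    # magnitude bits plus a 1 sign bit


def exposed_fraction(value: int, width: int = 32) -> float:
    """Hypothetical metric: share of a `width`-bit word that is numerically significant."""
    return min(significant_bits(value), width) / width


for v in (0, 5, -1, -8, 127, -128):
    print(v, significant_bits(v), exposed_fraction(v))
```

Under this assumed metric, narrow values such as small loop counters would score low, while full-width values would score close to 1.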
A low overhead hardware technique for software integrity and confidentiality
2007 25th International Conference on Computer Design Pub Date : 2007-10-01 DOI: 10.1109/ICCD.2007.4601889
Austin Rogers, M. Milenkovic, A. Milenković
Abstract: Software integrity and confidentiality play a central role in making embedded computer systems resilient to various malicious actions, such as software attacks; probing and tampering with buses, memory, and I/O devices; and reverse engineering. In this paper we describe an efficient hardware mechanism that protects software integrity and guarantees software confidentiality. To provide software integrity, each instruction block is signed during program installation with a cryptographically secure signature. The signatures embedded in the code are verified during program execution. Software confidentiality is provided by encrypting instruction blocks. To achieve low performance overhead, the proposed mechanism combines several architectural enhancements: a variation of one-time-pad encryption, parallelizable signatures, and conditional execution of unverified instructions. The relatively high memory overhead due to embedded signatures can be reduced by protecting multiple instruction blocks with one signature, with minimal effects on complexity and performance overhead.
Pages: 113-120
Citations: 8
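A software analogy may help picture the pad-based encryption and per-block signatures this abstract mentions. The sketch below is not the paper's hardware design: it substitutes HMAC-SHA256 from the Python standard library both as the pad generator and as the signature, and the function names, the 8-byte address encoding and the two separate keys are assumptions made for illustration only.

```python
import hashlib
import hmac


def make_pad(key: bytes, address: int, length: int) -> bytes:
    """Derive a pad from the key and the block address (stand-in for a hardware cipher)."""
    out = b""
    counter = 0
    while len(out) < length:
        msg = address.to_bytes(8, "big") + counter.to_bytes(4, "big")
        out += hmac.new(key, msg, hashlib.sha256).digest()
        counter += 1
    return out[:length]


def xor_with_pad(block: bytes, pad: bytes) -> bytes:
    """XOR is its own inverse, so the same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(block, pad))


def sign_block(sig_key: bytes, address: int, ciphertext: bytes) -> bytes:
    """Signature bound to the block address so blocks cannot be swapped unnoticed."""
    return hmac.new(sig_key, address.to_bytes(8, "big") + ciphertext,
                    hashlib.sha256).digest()


def verify_block(sig_key: bytes, address: int, ciphertext: bytes, sig: bytes) -> bool:
    return hmac.compare_digest(sign_block(sig_key, address, ciphertext), sig)


enc_key, sig_key = b"e" * 16, b"s" * 16
plain = bytes(range(32))                      # one hypothetical 32-byte instruction block
cipher = xor_with_pad(plain, make_pad(enc_key, 0x1000, len(plain)))
sig = sign_block(sig_key, 0x1000, cipher)     # done once, at installation time
assert verify_block(sig_key, 0x1000, cipher, sig)
assert xor_with_pad(cipher, make_pad(enc_key, 0x1000, len(cipher))) == plain
```

Because the pad in this sketch depends only on the key and the address, it could be computed before the encrypted block arrives from memory, which is a commonly cited reason for using pad-based schemes on the fetch path.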
Why we need statistical static timing analysis
2007 25th International Conference on Computer Design Pub Date : 2007-10-01 DOI: 10.1109/ICCD.2007.4601885
C. Forzan, D. Pandini
Abstract: As technology continues to advance deeper into the nanometer regime, tight control of the process parameters is increasingly difficult. As a consequence, variability has turned out to be a dominant factor in the design of complex ICs. Traditional static timing analysis (STA) is becoming insufficient to accurately evaluate the impact of process variation on design performance, considering the increasing number of process, power supply voltage, and temperature (PVT) corners. In contrast, statistical static timing analysis (SSTA) is a promising innovative technique to handle increasingly larger environmental and process fluctuations, especially on-chip parameter variations. However, the statistical approach needs a set of costly additional data, such as an accurate process variation description and a statistical standard cell library characterization. In this paper, STA and SSTA are applied to a real industrial design to compare the two techniques in terms of both accuracy and cost. From our analysis, we have concluded that the potential advantages offered by SSTA exceed the additional library characterization cost and process data assembly effort.
Pages: 91-96
Citations: 17
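A textbook-style calculation gives some intuition for the gap between corner-based and statistical timing that this abstract discusses. The sketch below is not taken from the paper: it assumes a single path of gates whose delays are independent Gaussian variables, and the gate count, mean and sigma values are made up for illustration. Real SSTA also has to handle correlated variations and the max operation across converging paths.

```python
import math
from typing import Sequence


def corner_delay(means: Sequence[float], sigmas: Sequence[float], k: float = 3.0) -> float:
    """Worst-case corner: every gate is simultaneously at its own k-sigma delay."""
    return sum(m + k * s for m, s in zip(means, sigmas))


def statistical_delay(means: Sequence[float], sigmas: Sequence[float], k: float = 3.0) -> float:
    """k-sigma point of the path delay when gate delays are independent Gaussians."""
    path_sigma = math.sqrt(sum(s * s for s in sigmas))
    return sum(means) + k * path_sigma


# Hypothetical path: 10 gates, each 100 ps mean delay with 5 ps sigma.
means = [100.0] * 10
sigmas = [5.0] * 10
print(f"corner:      {corner_delay(means, sigmas):.1f} ps")
print(f"statistical: {statistical_delay(means, sigmas):.1f} ps")
```

Under these assumptions the corner estimate adds 150 ps of margin to the 1000 ps nominal delay, while the statistical estimate adds roughly 47 ps, because independent variations partly cancel along the path.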
CMOS logic design with independent-gate FinFETs
2007 25th International Conference on Computer Design Pub Date : 2007-10-01 DOI: 10.1109/ICCD.2007.4601953
Anish Muttreja, Niket Agarwal, N. Jha
Abstract: Fin-type field-effect transistors (FinFETs) are promising substitutes for bulk CMOS in nano-scale circuits. In this paper, it is observed that in spite of improved device characteristics, high active leakage may remain a problem for FinFET logic circuits. Leakage is found to contribute 31.3% of total power consumption in power-optimized FinFET logic circuits. Various FinFET logic design styles, based on independent control of FinFET gates, are studied. A new low-leakage logic style is presented. Leakage (total) power savings of 64.7% (14.5%) under tight delay constraints and 91.2% (37.2%) under relaxed delay constraints, through the judicious use of FinFET logic styles, are demonstrated.
Pages: 560-567
Citations: 150
Maximizing the throughput-area efficiency of fully-parallel low-density parity-check decoding with C-slow retiming and asynchronous deep pipelining
2007 25th International Conference on Computer Design Pub Date : 2007-10-01 DOI: 10.1109/ICCD.2007.4601964
Ming Su, Lili Zhou, C. Shi
Abstract: In this paper, we apply C-slow retiming and asynchronous deep pipelining to maximize the throughput-area efficiency of fully parallel low-density parity-check (LDPC) decoding. Pipelined decoders are implemented in a 0.18 µm FDSOI CMOS process. Experimental results show that our pipelining technique is an efficient approach to maximizing LDPC decoding throughput while minimizing area consumption. First, pipelined decoders can achieve extraordinarily high throughput that non-pipelined designs cannot. Second, for the same throughput, pipelined decoders use less area than non-pipelined designs. Our approach can improve the throughput of a published implementation by 4 times with only about 80% area overhead. Without using clocks, the proposed asynchronous pipelined decoders are more scalable in design complexity and more robust to process-voltage-temperature variations than existing clock-based LDPC decoders.
Pages: 636-643
Citations: 10
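The throughput and area figures quoted in this abstract can be combined into a single efficiency ratio. The short calculation below assumes that "4 times" means four times the baseline throughput and that "80% area overhead" means 1.8 times the baseline area; both readings are interpretations of the abstract, not numbers reported separately by the authors.

```python
def efficiency_gain(throughput_ratio: float, area_overhead: float) -> float:
    """Relative change in throughput per unit area versus the baseline design."""
    return throughput_ratio / (1.0 + area_overhead)


gain = efficiency_gain(throughput_ratio=4.0, area_overhead=0.80)
print(f"throughput-area efficiency gain: {gain:.2f}x")
```

Under those assumptions the pipelined decoder delivers a little over twice the throughput per unit area of the published baseline.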