Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors: Latest Articles

Methodologies and tools for pipelined on-chip interconnect
L. Scheffer
{"title":"Methodologies and tools for pipelined on-chip interconnect","authors":"L. Scheffer","doi":"10.1109/ICCD.2002.1106763","DOIUrl":"https://doi.org/10.1109/ICCD.2002.1106763","url":null,"abstract":"As processes shrink, gate delay improves much faster than the delay in long wires. Therefore, the long wires increasingly determine the maximum clock rate, and hence performance, of more and more chips. One solution to this problem is to pipeline the global interconnect, enabling the whole chip to run at the speed of local operations. While known to work well, this optimization is seldom used because of practical difficulties - it is hard to change the RTL, test vectors become invalid, and it's hard to prove correctness of any changes. Here we look at some ways these difficulties could be overcome.","PeriodicalId":164768,"journal":{"name":"Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"195 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116105560","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 43
Branch behavior of a commercial OLTP workload on Intel IA32 processors
M. Annavaram, T. Diep, John Paul Shen
{"title":"Branch behavior of a commercial OLTP workload on Intel IA32 processors","authors":"M. Annavaram, T. Diep, John Paul Shen","doi":"10.1109/ICCD.2002.1106777","DOIUrl":"https://doi.org/10.1109/ICCD.2002.1106777","url":null,"abstract":"This paper presents a detailed branch characterization of an Oracle based commercial on-line transaction processing workload, Oracle Database Benchmark (ODB), running on an IA32 processor. We ran a well-tuned ODB on Simics, a full system simulator, to collect the instruction traces used in this study. We compare the branch behavior of ODB with the branch behaviors of gcc, gzip and mcf from the SPECINT 2000 benchmark suite. Contrary to the popular belief that databases have unpredictable branches, we show that using larger predictors that capture enough branch history information, and using branch prediction schemes that reduce aliasing, conditional branches in ODB are more predictable than in gcc, gzip and mcf. Due to frequent context switching in ODB, a hardware return address stack is ineffective in predicting return addresses for ODB. Based on further analysis, we propose and evaluate an enhanced return address predictor, which reduces return address mispredictions in ODB by 40%.","PeriodicalId":164768,"journal":{"name":"Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116897515","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 15
Power-constrained microprocessor design
H. P. Hofstee
{"title":"Power-constrained microprocessor design","authors":"H. P. Hofstee","doi":"10.1109/ICCD.2002.1106740","DOIUrl":"https://doi.org/10.1109/ICCD.2002.1106740","url":null,"abstract":"Power dissipation and power density have become first-order design constraints, even for high-performance systems. For future designs it will be the dominant constraint. In this paper we suggest a systematic approach to optimizing a processor design under (only) a power constraint. The approach uses the energy-performance ratio (EPR) of the various design parameters as the key to identifying opportunities for improving energy-efficiency.","PeriodicalId":164768,"journal":{"name":"Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132269001","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 29
A 10 Gbps full-AES crypto design with a twisted-BDD S-Box architecture
S. Morioka, Akashi Satoh
{"title":"A 10 Gbps full-AES crypto design with a twisted-BDD S-Box architecture","authors":"S. Morioka, Akashi Satoh","doi":"10.1109/ICCD.2002.1106754","DOIUrl":"https://doi.org/10.1109/ICCD.2002.1106754","url":null,"abstract":"In this paper, we present a high-speed AES IP-core, which runs at 780 MHz on a 0.13 μm CMOS standard cell library, and which achieves 10 Gbps throughput in all encryption modes, including CBC mode. Although the CBC mode is the most widely used and important, achieving such high throughput was difficult because pipelining techniques cannot be applied. To reduce the propagation delays of the S-Box, the most critical function block, we developed a special circuit architecture that we call twisted-BDD, where the fanout of signals is distributed in the S-Box circuit. Our S-Box is 1.5 to 2 times faster than the conventional S-Box implementations. The T-Box algorithm, which merges the S-Box and another primitive function (MixColumns) into a single function, is also used for an additional speedup.","PeriodicalId":164768,"journal":{"name":"Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126260936","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 55
GPE: a new representation for VLSI floorplan problem
Chang-Tzu Lin, De-Sheng Chen, Yiwen Wang
{"title":"GPE: a new representation for VLSI floorplan problem","authors":"Chang-Tzu Lin, De-Sheng Chen, Yiwen Wang","doi":"10.1109/ICCD.2002.1106745","DOIUrl":"https://doi.org/10.1109/ICCD.2002.1106745","url":null,"abstract":"In this paper, we propose a new representation of VLSI floorplan and building block problem. The representation is the generalization of Polish expression. By proposing a new relational operator, the representation can efficiently reuse some area that cannot be utilized if only having vertical and horizontal operators defined in Polish expression, and is able to present non-slicing structural floorplan. The experimental results show that the representation achieves promising area utilization in commonly used MCNC benchmark circuits.","PeriodicalId":164768,"journal":{"name":"Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"119 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127267843","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 15
Requirements for automotive system engineering tools
Joachim Schlosser
{"title":"Requirements for automotive system engineering tools","authors":"Joachim Schlosser","doi":"10.1109/ICCD.2002.1106795","DOIUrl":"https://doi.org/10.1109/ICCD.2002.1106795","url":null,"abstract":"The requirements to system and software development tools brought up by the automotive industry differ from the requirements that other customers have. The important catchwords here are heterogeneity of suppliers, tools, technical background of the engineers, and - partially resulting from the just mentioned - the overall complexity of the systems that are built up. There are multiple suppliers delivering multiple programs and units, and all these are to be integrated into a car that has to meet a huge number of constraints regarding safety, reliability and consumer demands. This paper shows what the design of electric and electronic car systems is and has to be like, and what qualifications the methodology and the process therefore has to meet. From these two points a collection of requirements to the tools and the tool chain is derived, with a special focus on simulation tools.","PeriodicalId":164768,"journal":{"name":"Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129810650","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
Power-performance trade-offs for energy-efficient architectures: A quantitative study
Hongbo Yang, R. Govindarajan, G. Gao, K. B. Theobald
{"title":"Power-performance trade-offs for energy-efficient architectures: A quantitative study","authors":"Hongbo Yang, R. Govindarajan, G. Gao, K. B. Theobald","doi":"10.1109/ICCD.2002.1106766","DOIUrl":"https://doi.org/10.1109/ICCD.2002.1106766","url":null,"abstract":"The drastic increase in power consumption by modern processors emphasizes the need for power-performance trade-offs in architecture design space exploration and compiler optimizations. This paper reports a quantitative study on the power-performance trade-offs in software pipelined schedules for an Itanium-like EPIC architecture with dual-speed pipelines, in which functional units are partitioned into fast ones and slow ones. We have developed an integer linear programming formulation to capture the power-performance tradeoffs for software pipelined loops. The proposed integer linear programming formulation and its solution method have been implemented and tested on a set of SPEC2000 benchmarks. The results are compared with an Itanium-like architecture (baseline) in which there are four functional units (FUs) and all of them are fast units. Our quantitative study reveals that by introducing a few slow FUs in place of fast FUs in the baseline architecture, the total energy consumed by FUs can be considerably reduced. When 2 out of 4 FUs are set as slow, the total energy consumed by FUs is reduced by up to 31.1% (with an average reduction of 25.2%) compared with the baseline configuration, while the performance degradation caused by using slow FUs is small. If performance demand is less critical, then energy reduction of up to 40.3% compared with the baseline configuration can be achieved.","PeriodicalId":164768,"journal":{"name":"Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126100608","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
Cache design for eliminating the address translation bottleneck and reducing the tag area cost
Yen-Jen Chang, F. Lai, S. Ruan
{"title":"Cache design for eliminating the address translation bottleneck and reducing the tag area cost","authors":"Yen-Jen Chang, F. Lai, S. Ruan","doi":"10.1109/ICCD.2002.1106791","DOIUrl":"https://doi.org/10.1109/ICCD.2002.1106791","url":null,"abstract":"For physical caches, the address translation delay can be partially masked, but it is hard to avoid completely. In this paper, we propose a cache partition architecture, called paged cache, which not only masks the address translation delay completely but also reduces the tag area dramatically. In the paged cache, we divide the entire cache into a set of partitions, and each partition is dedicated to only one page cached in the TLB. By restricting the range in which the cached block can be placed, we can eliminate the total or partial tag depending on the partition size. In addition, because the paged cache can be accessed without waiting for the generation of physical address, i.e., the paged cache and the TLB are accessed in parallel, the extended cache access time can be reduced significantly. We use SimpleScalar to simulate SPEC2000 benchmarks and perform HSPICE simulations (with a 0.18 μm technology and 1.8 V voltage supply) to evaluate the proposed architecture. Experimental results show that the paged cache is very effective in reducing tag area of the on-chip L1 caches, while the average extended cache access time can be improved dramatically.","PeriodicalId":164768,"journal":{"name":"Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"189 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131585799","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Analysis of the tradeoffs for the implementation of a high-radix logarithm
José-Alejandro Piñeiro, M. Ercegovac, J. Bruguera
{"title":"Analysis of the tradeoffs for the implementation of a high-radix logarithm","authors":"José-Alejandro Piñeiro, M. Ercegovac, J. Bruguera","doi":"10.1109/ICCD.2002.1106760","DOIUrl":"https://doi.org/10.1109/ICCD.2002.1106760","url":null,"abstract":"An analysis of the tradeoffs between area and speed for a sequential implementation of a high-radix recurrence for logarithm computation is presented in this paper. The high-radix algorithm is outlined and a sequential architecture is proposed, with the use of selection by rounding of the digits and redundant representation. Estimates of the execution time and total area are obtained for n = 16, 32 and 64 bits of precision and for radix values from r = 8 to r = 1024. An analysis of the tradeoffs between area and speed is presented, showing that the most efficient implementations are obtained for radices r = 256 for 16, 32 bit and r = 128 for 64 bit computations.","PeriodicalId":164768,"journal":{"name":"Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132837188","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Physical design challenges for billion transistor chips
P. Groeneveld
{"title":"Physical design challenges for billion transistor chips","authors":"P. Groeneveld","doi":"10.1109/ICCD.2002.1106751","DOIUrl":"https://doi.org/10.1109/ICCD.2002.1106751","url":null,"abstract":"Advancing process technology will necessitate an even more rigorous automation of the IC design trajectory. The design scale will increase with Moore's law, approaching 1,000,000,000 transistors in the coming years. This enables the design of SoC systems with complexities unprecedented in human history. At the same time the physics of silicon manufacturing is increasing the 'silicon complexity'. Additional design steps are required to address cross talk, voltage drop, antenna rules and others. Much more so than in previous technology nodes, the effects of parasitics must be addressed at various stages of the IC design flow. Nothing less than a full automation of the silicon complexity issues is required to stop the design productivity gap from growing.","PeriodicalId":164768,"journal":{"name":"Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123807600","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8