2008 IEEE International Conference on Computer Design: Latest Papers

Mathematical analysis of buffer sizing for Network-on-Chips under multimedia traffic
2008 IEEE International Conference on Computer Design Pub Date : 2008-10-01 DOI: 10.1109/ICCD.2008.4751854
A. Khonsari, Mohammadreza Aghajani, Arash Tavakkol, M. S. Talebi
{"title":"Mathematical analysis of buffer sizing for Network-on-Chips under multimedia traffic","authors":"A. Khonsari, Mohammadreza Aghajani, Arash Tavakkol, M. S. Talebi","doi":"10.1109/ICCD.2008.4751854","DOIUrl":"https://doi.org/10.1109/ICCD.2008.4751854","url":null,"abstract":"Designing appropriate buffer sizes for routers within Network-on-Chip (NoC) so as to minimize the power while preserving the required performance in the presence of self-similar traffic has been considered a challenging problem in the literature. A few analytical studies carried out in NoC modeling have adopted assumptions such as exponentially-distributed packet inter-arrivals, and conclusions reached under such assumptions may be inappropriate in the presence of self-similar traffic. Through mathematical analysis this paper predicts the optimal buffer size under self-similar traffic using Discrete Poisson Pareto Burst Process (DPPBP). The validity of the mathematical expressions is demonstrated through simulation experiments.","PeriodicalId":345501,"journal":{"name":"2008 IEEE International Conference on Computer Design","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114248451","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 14
Test cost minimization through adaptive test development
2008 IEEE International Conference on Computer Design Pub Date : 2008-10-01 DOI: 10.1109/ICCD.2008.4751867
Mingjing Chen, A. Orailoglu
{"title":"Test cost minimization through adaptive test development","authors":"Mingjing Chen, A. Orailoglu","doi":"10.1109/ICCD.2008.4751867","DOIUrl":"https://doi.org/10.1109/ICCD.2008.4751867","url":null,"abstract":"The ever-increasing complexity of mixed-signal circuits imposes an increasingly complicated and comprehensive parametric test requirement, resulting in a highly lengthened manufacturing test phase. Attaining parametric test cost reduction with no test quality degradation constitutes a critical challenge during test development. The capability of parametric test data to capture systematic process variations engenders a highly accurate prediction of the efficiency of each test for a particular lot of chips even on the basis of a small quantity of characterized data. The predicted test efficiency further enables the adjustment of the test set and test order, leading to an early detection of faults. We explore such an adaptive strategy, by introducing a technique that prunes the test set based on a test correlation analysis. A test selection algorithm is proposed to identify the minimum set of tests that delivers a satisfactory defect coverage. A probabilistic measure that reflects the defect detection efficiency is used to order the test set so as to enhance the probability of an early detection of faulty chips. The test sequence is further optimized during the testing process by dynamically adjusting the initial test order to adapt to the local defect pattern fluctuations in the lot of chips under test. 
Experimental results show that the proposed technique delivers significant test time reductions while attaining higher test quality compared to previous adaptive test methodologies.","PeriodicalId":345501,"journal":{"name":"2008 IEEE International Conference on Computer Design","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127890990","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 45
SynECO: Incremental technology mapping with constrained placement and fast detail routing for predictable timing improvement
2008 IEEE International Conference on Computer Design Pub Date : 2008-10-01 DOI: 10.1109/ICCD.2008.4751915
Anuj Kumar, Tai-Hsuan Wu, A. Davoodi
{"title":"SynECO: Incremental technology mapping with constrained placement and fast detail routing for predictable timing improvement","authors":"Anuj Kumar, Tai-Hsuan Wu, A. Davoodi","doi":"10.1109/ICCD.2008.4751915","DOIUrl":"https://doi.org/10.1109/ICCD.2008.4751915","url":null,"abstract":"We present SynECO, a framework to achieve predictable timing improvement via incremental resynthesis and replacement. We target timing-critical paths postplacement and resynthesize and replace promising gates. We show since the wire delays are the non-negligible contributors to a critical-path delay, it is crucial to accurately estimate them to make a predictable synthesis modification. For this purpose, we incorporate an accurate timing analysis tool which uses fast detail routing for wire delay estimation. This allows generating timing estimates that correlate much better with post-routing values compared to Steiner-tree-based estimate of wiring tree and using D2M delay model. Detail routing information allows incorporation of factors such as crosstalk, metal layer assignment and via delays which are crucial for accurate analysis. For fast synthesis, we constrain our logical modifications to be from the physical neighborhood of target gates on the critical paths. Our synthesis framework is completely integrated with the Cadence Encounter tools for physical design.","PeriodicalId":345501,"journal":{"name":"2008 IEEE International Conference on Computer Design","volume":"134 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130989020","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Low-cost open-page prefetch scheduling in chip multiprocessors
2008 IEEE International Conference on Computer Design Pub Date : 2008-10-01 DOI: 10.1109/ICCD.2008.4751890
Marius Grannæs, Magnus Jahre, L. Natvig
{"title":"Low-cost open-page prefetch scheduling in chip multiprocessors","authors":"Marius Grannæs, Magnus Jahre, L. Natvig","doi":"10.1109/ICCD.2008.4751890","DOIUrl":"https://doi.org/10.1109/ICCD.2008.4751890","url":null,"abstract":"The pressure on off-chip memory increases significantly as more cores compete for the same resources. A CMP deals with the memory wall by exploiting thread level parallelism (TLP), shifting the focus from reducing overall memory latency to memory throughput. This extends to the memory controller where the 3D structure of modern DRAM is exploited to increase throughput. Traditionally, prefetching reduces latency by fetching data before it is needed. In this paper we explore how prefetching can be used to increase memory throughput. We present our own low-cost open-page prefetch scheduler that exploits the 3D structure of DRAM when issuing prefetches. We show that because of the complex structure of modern DRAM, prefetches can be made cheaper than ordinary reads, thus making prefetching beneficial even when prefetcher accuracy is low. As a result, prefetching with good coverage is more important than high accuracy. By exploiting this observation our low-cost open page scheme increases performance and QoS. 
Furthermore, we explore how prefetches should be scheduled in a state of the art memory controller by examining sequential, scheduled region, CZone/delta correlation and reference prediction table prefetchers.","PeriodicalId":345501,"journal":{"name":"2008 IEEE International Conference on Computer Design","volume":"162 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115831101","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Conversion driven design of binary to mixed radix circuits
2008 IEEE International Conference on Computer Design Pub Date : 2008-10-01 DOI: 10.1109/ICCD.2008.4751893
A. Rafiev, Julian P. Murphy, D. Sokolov, A. Yakovlev
{"title":"Conversion driven design of binary to mixed radix circuits","authors":"A. Rafiev, Julian P. Murphy, D. Sokolov, A. Yakovlev","doi":"10.1109/ICCD.2008.4751893","DOIUrl":"https://doi.org/10.1109/ICCD.2008.4751893","url":null,"abstract":"A conversion driven design approach is described. It takes the outputs of mature and time-proven EDA synthesis tools to generate mixed radix datapath circuits in an endeavour to investigate the added relative advantages or disadvantages. An algorithm underpinning the approach is presented and formally described together with m-of-n encoded gate-level implementations. The application is found in a wide variety and overlapping areas of circuit design, here a subset are analysed where the method finds the strongest application: arithmetic circuits and hardware security. The obtained results are reported showing an increase in power consumption but with considerable improvement in resistance to differential power analysis (DPA).","PeriodicalId":345501,"journal":{"name":"2008 IEEE International Conference on Computer Design","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130858688","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Dynamically reconfigurable soft output MIMO detector
2008 IEEE International Conference on Computer Design Pub Date : 2008-10-01 DOI: 10.1109/ICCD.2008.4751842
Pankaj Bhagawat, Rajballav Dash, G. Choi
{"title":"Dynamically reconfigurable soft output MIMO detector","authors":"Pankaj Bhagawat, Rajballav Dash, G. Choi","doi":"10.1109/ICCD.2008.4751842","DOIUrl":"https://doi.org/10.1109/ICCD.2008.4751842","url":null,"abstract":"MIMO systems (with multiple transmit and receive antennas) are becoming increasingly popular, and many next-generation systems such as WiMAX, 3GPP LTE and IEEE 802.11n wireless LANs rely on the increased throughput of MIMO systems with up to four antennas at receiver and transmitter. High throughput implementation of the detection unit for MIMO systems is a significant challenge. This challenge becomes still harder, because the above-mentioned standards demand support for multiple modulation and coding schemes. This implies that the MIMO detector must be dynamically reconfigurable. Also, to achieve required bit error rate (BER) or frame error rate (FER) performance, the detector has to provide soft values to advanced forward error correction (FEC) schemes like turbo codes. This paper presents an ASIC implementation of a novel MIMO detector architecture that is able to reconfigure on the fly and provides soft values as output. The design is implemented in 45 nm predictive technology library, and has a parallelism factor of four. The detector has many qualities of a systolic architecture and achieves a continuous throughput of 1 Gbps for QPSK, 500 Mbps for 16-QAM, and 187.5 Mbps for 64-QAM. 
The total area is estimated to be approximately 70 KGates equivalent, and power consumption is estimated to be 114 mW.","PeriodicalId":345501,"journal":{"name":"2008 IEEE International Conference on Computer Design","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130868165","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 18
Simulation points for SPEC CPU 2006
2008 IEEE International Conference on Computer Design Pub Date : 2008-10-01 DOI: 10.1109/ICCD.2008.4751891
Arun A. Nair, L. John
{"title":"Simulation points for SPEC CPU 2006","authors":"Arun A. Nair, L. John","doi":"10.1109/ICCD.2008.4751891","DOIUrl":"https://doi.org/10.1109/ICCD.2008.4751891","url":null,"abstract":"Increasing sizes of benchmarks make detailed simulation an extremely time consuming process. Statistical techniques such as the SimPoint methodology have been proposed in order to address this problem during the initial design phase. The SimPoint methodology attempts to identify repetitive, long, large-grain phases in programs and predict the performance of the architecture based on its aggregate performance on the individual phases. This study attempts to compare accuracy of the SimPoint methodology for the SPEC CPU 2006 benchmark suite with that of SPEC CPU 2000 and to study the large-grain phases in the two benchmark suites using the SimPoint methodology. We find that there has not been a significant increase in the number of simulation points required to accurately predict the behavior of the programs in SPEC CPU 2006, despite its significantly larger data footprint and dynamic instruction count. 
We also find that the programs in both benchmark suites have similar characteristics in terms of the number of phases that contribute significantly towards overall behavior, further emphasizing the similarity between the two benchmark suites with respect to the number of simulation points required for similar accuracies.","PeriodicalId":345501,"journal":{"name":"2008 IEEE International Conference on Computer Design","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125095867","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 35
Characterization and design of sequential circuit elements to combat soft error
2008 IEEE International Conference on Computer Design Pub Date : 2008-10-01 DOI: 10.1109/ICCD.2008.4751861
H. Abrishami, S. Hatami, Massoud Pedram
{"title":"Characterization and design of sequential circuit elements to combat soft error","authors":"H. Abrishami, S. Hatami, Massoud Pedram","doi":"10.1109/ICCD.2008.4751861","DOIUrl":"https://doi.org/10.1109/ICCD.2008.4751861","url":null,"abstract":"This paper performs analysis and design of latches and flip-flops while considering the effect of event upsets caused by energetic particle hits. First it is shown that the conventional analysis of this effect in sequential circuit elements (SCEs) tends to underestimate the threat posed by such events. More precisely, there exists a timing window close to the triggering edge of the clock during which a SCE is more vulnerable to the particle hit. This phenomenon has been ignored by previous work, resulting in false negatives. Next the paper explains how to size transistors of a familiar SCE i.e., a clocked CMOS latch, to make it more robust to such events. Experimental results to validate the characterization and transistor sizing steps are provided and discussed.","PeriodicalId":345501,"journal":{"name":"2008 IEEE International Conference on Computer Design","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121302511","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Adaptive techniques for leakage power management in L2 cache peripheral circuits
2008 IEEE International Conference on Computer Design Pub Date : 2008-10-01 DOI: 10.1109/ICCD.2008.4751917
H. Homayoun, A. Veidenbaum, J. Gaudiot
{"title":"Adaptive techniques for leakage power management in L2 cache peripheral circuits","authors":"H. Homayoun, A. Veidenbaum, J. Gaudiot","doi":"10.1109/ICCD.2008.4751917","DOIUrl":"https://doi.org/10.1109/ICCD.2008.4751917","url":null,"abstract":"Recent studies indicate that a considerable amount of an L2 cache's leakage power is dissipated in its peripheral circuits, e.g., decoders, word-lines and I/O drivers. In addition, L2 cache is becoming larger, thus increasing the leakage power. This paper proposes two adaptive architectural techniques (ADM and ASM) to reduce leakage in the L2 cache peripheral circuits. The adaptive techniques use the product of cache hierarchy miss rates to guide the leakage control in accordance with program behavior. The results for SPEC2K benchmarks show that the first technique (ASM) achieves a 34% average leakage power reduction with a 1.8% average IPC reduction. The second technique (ADM) achieves a 52% average savings with a 1.9% average IPC reduction. This corresponds to a 2 to 3x improvement over recently proposed static techniques.","PeriodicalId":345501,"journal":{"name":"2008 IEEE International Conference on Computer Design","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125804978","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
Post-silicon verification for cache coherence
2008 IEEE International Conference on Computer Design Pub Date : 2008-10-01 DOI: 10.1109/ICCD.2008.4751884
A. DeOrio, Adam Bauserman, V. Bertacco
{"title":"Post-silicon verification for cache coherence","authors":"A. DeOrio, Adam Bauserman, V. Bertacco","doi":"10.1109/ICCD.2008.4751884","DOIUrl":"https://doi.org/10.1109/ICCD.2008.4751884","url":null,"abstract":"Modern processor designs are extremely complex and difficult to validate during development, causing a growing portion of the verification effort to shift to post-silicon, after the first few hardware prototypes become available. Extremely slow simulation speeds during pre-silicon verification result in functional errors escaping into silicon, a problem that is further exacerbated by the growing complexity of the memory subsystem in multi-core platforms. In this work we present CoSMa, a novel technology offering high coverage functional post-silicon validation of cache coherence protocols in multi-core systems. It enables the detection and diagnosis of functional errors in the memory subsystem by recording at runtime a compact encoding of the operations occurring at each cache line and checking their correctness at regular intervals. We leverage the system's existing memory resources to store the required activity, thus minimizing area overhead. When the system is finally ready for customer shipment, CoSMa can be completely disabled, eliminating any performance or memory overhead. 
We reproduce in our experiments a set of coherence protocol bugs based on published errata documents of commercial multi-core designs, and show that CoSMa is highly effective in detecting them.","PeriodicalId":345501,"journal":{"name":"2008 IEEE International Conference on Computer Design","volume":"353 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123188581","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 38