2007 25th International Conference on Computer Design最新文献

筛选
英文 中文
Voltage drop reduction for on-chip power delivery considering leakage current variations 考虑泄漏电流变化的片上供电电压降降低
2007 25th International Conference on Computer Design Pub Date : 2007-10-01 DOI: 10.1109/ICCD.2007.4601883
Jeffrey Fan, N. Mi, S. Tan
{"title":"Voltage drop reduction for on-chip power delivery considering leakage current variations","authors":"Jeffrey Fan, N. Mi, S. Tan","doi":"10.1109/ICCD.2007.4601883","DOIUrl":"https://doi.org/10.1109/ICCD.2007.4601883","url":null,"abstract":"In this paper, we propose a novel on-chip voltage drop reduction technique for on-chip power delivery networks of VLSI systems in the presence of variational leakage current sources. The new method inserts decoupling capacitors (decaps) into the power grid networks to reduce the voltage fluctuation. The optimization is based on sensitivity-based conjugate gradientmethod and sequence of linear programming approach. Different from existing power grid noise reduction methods, the new approach considers the impacts of inter-die and intra-die variational leakage current sources due to unavoidable process variability during the decap optimization process for the first time. Leakage currents, which although are static in nature typically, can still add to the total voltage drops and dynamic voltage reduction thus must consider the leakage-induced voltage variations. The proposed algorithm exploits the relative constant variations for different decap configurations of power grid circuits to speed up the statistical optimization process. Decaps can be inserted in such a way that the resulting circuits have much higher probability to meet the voltage drop constraints in the presence of leakage current variations. Experimental results demonstrate the effectiveness of the proposed approach and show that the new method has 100X to 1,000X of speedup over the Monte Carlo based statistical decap optimization method.","PeriodicalId":6306,"journal":{"name":"2007 25th International Conference on Computer Design","volume":"79 1","pages":"78-83"},"PeriodicalIF":0.0,"publicationDate":"2007-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73319654","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 6
Automatic SystemC TLM generation for custom communication platforms 自定义通信平台的自动SystemC TLM生成
2007 25th International Conference on Computer Design Pub Date : 2007-10-01 DOI: 10.1109/ICCD.2007.4601878
Lochi Yu, S. Abdi
{"title":"Automatic SystemC TLM generation for custom communication platforms","authors":"Lochi Yu, S. Abdi","doi":"10.1109/ICCD.2007.4601878","DOIUrl":"https://doi.org/10.1109/ICCD.2007.4601878","url":null,"abstract":"This paper presents a tool for automatic generation of transaction level models (TLMs) in SystemC for MPSoC designs with custom communication platforms. The MPSoC platform is captured as a graphical net-list of components, busses and bridge elements. The application is captured as C processes mapped to the platform components. Once the platform is decided, a set of transaction level communication APIs is automatically generated for each application C process. After the C code is input, an executable SystemC TLM of the design is automatically generated using our tool. This TLM can be executed using standard SystemC simulators for early functional verification of the design. Although, several TLM styles and standards have been proposed in the past, our approach differs in the fact that the designers do not need to understand the underlying SystemC code or TLM modeling style to verify that their application executes on the selected platform. Another key advantage of our tool is that the platform can be easily customized for the application and a new TLM for that platform can be automatically generated. The TLM can be used to program the custom platform early in the design cycle before the components are available. Our experimental results demonstrate that for large industrial applications such as MP3 decoder and H.264, high-speed TLMs can be generated for several platforms in a few seconds.","PeriodicalId":6306,"journal":{"name":"2007 25th International Conference on Computer Design","volume":"59 1","pages":"41-46"},"PeriodicalIF":0.0,"publicationDate":"2007-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91538619","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 6
System level power estimation methodology with H.264 decoder prediction IP case study 系统级功率估计方法与H.264解码器预测IP的案例研究
2007 25th International Conference on Computer Design Pub Date : 2007-10-01 DOI: 10.1109/ICCD.2007.4601959
Young-Hwan Park, S. Pasricha, F. Kurdahi, N. Dutt
{"title":"System level power estimation methodology with H.264 decoder prediction IP case study","authors":"Young-Hwan Park, S. Pasricha, F. Kurdahi, N. Dutt","doi":"10.1109/ICCD.2007.4601959","DOIUrl":"https://doi.org/10.1109/ICCD.2007.4601959","url":null,"abstract":"This paper presents a methodology to generate a hierarchy of power models for power estimation of custom hardware IP blocks, enabling a trade-off between power estimation accuracy, modeling effort and estimation speed. Our power estimation approach enables several novel system-level explorations - such as observing the effect of clock gating, and the effects of tweaking application-level parameters on system power - with an estimation accuracy that is close to the gate-level. We implemented our methodology on an H.264 video decoder prediction IP case study, created power models, and evaluated the effects of varying design parameters (e.g., clock gating, IIP frame ratios, quantization), allowing rapid system-level power exploration of these design parameters.","PeriodicalId":6306,"journal":{"name":"2007 25th International Conference on Computer Design","volume":"79 1","pages":"601-608"},"PeriodicalIF":0.0,"publicationDate":"2007-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79066799","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 14
Energy-aware co-processor selection for embedded processors on FPGAs fpga上嵌入式处理器的能量感知协处理器选择
2007 25th International Conference on Computer Design Pub Date : 2007-10-01 DOI: 10.1109/ICCD.2007.4601895
A. H. Gholamipour, E. Bozorgzadeh, Sudarshan Banerjee
{"title":"Energy-aware co-processor selection for embedded processors on FPGAs","authors":"A. H. Gholamipour, E. Bozorgzadeh, Sudarshan Banerjee","doi":"10.1109/ICCD.2007.4601895","DOIUrl":"https://doi.org/10.1109/ICCD.2007.4601895","url":null,"abstract":"In this paper, we present co-processor selection problem for minimum energy consumption in hw/sw co-design on FPGAs with dual power mode. We provide theoretical analysis for the problem under no constraint, resource constraint, and timing constraint. We prove that the complexity of the problem in each case is NP-Hard and we provide a generalized ILP formulation. We compared the result of our approach in minimizing energy to the result of other approaches that had not considered both static and dynamic power during optimization and we showed that we can reduce energy by 63% in some cases.","PeriodicalId":6306,"journal":{"name":"2007 25th International Conference on Computer Design","volume":"413 1","pages":"158-163"},"PeriodicalIF":0.0,"publicationDate":"2007-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79214170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 5
Two-level ata prefetching 两级数据预取
2007 25th International Conference on Computer Design Pub Date : 2007-10-01 DOI: 10.1109/ICCD.2007.4601908
Fei Gao, Hanyu Cui, S. Sair
{"title":"Two-level ata prefetching","authors":"Fei Gao, Hanyu Cui, S. Sair","doi":"10.1109/ICCD.2007.4601908","DOIUrl":"https://doi.org/10.1109/ICCD.2007.4601908","url":null,"abstract":"Data prefetching has been shown to be an effective tool in hiding part of the latency associated with cache misses in modern processors. Traditionally, data prefetchers fetch data into a small prefetch buffer near the LI for low latency, or the L2 cache for greater coverage and less cache pollution. However, with the L1-L2 cache speed gap growing, significant performance gains can be obtained if the data pref etcher can operate as aggressively as an L2-level pref etcher but with the fast hit times of an LI-level pref etcher. In this paper, we propose a prefetching framework where an LI-level prefetcher and an L2- level prefetcher work cooperatively to reduce the average access time more than either one alone can. We evaluate several design alternatives suited to perform synergistically under different workloads. From the insight we gather from this analysis, we propose a confidence-based adaptive prefetcher that can improve prefetch efficiency significantly with judicious use of available bus bandwidth. Our results show that for certain prefetcher combinations, two- level prefetching can achieve the cumulative speedup attained from either prefetcher alone. Furthermore, when compared to other two-level prefetching models, the adaptive design provides similar speedups with appreciably less bus traffic.","PeriodicalId":6306,"journal":{"name":"2007 25th International Conference on Computer Design","volume":"36 1","pages":"238-244"},"PeriodicalIF":0.0,"publicationDate":"2007-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77428284","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
CAP: Criticality analysis for power-efficient speculative multithreading CAP:高能效推测多线程的临界性分析
2007 25th International Conference on Computer Design Pub Date : 2007-10-01 DOI: 10.1109/ICCD.2007.4601932
James Tuck, Wei Liu, J. Torrellas
{"title":"CAP: Criticality analysis for power-efficient speculative multithreading","authors":"James Tuck, Wei Liu, J. Torrellas","doi":"10.1109/ICCD.2007.4601932","DOIUrl":"https://doi.org/10.1109/ICCD.2007.4601932","url":null,"abstract":"While speculative multithreading (SM) on a chip multiprocessor (CMP) has the ability to speed-up hard-to- parallelize applications, the power inefficiency of aggressive speculation is a concern. To improve SMs power effeciency, we note that not all the tasks that are running in a SM environment are equally critical. To leverage this insight, this paper develops a novel, widely-applicable task-criticality model for SM. It also proposes CAP, a novel architecture that builds a task-criticality graph dynamically and uses it to make scheduling decisions in a SM CMP. Experiments with SPECint, SPECfp, and Olden applications show that, in a CMP with one fast core and three slow ones, the E D2 with CAP is, on average, 91-95% of that without. Moreover, it is only 77-91% of the E D2 of a CMP with four fast cores and no CAP. Overall, we argue that scheduling for task criticality is beneficial.","PeriodicalId":6306,"journal":{"name":"2007 25th International Conference on Computer Design","volume":"41 1","pages":"409-416"},"PeriodicalIF":0.0,"publicationDate":"2007-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73711647","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 8
Power-aware mapping for reconfigurable NoC architectures 可重构NoC架构的功率感知映射
2007 25th International Conference on Computer Design Pub Date : 2007-10-01 DOI: 10.1109/ICCD.2007.4601933
M. Modarressi, H. Sarbazi-Azad
{"title":"Power-aware mapping for reconfigurable NoC architectures","authors":"M. Modarressi, H. Sarbazi-Azad","doi":"10.1109/ICCD.2007.4601933","DOIUrl":"https://doi.org/10.1109/ICCD.2007.4601933","url":null,"abstract":"A core mapping method for reconfigurable network-on-chip (NoC) architectures is presented in this paper. In most of the existing methods, mapping is carried out based on the traffic characteristics of a single application. However, several different applications are implemented and integrated in the modern complex system-on-chips which should be considered by mapping methods. In the proposed method, the reconfiguration (which is achieved by embedding programmable switches between routers of a mesh-based NoC) allows us to dynamically change the network topology in order to adapt it with the running application and optimize the power and performance metrics. The presented network architecture can be configured as an application- specific topology, while it still holds the benefits of the regular NoC topologies such as modularity and predictable electrical properties. The experimental results show that this method can effectively adapt the NoC to the running application and improve the power consumption and performance of the system.","PeriodicalId":6306,"journal":{"name":"2007 25th International Conference on Computer Design","volume":"1 1","pages":"417-422"},"PeriodicalIF":0.0,"publicationDate":"2007-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79920797","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 32
Reducing leakage power in peripheral circuits of L2 caches 降低L2缓存外围电路漏功率
2007 25th International Conference on Computer Design Pub Date : 2007-10-01 DOI: 10.1109/ICCD.2007.4601907
H. Homayoun, A. Veidenbaum
{"title":"Reducing leakage power in peripheral circuits of L2 caches","authors":"H. Homayoun, A. Veidenbaum","doi":"10.1109/ICCD.2007.4601907","DOIUrl":"https://doi.org/10.1109/ICCD.2007.4601907","url":null,"abstract":"Leakage power has grown significantly and is a major challenge in microprocessor design. Leakage is the dominant power component in second-level (L2) caches. This paper presents two architectural techniques to utilize leakage reduction circuits in L2 caches. They primarily target the leakage in the peripheral circuitry of an L2 cache and as such have to be able to cope with longer delays. One technique exploits the fact that processor activity decreases significantly after an L2 cache miss occurs and saves power during L2 miss service time. Two algorithms, a static one and an adaptive one, are proposed for deciding when to apply this leakage reduction technique. Another technique attempts to keep the peripheral circuits in a lower-power state most of the time. The results for SPEC2K benchmarks show that the first technique can achieve a 18 to 22% reduction in L2 power consumption, on average (and up to 63%), depending on the decision algorithm. The second technique can save 25%, on average (and up to 80%). This comes with a negligible 1 to 2% performance impact, on average, depending on the technique used.","PeriodicalId":6306,"journal":{"name":"2007 25th International Conference on Computer Design","volume":"5 1","pages":"230-237"},"PeriodicalIF":0.0,"publicationDate":"2007-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90554162","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 11
Scan chain design for three-dimensional integrated circuits (3D ICs) 三维集成电路(3D ic)扫描链设计
2007 25th International Conference on Computer Design Pub Date : 2007-10-01 DOI: 10.1109/ICCD.2007.4601902
Xiaoxia Wu, P. Falkenstern, Yuan Xie
{"title":"Scan chain design for three-dimensional integrated circuits (3D ICs)","authors":"Xiaoxia Wu, P. Falkenstern, Yuan Xie","doi":"10.1109/ICCD.2007.4601902","DOIUrl":"https://doi.org/10.1109/ICCD.2007.4601902","url":null,"abstract":"Scan chains are widely used to improve the testability of IC designs. In traditional 2D IC designs, various design techniques on the construction of scan chains have been proposed to facilitate DFT (Design-For-Test). Recently, three-dimensional (3D) technologies have been proposed as a promising solution to continue technology scaling. In this paper, we study the scan chain construction for 3D ICs, examining the impact of 3D technologies on scan chain ordering. Three different 3D scan chain design approaches (namely, VIA3D, MAP3D, and OPT3D) are proposed and compared, with the experimental results for ISCAS89 benchmark circuits. The advantages as well as disadvantages for each approach are discussed. The results show that both MAP3D and VIA3D approaches require no changes of 2D scan chain algorithms, but OPT3D can achieve the best wire length reduction for the scan chain design. The average scan chain wire length of six ISCAS89 benchmarks obtained from OPT3D has 46.0% reduction compared to the 2D scan chain design. To the best of our knowledge, this is the first study on scan chain design for 3D integrated circuits.","PeriodicalId":6306,"journal":{"name":"2007 25th International Conference on Computer Design","volume":"65 1","pages":"208-214"},"PeriodicalIF":0.0,"publicationDate":"2007-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86511513","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 67
Hybrid resistor/FET-logic demultiplexer architecture design for hybrid CMOS/nanodevice circuits CMOS/纳米器件混合电路的混合电阻/场效应晶体管逻辑解复用器架构设计
2007 25th International Conference on Computer Design Pub Date : 2007-10-01 DOI: 10.1109/ICCD.2007.4601955
Shu Li, Tong Zhang
{"title":"Hybrid resistor/FET-logic demultiplexer architecture design for hybrid CMOS/nanodevice circuits","authors":"Shu Li, Tong Zhang","doi":"10.1109/ICCD.2007.4601955","DOIUrl":"https://doi.org/10.1109/ICCD.2007.4601955","url":null,"abstract":"Hybrid nanoelectronics are emerging as one viable option to sustain the Moorepsilas Law after the CMOS scaling limit is reached. One main design challenge in hybrid nanoelectronics is the interface (named as demux) between the highly dense nanowires in nanodevice crossbars and relatively coarse microwires in CMOS domain. The prior work on demux design use a single type of devices to realize the demultiplexing function, but hardly provides a satisfactory solution. This work proposes to combine resistor with FET to implement the demux, leading to the so-called hybrid resistor/FET-logic demux. Such hybrid demux architecture can make these two types of devices well complement each other to improve the overall demux design effectiveness. Furthermore, the effects of resistor conductance variability are analyzed and evaluated based on computer simulations.","PeriodicalId":6306,"journal":{"name":"2007 25th International Conference on Computer Design","volume":"27 1","pages":"574-579"},"PeriodicalIF":0.0,"publicationDate":"2007-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83690749","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信