2010 IEEE International Conference on Computer Design: Latest Papers

Rate-monotonic scheduling for reducing system-wide energy consumption for hard real-time systems

2010 IEEE International Conference on Computer Design Pub Date : 2010-11-29 DOI: 10.1109/ICCD.2010.5647804

Linwei Niu

Citations: 1

RTOS-aware modeling of embedded hardware/software systems

2010 IEEE International Conference on Computer Design Pub Date : 2010-11-29 DOI: 10.1109/ICCD.2010.5647795

Matthias Müller, J. Gerlach, W. Rosenstiel

Abstract: Modern embedded systems such as mobile phones or electronic control units from the automotive domain include a bulk of highly complex and highly interacting functions. Due to several reasons (flexibility and cost effectiveness may be the most important ones), a large and permanently growing part of these functions is implemented in software. This comes along with the demand for more and more processing power, paving the way for multi-core architectures, and widespread use of real-time operating systems. Application software implementation and operating system configuration strongly influence the overall system behavior. Design methodologies for such complex systems, consisting of hardware, software and real-time operating systems, must provide an early, model-based view on the overall system. The approach described in this paper enables automatic generation of system-level models of complex systems from abstract application specifications. Additionally, a compiler-based technique allows automatic calculation of precise software runtime information and annotation of the generated model. The resulting system-level model facilitates early exploration of systems on high level of abstraction, taking into account functional and temporal characteristics of hardware, software and real-time operating system. A key feature of the approach is its high accuracy, which is shown by applying it to an industrial application from the automotive domain.

Citations: 0

A flexible simulation methodology and tool for nanoarray-based architectures

2010 IEEE International Conference on Computer Design Pub Date : 2010-11-29 DOI: 10.1109/ICCD.2010.5647586

S. Frache, M. Graziano, M. Zamboni

Citations: 20

Improving cache performance by combining cost-sensitivity and locality principles in cache replacement algorithms

2010 IEEE International Conference on Computer Design Pub Date : 2010-11-29 DOI: 10.1109/ICCD.2010.5647594

Rami Sheikh, Mazen Kharbutli

Abstract: Due to the ever increasing performance gap between the processor and the main memory, it becomes crucial to bridge that gap by designing an efficient memory hierarchy capable of reducing the average memory access time. The cache replacement algorithm plays a central role in designing an efficient memory hierarchy. Many of the recent studies in cache replacement algorithms have focused on improving L2 cache replacement algorithms by minimizing the miss count. However, depending on the dependency chain, cache miss bursts, and other factors, a processor's ability to partially hide the cost of an L2 cache miss varies; that is, cache miss costs are not uniform. Therefore, a better solution would account also for the aggregate miss cost in designing cache replacement algorithms. Our proposed solution combines the two principles of locality and cost-sensitivity into one which we call: LACS: Locality-Aware Cost-Sensitive cache replacement algorithm. LACS estimates a cache block's cost from the number of instructions the processor manages to issue during a cache miss on that block and then victimizes cache blocks with low cost and poor locality in order to maximize the overall cache performance. When LACS is evaluated using a uniprocessor architecture model, it speeds up 10 L2 cache performance-constrained SPEC CPU2000 benchmarks by up to 85% and 15% on average while not slowing down any of the 20 SPEC CPU2000 benchmarks evaluated. When evaluated using a dual-core CMP architecture model, LACS speeds up 6 SPEC CPU2000 benchmark pairs by up to 44% and 11% on average.
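
The victim-selection idea in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Block` fields, the `issue_threshold` cutoff, and the use of a last-access timestamp as the locality measure are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Block:
    tag: int
    issued_during_miss: int  # instructions issued while this block's miss was outstanding
    last_access: int         # logical timestamp of the most recent access (hypothetical locality proxy)


def is_low_cost(block: Block, issue_threshold: int) -> bool:
    # If the processor issued many instructions during the miss, the stall was
    # largely hidden, so missing on this block again is assumed to be cheap.
    return block.issued_during_miss >= issue_threshold


def choose_victim(cache_set: list[Block], issue_threshold: int = 16) -> Block:
    # Prefer evicting low-cost blocks; fall back to the whole set if none qualify.
    low_cost = [b for b in cache_set if is_low_cost(b, issue_threshold)]
    candidates = low_cost if low_cost else cache_set
    # Among the candidates, evict the one with the poorest locality (oldest access).
    return min(candidates, key=lambda b: b.last_access)


cache_set = [
    Block(tag=1, issued_during_miss=2, last_access=5),
    Block(tag=2, issued_during_miss=40, last_access=9),
    Block(tag=3, issued_during_miss=30, last_access=7),
    Block(tag=4, issued_during_miss=1, last_access=1),
]
victim = choose_victim(cache_set)
```

In this example plain LRU would evict block 4, the oldest one, but its miss was expensive (only one instruction issued), so the sketch evicts block 3 instead: the least recently used block among those whose misses were cheap.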

Citations: 23

Lizard: Energy-efficient hard fault detection, diagnosis and isolation in the ALU

2010 IEEE International Conference on Computer Design Pub Date : 2010-11-29 DOI: 10.1109/ICCD.2010.5647708

Seokin Hong, Soontae Kim

Abstract: Digital circuits are expected to increasingly suffer from more hard faults due to technology scaling. Especially, a single hard fault in the ALU might lead to a total failure in the embedded systems. In addition, energy efficiency is critical in these systems. To address these increasingly important problems in the ALU, we propose a novel energy-efficient fault-tolerant ALU design called Lizard. Lizard utilizes two 16-bit ALUs to perform 32-bit computations with fault detection and diagnosis. By exploiting predictable operations, fault detection is performed in a single cycle. The 16-bit ALUs can be partitioned into two 8-bit ALUs. When a fault occurs in one of the four 8-bit ALUs, Lizard diagnoses and isolates a faulty 8-bit ALU for itself. After the faulty 8-bit ALU is isolated, Lizard continues its operation using the remaining three 8-bit ALUs, which can detect and isolate another fault. In this way, Lizard can survive faults on at most two sub-ALUs increasing its lifetime and fault tolerance. We conducted comparative evaluations with an unprotected ALU, triple modular redundancy ALU, and quadruple time redundancy ALU in terms of area, energy consumption, performance, and reliability. It is demonstrated that Lizard outperforms other ALU designs in most cases, especially in energy efficiency.
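
As a rough illustration of how two 16-bit units can carry out a 32-bit operation, the sketch below builds a 32-bit addition from two 16-bit additions linked by a carry. The function names are hypothetical, and the sketch does not model Lizard's fault detection, diagnosis, or isolation logic, which the abstract does not describe in detail.

```python
MASK16 = 0xFFFF


def add16(a: int, b: int, carry_in: int = 0) -> tuple[int, int]:
    # One 16-bit adder: returns the 16-bit sum and the carry-out bit.
    total = (a & MASK16) + (b & MASK16) + carry_in
    return total & MASK16, total >> 16


def add32_from_halves(a: int, b: int) -> tuple[int, int]:
    # Low halves first; their carry-out feeds the addition of the high halves.
    low, carry = add16(a, b)
    high, carry_out = add16(a >> 16, b >> 16, carry)
    return (high << 16) | low, carry_out


result, carry_out = add32_from_halves(0x0001FFFF, 0x00000001)
```

Here `result` is `0x00020000` with no carry-out, matching an ordinary 32-bit addition.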

Citations: 17

Dynamic register file partitioning in superscalar microprocessors for energy efficiency

2010 IEEE International Conference on Computer Design Pub Date : 2010-11-29 DOI: 10.1109/ICCD.2010.5647631

Meltem Ozsoy, Yusuf Onur Koçberber, M. Kayaalp, O. Ergin

Citations: 0

Scenario-based design space exploration of MPSoCs

2010 IEEE International Conference on Computer Design Pub Date : 2010-11-29 DOI: 10.1109/ICCD.2010.5647727

P. V. Stralen, A. Pimentel

Citations: 80

Practical completion detection for 2-of-N delay-insensitive codes

2010 IEEE International Conference on Computer Design Pub Date : 2010-11-29 DOI: 10.1109/ICCD.2010.5647809

Marco Cannizzaro, Weiwei Jiang, S. Nowick

Abstract: There is increasing interest in using m-of-n delay-insensitive codes for robust asynchronous global communication, to support the design of coding-efficient and low-power channels. However, a fundamental obstacle in using these codes has been complex and expensive hardware support. This paper addresses this issue, introducing and evaluating practical completion detector units for 2-of-n codes. Designs are proposed for both return-to-zero (RZ) and non-return-to-zero (NRZ) codes. The RZ designs build on prior work of Piestrak [14]; this paper proposes a small modification to their work to provide a fully timing-robust (i.e. quasi-delay insensitive, or QDI) version. The main contribution of the paper is an efficient completion for NRZ 2-of-n codes. Both detector architectures are modular and simple, composed of basic cells in a binary tree. Initial simulation results were performed on several implementations of a 2-of-9 detector using Cadence's Spectre environment, after mapping to a 90nm standard cell library. The new RZ detector has 35% area reduction and comparable delays and energy to the earlier Piestrak design, but unlike the latter, ensures robust QDI operation. The new NRZ detector is shown to have negligible stabilization time between successive codewords (0.05–0.19 ns) when compared to a recent alternative approach.
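
A purely behavioral sketch of tree-structured completion detection for an RZ 2-of-n code is given below. It assumes that wires only rise from an all-zero spacer, so a codeword is complete once at least two wires are high. The `leaf` and `combine` cells are hypothetical and are not the cells proposed in the paper; the sketch ignores timing, hysteresis, and QDI gate-level concerns entirely.

```python
def leaf(wire: int) -> tuple[bool, bool]:
    # Summary of a single wire: (at least one high, at least two high).
    return (bool(wire), False)


def combine(left: tuple[bool, bool], right: tuple[bool, bool]) -> tuple[bool, bool]:
    # Merge the summaries of two subtrees into the summary of their parent.
    one = left[0] or right[0]
    two = left[1] or right[1] or (left[0] and right[0])
    return (one, two)


def detect(wires: list[int]) -> tuple[bool, bool]:
    # Reduce the wires pairwise, level by level, like a binary tree of cells.
    if not wires:
        raise ValueError("need at least one wire")
    nodes = [leaf(w) for w in wires]
    while len(nodes) > 1:
        paired = [combine(nodes[i], nodes[i + 1]) for i in range(0, len(nodes) - 1, 2)]
        if len(nodes) % 2:
            paired.append(nodes[-1])  # an odd node passes through to the next level
        nodes = paired
    return nodes[0]


def is_complete(wires: list[int]) -> bool:
    # Under the RZ assumption above, "at least two high" means a full codeword.
    return detect(wires)[1]


spacer = [0] * 9
partial = [0, 0, 1, 0, 0, 0, 0, 0, 0]
codeword = [0, 0, 1, 0, 0, 0, 0, 1, 0]
```

For the 2-of-9 case, `is_complete` is false for the spacer and for the partially arrived word, and true once the second wire has risen.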

Citations: 12

IP characterization methodology for fast and accurate power consumption estimation at transactional level model

2010 IEEE International Conference on Computer Design Pub Date : 2010-11-29 DOI: 10.1109/ICCD.2010.5647622

Michel Rogers-Vallée, Marc-André Cantin, Laurent Moss, G. Bois

Citations: 6

Thermal-aware scratchpad memory design and allocation

2010 IEEE International Conference on Computer Design Pub Date : 2010-11-29 DOI: 10.1109/ICCD.2010.5647616

M. Damavandpeyma, S. Stuijk, T. Basten, M. Geilen, H. Corporaal

Abstract: Scratchpad memories (SPMs) have become a promising on-chip storage solution for embedded systems from an energy, performance and predictability perspective. The thermal behavior of these types of memories has not been considered in detail. This thermal behavior plays an important role in the reliability of silicon devices and in their static (leakage) power consumption. In this paper, we propose two different techniques to improve the thermal behavior of SPMs. First, we propose a hardware-based, thermal-aware address translation technique that physically distributes memory accesses to consecutive addresses evenly over the whole memory area. Second, we propose a software-based, thermal-aware address generation technique. This technique tries to distribute the variables that are allocated to the SPM in such a way that an even thermal distribution is achieved. The first technique works particularly well for applications with a regular access pattern, whereas the second technique can also improve the behavior of applications with irregular access patterns. The two techniques thus complement each other and work well together. Using the first technique we show that the peak temperature of an SPM in 65nm technology, when running a typical streaming application, is decreased by up-to 10.0°C. Temperature cycling is reduced from up-to 14.8°C to almost zero in comparison with a non-thermal-aware solution. For our benchmark applications with an irregular access pattern, the second technique is able to reduce the peak temperature by up-to 3.5°C. These savings for both techniques are obtained without any performance degradation or extra silicon area.
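
One simple way to spread consecutive addresses over a memory is a stride-based permutation, sketched below. This is only an illustrative assumption: the abstract does not specify the paper's actual translation function, and `translate`, `num_rows`, and `stride` are hypothetical names chosen for the example.

```python
from math import gcd


def translate(logical: int, num_rows: int, stride: int) -> int:
    # Map a logical row index to a physical row. A stride that is coprime with
    # num_rows makes the mapping a permutation, so no two rows collide.
    if gcd(stride, num_rows) != 1:
        raise ValueError("stride must be coprime with num_rows")
    return (logical * stride) % num_rows


num_rows, stride = 16, 7
physical = [translate(a, num_rows, stride) for a in range(num_rows)]
```

With 16 rows and a stride of 7, logical rows 0, 1, 2, 3 land on physical rows 0, 7, 14, 5, so a sequential (streaming) access pattern touches physically distant rows instead of heating one region.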

Citations: 4