2013 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS): Latest Papers

On-demand thread-level fault detection in a concurrent programming environment

2013 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS) Pub Date : 2013-07-15 DOI: 10.1109/SAMOS.2013.6621132

Jian Fu, Qiang Yang, R. Poss, C. Jesshope, Chunyuan Zhang

Abstract: The vulnerability of multi-core processors is increasing due to tighter design margins and greater susceptibility to interference. Moreover, concurrent programming environments are the norm in the exploitation of multi-core systems. In this paper, we present an on-demand thread-level fault detection mechanism for multi-cores. The main contribution is on-demand redundancy, which allows users to set the redundancy scope in the concurrent code. To achieve this we introduce intelligent redundant thread creation and synchronization, which manages concurrency and synchronization between the redundant threads via the master. This framework was implemented in an emulation of a multi-threaded, many-core processor with single, in-order issue cores. It was evaluated with a range of programs in image and signal processing, and encryption. The performance overhead of redundancy is less than 11% for single-core execution and is always less than 100% for all scenarios. This efficiency derives from the platform's hardware concurrency management and latency tolerance.

Citations: 12
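
The on-demand redundancy described in the abstract above can be illustrated with a minimal software sketch. It assumes a hypothetical helper, `run_redundant`, that plays the role of the master: it creates the redundant threads, waits for them, and compares their results. The paper's mechanism runs on an emulated many-core processor with hardware concurrency management, so this Python version only shows the idea of limiting redundancy to a chosen scope; the names `run_redundant`, `FaultDetected`, and `checksum` are illustrative and not taken from the paper.

```python
from concurrent.futures import ThreadPoolExecutor


class FaultDetected(Exception):
    """Raised when the redundant copies of a computation disagree."""


def run_redundant(fn, *args, copies=2):
    """Run fn(*args) in `copies` threads and compare the results.

    Hypothetical helper: the caller acts as the master that creates the
    redundant threads and synchronizes on their results.
    """
    with ThreadPoolExecutor(max_workers=copies) as pool:
        futures = [pool.submit(fn, *args) for _ in range(copies)]
        results = [f.result() for f in futures]  # master waits for every copy
    first = results[0]
    if any(r != first for r in results[1:]):
        raise FaultDetected(f"redundant results disagree: {results!r}")
    return first


def checksum(data):
    # Stand-in for a protected region, e.g. a signal-processing kernel.
    return sum(b * (i + 1) for i, b in enumerate(data)) % 65521


# Only this call sits inside the redundancy scope; other code runs once.
protected = run_redundant(checksum, bytes(range(16)))
```

Wrapping only selected calls keeps the cost of redundancy proportional to the code the user chooses to protect, which is the point of making it on-demand.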

Mapping of PRP/HSR redundancy protocols onto a configurable FPGA/CPU based architecture

2013 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS) Pub Date : 2013-07-15 DOI: 10.1109/SAMOS.2013.6621114

Holger Flatt, J. Jasperneite, Daniel Dennstedt, Tran Dinh Hung

Abstract: This paper presents the mapping of the seamless redundancy protocols PRP and HSR, in combination with IEEE 1588 based clock synchronization, onto a configurable CPU/FPGA based Redundancy Box architecture. Whereas core functions of PRP, HSR, and IEEE 1588 are mapped onto the FPGA, a CPU executes the control parts of these protocols. An optionally attached standard switch ASIC provides direct connection to several network devices. For validation, a special embedded platform is proposed that is composed of an FPGA and a commercial off-the-shelf switch ASIC. The results show that even a low-cost Altera Cyclone IV FPGA comprising 74,000 logic elements fulfills the requirements for protocol processing at 100 Mbps per port. Minimum-size frames are forwarded by the FPGA up to two times faster than by competing implementations. In the laboratory, three connected PRP/HSR RedBoxes and an IEEE 1588 clock master synchronize to within 30 ns. Using several RedBoxes in PRP and HSR mode, seamless redundancy is demonstrated for a PROFINET RT test network and supplemental network components. Overall, the presented RedBox can be flexibly integrated into time-synchronized industrial networks in order to significantly increase communication reliability.

Citations: 4
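
Seamless redundancy in PRP rests on sending each frame over two independent networks and discarding the second copy at the receiver. The abstract above does not describe how the RedBox does this, so the following is only a sketch of the general duplicate-discard idea, assuming frames are identified by a (source, sequence number) pair. The class name `DuplicateDiscard` and the `window` bound are hypothetical, and real implementations typically also age entries out by time.

```python
from collections import OrderedDict


class DuplicateDiscard:
    """Accept the first copy of each (source, sequence number) pair."""

    def __init__(self, window=1024):
        self.window = window       # hypothetical bound on remembered frames
        self.seen = OrderedDict()  # insertion-ordered record of (src, seq)

    def accept(self, src, seq):
        key = (src, seq)
        if key in self.seen:
            return False           # second copy from the other LAN: drop it
        self.seen[key] = True
        if len(self.seen) > self.window:
            self.seen.popitem(last=False)  # forget the oldest entry
        return True


rx = DuplicateDiscard()
# Each arrival is (lan, source, sequence number); both LANs carry every frame.
arrivals = [
    ("A", "node1", 7),
    ("B", "node1", 7),
    ("B", "node1", 8),
    ("A", "node1", 8),
]
delivered = [(src, seq) for _lan, src, seq in arrivals if rx.accept(src, seq)]
```

If one network fails, every frame still arrives once over the other, so the application sees no interruption and no switchover delay.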

Concurrent multi-level arrays: Wait-free extensible hash maps

2013 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS) Pub Date : 2013-07-15 DOI: 10.1109/SAMOS.2013.6621118

S. Feldman, P. Laborde, D. Dechev

Abstract: In this work we present the first design and implementation of a wait-free hash map. Our multiprocessor data structure allows a large number of threads to concurrently put, get, and remove information. Wait-freedom means that all threads make progress in a finite amount of time, an attribute that can be critical in real-time environments. This contrasts with traditional blocking implementations of shared data structures, which suffer from deadlock and related correctness and performance issues. Our design is portable because we only use atomic operations that are provided by the hardware; therefore, our hash map can be utilized by a variety of data-intensive applications, including those within the domains of embedded systems and supercomputers. The challenges of providing this guarantee make the design and implementation of wait-free objects difficult. As such, there are few wait-free data structures described in the literature; in particular, there are no wait-free hash maps. It often becomes necessary to sacrifice performance in order to achieve wait-freedom. However, our experimental evaluation shows that our hash map design is, on average, 5 times faster than a traditional blocking design. Our solution outperforms the best available alternative non-blocking designs in a large majority of cases, typically by a factor of 8 or higher.

Citations: 19
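
The abstract above names the structure but does not describe its layout. One plausible reading of "multi-level array", assumed here and not taken from the paper, is a tree of fixed-width arrays in which successive chunks of a key's hash select a slot at each level, and a slot is expanded into a child array when two keys meet in it. The sketch below is single-threaded and therefore neither concurrent nor wait-free; it only illustrates the indexing and expansion. Keys are assumed to be distinct, already-hashed 64-bit integers, and `BITS`, `slot`, and `MultiLevelMap` are hypothetical names.

```python
BITS = 4              # hypothetical: hash bits consumed per level
WIDTH = 1 << BITS     # slots per array node


def slot(key, depth):
    """Index selected by the depth-th chunk of the 64-bit key."""
    return (key >> (depth * BITS)) & (WIDTH - 1)


class MultiLevelMap:
    """Single-threaded sketch; array nodes are lists, entries are tuples."""

    def __init__(self):
        self.root = [None] * WIDTH

    def put(self, key, value):
        node, depth = self.root, 0
        while True:
            i = slot(key, depth)
            entry = node[i]
            if isinstance(entry, list):            # array node: descend
                node, depth = entry, depth + 1
            elif entry is None or entry[0] == key:
                node[i] = (key, value)             # insert or update in place
                return
            else:                                  # other key: expand the slot
                child = [None] * WIDTH
                child[slot(entry[0], depth + 1)] = entry
                node[i] = child
                node, depth = child, depth + 1

    def get(self, key, default=None):
        node, depth = self.root, 0
        while True:
            entry = node[slot(key, depth)]
            if isinstance(entry, list):
                node, depth = entry, depth + 1
            elif entry is not None and entry[0] == key:
                return entry[1]
            else:
                return default


m = MultiLevelMap()
m.put(0x01, "a")
m.put(0x11, "b")   # shares the low 4 bits with 0x01, so the slot is expanded
```

Because distinct 64-bit keys must differ in at least one 4-bit chunk, the expansion always terminates within 16 levels. A concurrent version would have to install entries and child arrays with atomic operations, which this sketch does not attempt.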

Cobra: A comprehensive bundle-based reliable architecture

2013 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS) Pub Date : 2013-07-15 DOI: 10.1109/samos.2013.6621131

Andrea Pellegrini, V. Bertacco

Abstract: The declining robustness of transistors and their ever-denser integration threaten the dependability of future microprocessors. Classic multicores offer a simple solution to overcome hardware defects: faulty processors can be disabled without affecting the rest of the system. However, this approach quickly becomes impractical at high fault rates. Recently, distributed computer architectures have been proposed to mitigate the effects of faulty transistors by utilizing fine-grained hardware reconfiguration, managed by fully decoupled control logic. Unfortunately, such solutions trade flexibility for execution locality, and thus they do not scale to large systems. To overcome this issue we propose Cobra, a distributed, scalable, highly parallel reliable architecture. Cobra is a service-based architecture where groups of dynamic instructions flow independently through the system, making use of the available hardware resources. Cobra organizes the system's units dynamically using a novel resource assignment that preserves locality and limits communication overhead. Our experiments show that Cobra is extremely dependable and outperforms classic multicores when subjected to 5 or more defects per 100 million transistors. We also show that Cobra operates 80% faster than previous state-of-the-art solutions on multi-programmed SPEC CPU2006 workloads and that it improves cache hit rate by up to 62%. Our runtime fault detection techniques have a performance impact of only 3%.

Citations: 3

An accurate energy model for streaming applications mapped on MPSoC platforms

2013 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS) Pub Date : 2013-07-15 DOI: 10.1109/SAMOS.2013.6621124

J. Spasić, T. Stefanov

Citations: 4

Low-power application-specific FFT processor for LTE applications

2013 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS) Pub Date : 2013-07-15 DOI: 10.1109/SAMOS.2013.6621102

Tomasz Patyk, D. Guevorkian, Teemu Pitkänen, P. Jääskeläinen, J. Takala

Citations: 18

A Process-based Reconfigurable SystemC Module for simulation speedup

2013 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS) Pub Date : 2013-07-15 DOI: 10.1109/SAMOS.2013.6621108

Efstathios Sotiriou-Xanthopoulos, K. Siozios, G. Economakos, D. Soudris

Citations: 5

GPUburn: A system to test and mitigate GPU hardware failures

2013 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS) Pub Date : 2013-07-15 DOI: 10.1109/SAMOS.2013.6621133

D. Defour, E. Petit

Abstract: Due to factors such as high transistor density, high frequency, and low voltage, today's processors are more subject to hardware failures than ever. These errors have various impacts depending on the location of the error and the type of processor. Because of the hierarchical structure of the compute units and work scheduling, hardware failures on GPUs affect only part of the application. In this paper we present a new methodology to characterize the hardware failures of Nvidia GPUs, based on a software micro-benchmarking platform implemented in OpenCL. We also show which hardware part of the Tesla architecture is more sensitive to intermittent errors, which usually appear as the processor ages. We obtained these results by accelerating the aging process, running the processors at high temperature. We show that on GPUs, the impact of intermittent errors is limited to a localized architecture tile. Finally, we propose a methodology to detect and record the location of defective units so that they can be avoided, ensuring program correctness on such architectures and improving the GPU's fault tolerance and lifespan.

Citations: 13

High speed cycle approximate simulation for cache-incoherent MPSoCs

2013 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS) Pub Date : 2013-07-15 DOI: 10.1109/SAMOS.2013.6621110

Christopher Thompson, Miles Gould, N. Topham

Citations: 3

An effective model extraction method with state space compression for model checking SystemC TLM designs

2013 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS) Pub Date : 2013-07-15 DOI: 10.1109/SAMOS.2013.6621107

Yanyan Gao, Xi Li

Abstract: SystemC has become a de facto standard language for SoC and ASIP designs. Verifying SystemC implementations is key to guaranteeing the correctness of designs and preventing errors from propagating to lower levels. The gap between a SystemC TLM model and its corresponding formal model makes it hard to translate automatically between them. SystemC describes process behavior in sequential statements and usually employs intermediate variables, while most model checking languages for hardware describe only parallel behaviors, in which the use of intermediate variables not only increases the state space and may prolong execution time, but also introduces potential errors. For a model checking language that supports parallel description, eliminating redundant intermediate variables is required and is also an efficient way to reduce the state space. This paper addresses these issues by (1) proposing an extraction method that translates a description supporting sequential execution into one supporting parallel execution, and (2) identifying and removing redundant intermediate variables. A novel mechanism is presented to automatically extract behavior descriptions from SystemC into SMV, a widely used model checking language. We have implemented a tool, SC2SMV, and used it to perform extraction in practice to demonstrate the effectiveness of the method.

Citations: 0
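
The two goals listed in the abstract above, turning sequential statements into a parallel description and removing redundant intermediate variables, can be illustrated with forward substitution over straight-line assignments. This is a minimal sketch under stated assumptions and not the SC2SMV algorithm, which the abstract does not detail: statements are plain `(target, expression)` pairs with no branches or loops, and the function name `eliminate_intermediates` and the variable names are hypothetical.

```python
import re

IDENT = re.compile(r"\b[A-Za-z_]\w*\b")


def eliminate_intermediates(statements, intermediates):
    """Turn ordered assignments into one expression per remaining variable.

    statements: list of (target, expression) pairs executed in order.
    intermediates: names that only carry values between statements.
    Every returned expression refers to pre-state values only, so the
    assignments no longer depend on execution order.
    """
    env = {}  # symbolic value of each variable assigned so far

    def substitute(match):
        name = match.group(0)
        return f"({env[name]})" if name in env else name

    for target, expr in statements:
        # Replace each already-assigned name by its symbolic value.
        env[target] = IDENT.sub(substitute, expr)
    return {v: e for v, e in env.items() if v not in intermediates}


sequential = [
    ("tmp", "a + b"),   # intermediate variable
    ("x", "tmp * 2"),
    ("y", "x - tmp"),   # reads the new value of x, as sequential code would
]
parallel = eliminate_intermediates(sequential, intermediates={"tmp"})
```

After substitution `tmp` no longer appears, so it contributes nothing to the state space, and each remaining expression could serve as the right-hand side of an order-independent next-state assignment.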