2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS): latest publications, page 5

Parallelism extraction in embedded software for Android devices

2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS) Pub Date: 2015-07-19 DOI: 10.1109/SAMOS.2015.7363654

M. Aguilar, Juan Fernando Eusse Giraldo, Projjol Ray, R. Leupers, G. Ascheid, Weihua Sheng, Prashant Sharma

Citations: 9

GPU implementation of an anisotropic Huber-L1 dense optical flow algorithm using OpenCL

2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS) Pub Date: 2015-07-19 DOI: 10.1109/SAMOS.2015.7363693

Duygu Buyukaydin, Toygar Akgün

Citations: 9

Software fault tolerance for FPUs via vectorization

2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS) Pub Date: 2015-07-19 DOI: 10.1109/SAMOS.2015.7363677

Zhi Chen, R. Inagaki, A. Nicolau, A. Veidenbaum

Abstract: Future generation processors are expected to have high soft error rates and will require increased fault detection and fault tolerance. This work focuses on errors in execution units. Hardware or software duplication or triplication, parity, or residue codes could be used to detect errors in execution units. However, hardware duplication/triplication have significant area overhead and, in applications with high utilization of floating point units (FPU), very high energy cost. Software duplication/triplication of instructions also increases both execution time and energy consumption. This paper proposes to reduce the cost of redundant instruction execution in FPUs through vectorization. Duplicated or triplicated instructions and result comparisons can be packed by a compiler into vector instructions, such as SSE or AVX. Experimental results using hand vectorization on a variety of benchmarks show that, compared to error detection through scalar instruction duplication, vector mode redundant execution achieves 1.78× and 2.73× average speedup for SSE and AVX instructions, respectively. It also significantly reduces the energy consumption, by an average of 40% and 53%, respectively, for SSE and AVX. Thus the proposed technique enables error detection with no hardware cost and reduced time and energy overhead compared to brute-force scalar instruction duplication.
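
The idea of packing duplicated instructions and their result comparison into one vector operation can be illustrated with a small model. The sketch below is not the authors' implementation and uses no real SSE or AVX instructions: a Python tuple stands in for a vector register, and the names `broadcast`, `vec_op`, `checked`, and the `fault_lane` injection hook are hypothetical, introduced only for illustration.

```python
# Pure-Python model of lane-duplicated redundant execution.
# Assumption: a 2-element tuple stands in for a 128-bit register
# holding two doubles; no real SIMD instructions are issued.

def broadcast(x, lanes=2):
    """Copy one scalar operand into every lane of the modeled register."""
    return (x,) * lanes

def vec_op(op, a, b, fault_lane=None):
    """Apply op lane by lane; optionally corrupt one lane to model a soft error."""
    out = [op(x, y) for x, y in zip(a, b)]
    if fault_lane is not None:
        out[fault_lane] += 1.0  # hypothetical injected error, demonstration only
    return tuple(out)

def checked(op, x, y, fault_lane=None):
    """Run op redundantly in all lanes and raise if the lanes disagree."""
    result = vec_op(op, broadcast(x), broadcast(y), fault_lane)
    if any(lane != result[0] for lane in result[1:]):
        raise ArithmeticError("lane mismatch: possible soft error")
    return result[0]

def mul(a, b):
    return a * b

print(checked(mul, 1.5, 2.0))  # fault-free run: both lanes agree
try:
    checked(mul, 1.5, 2.0, fault_lane=1)
except ArithmeticError as exc:
    print("detected:", exc)
```

With two lanes the model can only detect a mismatch. Calling `broadcast` with `lanes=3` would correspond to triplication, where a majority vote could also select a result, though that voting step is not shown here.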

Citations: 10

Pre-simulation elaboration of heterogeneous systems: The SystemC multi-disciplinary virtual prototyping approach

2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS) Pub Date: 2015-07-19 DOI: 10.1109/SAMOS.2015.7363686

C. Aoun, Liliana Andrade, Torsten Mähne, F. Pêcheux, M. Louërat, A. Vachoux

Abstract: Designers of the upcoming digital-centric More-than-Moore systems lack a common design and simulation environment able to efficiently manage all the multi-disciplinary aspects of their components, which are of various natures and closely interact with each other. A key to successful design and verification lies in a SystemC-based virtual prototyping environment that is able to simulate a complex heterogeneous system as a whole, in which each component is described and solved using the most appropriate Model of Computation (MoC). In this paper, we present a new generic MoC-independent elaboration scheme that aims at preparing a Virtual Prototype (VP) for simulation. It requires checking the correct composition of the system model through dimensional analysis, exploring the model structure to identify the involved MoCs and the interfaces between MoCs, and detecting the underlying dependencies. Eventually, information extracted from the exploration allows the instantiation of MoC-specific solvers. To soundly handle the global model execution with a Discrete Event (DE) kernel as the main solver, synchronization mechanisms with master-slave semantics within the model structure are implicitly deduced.
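
Two of the elaboration steps named in the abstract, the dimensional check of the model composition and the identification of interfaces between MoCs, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's scheme or any SystemC API: the `Port` class, the `elaborate` function, the exponent-tuple encoding of dimensions, and the `"CT"` label for a continuous-time MoC are all hypothetical.

```python
from dataclasses import dataclass

# Assumption: a physical dimension is encoded as exponents of
# (length, mass, time, current). A volt is kg*m^2*s^-3*A^-1.
VOLT = (2, 1, -3, -1)
AMPERE = (0, 0, 0, 1)

@dataclass(frozen=True)
class Port:
    component: str
    name: str
    moc: str          # "DE" for discrete event; "CT" is a hypothetical label
    dimension: tuple

def elaborate(connections):
    """Check each connection dimensionally and collect the MoC boundaries."""
    mocs, interfaces = set(), []
    for src, dst in connections:
        if src.dimension != dst.dimension:
            raise ValueError(
                f"dimension mismatch: {src.component}.{src.name} -> "
                f"{dst.component}.{dst.name}"
            )
        mocs.update((src.moc, dst.moc))
        if src.moc != dst.moc:
            # A connection that crosses MoCs needs a synchronizing interface.
            interfaces.append((src, dst))
    return mocs, interfaces

sensor_out = Port("sensor", "out", "CT", VOLT)
adc_in = Port("adc", "in", "DE", VOLT)
mocs, interfaces = elaborate([(sensor_out, adc_in)])
print(sorted(mocs))      # ['CT', 'DE']
print(len(interfaces))   # 1
```

The set of MoCs found this way is what a later step would use to decide which solvers to instantiate. Dependency detection and the master-slave synchronization mentioned in the abstract are outside this sketch.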

Citations: 9

Chip-independent Error Correction in main memories

2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS) Pub Date: 2015-07-19 DOI: 10.1109/SAMOS.2015.7363674

Mehrtash Manoochehri, M. Dubois

Citations: 0

Imposing coarse-grained reconfiguration to general purpose processors

2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS) Pub Date: 2015-07-19 DOI: 10.1109/SAMOS.2015.7363658

M. Duric, Milan Stanic, Ivan Ratković, Oscar Palomar, O. Unsal, A. Cristal, M. Valero, Aaron Smith

Abstract: Mobile devices execute applications with diverse compute and performance demands. This paper proposes a general purpose processor that adapts the underlying hardware to a given workload. Existing mobile processors need to utilize more complex heterogeneous substrates to deliver the demanded performance. They incorporate different cores and specialized accelerators. In contrast, our processor utilizes only modest homogeneous cores and dynamically provides an execution substrate suitable to accelerate a particular workload. Instead of incorporating accelerators, the processor reconfigures one or more cores into accelerators on the fly. It improves performance with minimal hardware additions. The accelerators are made of general purpose ALUs reconfigured into a compute fabric and the general purpose pipeline that streams data through the fabric. To enable reconfiguration of ALUs into the fabric, the floorplan of a 4-core processor is changed to place the ALUs in close proximity on the chip. A configurable switched network is added to couple and dynamically reconfigure the ALUs to perform computation of frequently repeated regions, instead of executing general purpose instructions. Through this reconfiguration, the mobile processor specializes its substrate for a given workload and maximizes performance of the existing resources. Our results show that reconfiguration accelerates a set of selected compute-intensive workloads by 1.56×, 2.39×, and 3.51× when the accelerator is configured from 1, 2, or 4 cores, respectively.

Citations: 2

Reconfigurable computing for future vision-capable devices

2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS) Pub Date: 2015-07-19 DOI: 10.1109/SAMOS.2015.7363657

Miguel Bordallo López, A. Nieto, O. Silvén, J. Boutellier, D. L. Vilariño

Abstract: Mobile devices have been identified as promising platforms for interactive vision-based applications. However, this type of application still poses significant challenges in terms of latency, throughput and energy-efficiency. In this context, the integration of reconfigurable architectures on mobile devices allows dynamic reconfiguration to match the computation and data flow of interactive applications, demonstrating significant performance benefits compared to general purpose architectures. This paper presents concepts relying on platform-level adaptability, exploring the acceleration of vision-based interactive applications through the utilization of three reconfigurable architectures: a low-power EnCore processor with a Configurable Flow Accelerator co-processor, a hybrid reconfigurable SIMD/MIMD platform, and Transport-Triggered Architecture-based processors. The architectures are evaluated and compared with current processors, analyzing their advantages and weaknesses in terms of performance and energy-efficiency when implementing highly interactive vision-based applications. The results show that the inclusion of reconfigurable platforms on mobile devices can enable the computation of several computationally heavy tasks with high performance and low energy consumption while providing enough flexibility.

Citations: 0

Tervel: A unification of descriptor-based techniques for non-blocking programming

2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS) Pub Date: 2015-07-19 DOI: 10.1109/SAMOS.2015.7363668

S. Feldman, P. Laborde, D. Dechev

Citations: 14

Hardware task migration module for improved fault tolerance and predictability

2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS) Pub Date: 2015-07-19 DOI: 10.1109/SAMOS.2015.7363676

Shyamsundar Venkataraman, Rui Santos, Akash Kumar, Jasper Kuijsten

Citations: 9

An FPGA-based systolic array to accelerate the BWA-MEM genomic mapping algorithm

2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS) Pub Date: 2015-07-19 DOI: 10.1109/SAMOS.2015.7363679

Ernst Houtgast, V. Sima, K. Bertels, Z. Al-Ars

Citations: 58