Software and Compilers for Embedded Systems: latest papers, page 6

Efficient event-driven simulation of parallel processor architectures

Software and Compilers for Embedded Systems Pub Date : 2007-04-20 DOI: 10.1145/1269843.1269854

A. Kupriyanov, D. Kissler, Frank Hannig, J. Teich

Abstract: In this paper we present a new approach for generating high-speed optimized event-driven instruction set level simulators for adaptive massively parallel processor architectures. The simulator generator is part of a methodology for the systematic mapping, evaluation, and exploration of massively parallel processor architectures that are designed for special purpose applications in the world of embedded computers. The generation of high-speed cycle-accurate simulators is of utmost importance here, because they are directly used both for parallel processor architecture debugging and evaluation purposes, as well as during time-consuming architecture/compiler co-exploration. We developed a modeling environment which automatically generates a C++ simulation model either from a graphical input or directly from an XML-based architecture description. Here, we focus on the underlying event-driven simulation model and present our modeling environment, in particular the features of the graphical parallel processor architecture editor and the automatic instruction set level simulator generator. Finally, in a case-study, we demonstrate the pertinence of our approach by simulating different processor arrays. The superior performance of the generated simulators compared to existing simulators and simulator generation approaches is shown.
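
The abstract names event-driven simulation without spelling out the mechanism. As a rough illustration only (not the authors' generated C++ simulator), a minimal event-driven loop can be sketched as follows. The function `simulate`, the handlers `pe0` and `pe1`, and the two-element pipeline are all hypothetical names invented for this example.

```python
import heapq

def simulate(initial_events, handlers, max_time):
    """Process (time, target, payload) events in time order.

    Only time steps that actually carry an event are visited, which is
    the point of event-driven (as opposed to cycle-stepped) simulation.
    """
    queue = list(initial_events)
    heapq.heapify(queue)
    trace = []
    while queue:
        time, target, payload = heapq.heappop(queue)
        if time > max_time:
            break
        trace.append((time, target, payload))
        # A handler returns the follow-up events it triggers, if any.
        for event in handlers[target](time, payload):
            heapq.heappush(queue, event)
    return trace

# Hypothetical two-element pipeline: pe0 doubles a value and forwards it
# to pe1 one cycle later; pe1 is a sink.
def pe0(time, value):
    return [(time + 1, "pe1", value * 2)]

def pe1(time, value):
    return []

trace = simulate(
    [(0, "pe0", 3), (5, "pe0", 4)],
    {"pe0": pe0, "pe1": pe1},
    max_time=10,
)
print(trace)
```

In this sketch, cycles 2 to 4 and 7 to 10 are never visited because no event is scheduled for them; a cycle-stepped loop would evaluate every element in each of those cycles.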

Citations: 17

Operating system integrated energy aware scratchpad allocation strategies for multiprocess applications

Software and Compilers for Embedded Systems Pub Date : 2007-04-20 DOI: 10.1145/1269843.1269850

R. Pyka, Christoph Faßbach, Manish Verma, H. Falk, P. Marwedel

Citations: 29

Whole-program linear-constant analysis with applications to link-time optimization

Software and Compilers for Embedded Systems Pub Date : 2007-04-20 DOI: 10.1145/1269843.1269853

L. V. Put, Dominique Chanet, K. D. Bosschere

Citations: 2

Optimal chain rule placement for instruction selection based on SSA graphs

Software and Compilers for Embedded Systems Pub Date : 2007-04-20 DOI: 10.1145/1269843.1269857

Stefan Schäfer, Bernhard Scholz

Citations: 6

Reducing fine-grain communication overhead in multithread code generation for heterogeneous MPSoC

Software and Compilers for Embedded Systems Pub Date : 2007-04-20 DOI: 10.1145/1269843.1269855

L. Brisolara, Sang-Il Han, X. Guerin, L. Carro, R. Reis, S. Chae, A. Jerraya

Abstract: Heterogeneous MPSoCs present unique opportunities for emerging embedded applications, which require both high performance and programmability. However, software programming for these MPSoC architectures involves tedious and error-prone tasks, so automatic code generation tools are required. A code generation method based on fine-grain specification can provide more design space and optimization opportunities, such as exploiting fine-level parallelism and more efficient partitions. However, when partitioned, fine-grain models may require a large number of inter-processor communications, decreasing the overall system performance. This paper presents a Simulink-based multithread code generation method, which applies the Message Aggregation optimization technique to reduce the number of inter-processor communications. This technique reduces the communication overhead in terms of execution time, by reducing the number of messages exchanged, and in terms of memory size, by reducing the number of channels. The paper also presents experimental results for one multimedia application, showing the performance improvements and memory reduction obtained with the Message Aggregation technique.
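
The idea of merging many small inter-processor transfers into fewer messages can be pictured with a small sketch. This is an assumption-laden illustration, not the paper's Simulink-based algorithm: the function `aggregate_messages`, the processor names, and the signal names are invented, and the sketch ignores the scheduling and data-dependency constraints that a real code generator would have to respect.

```python
from collections import defaultdict

def aggregate_messages(transfers):
    """Group per-signal transfers into one message per (sender, receiver) pair."""
    grouped = defaultdict(list)
    for sender, receiver, signal in transfers:
        grouped[(sender, receiver)].append(signal)
    return dict(grouped)

# Four fine-grain transfers between two hypothetical processors.
transfers = [
    ("cpu0", "cpu1", "a"),
    ("cpu0", "cpu1", "b"),
    ("cpu1", "cpu0", "c"),
    ("cpu0", "cpu1", "d"),
]
messages = aggregate_messages(transfers)
print(len(transfers), "transfers ->", len(messages), "aggregated messages")
```

Here four separate transfers collapse into two messages, one per direction, which also means two channels instead of four.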

Citations: 17

Optimization of dynamic data structures in multimedia embedded systems using evolutionary computation

Software and Compilers for Embedded Systems Pub Date : 2007-04-20 DOI: 10.1145/1269843.1269849

David Atienza Alonso, Christos Baloukas, Lazaros Papadopoulos, C. Poucet, S. Mamagkakis, J. Hidalgo, F. Catthoor, D. Soudris, J. Lanchares

Abstract: Embedded consumer devices are increasing their capabilities and can now implement new multimedia applications reserved only for powerful desktops a few years ago. These applications share complex and intensive dynamic memory use. Thus, dynamic memory optimizations are a requirement when porting these applications. Within these optimizations, the refinement of the Dynamically (de)allocated Data Type (or DDT) implementations is one of the most important and difficult parts for an efficient mapping onto low-power embedded devices.

In this paper, we describe a new automatic optimization approach for the DDTs of object-oriented multimedia applications. It is based on an analytical pre-characterization of the possible elementary DDT blocks, and a multi-objective genetic algorithm to explore the design space and to select the best implementation according to different optimization criteria (i.e., memory accesses, memory footprint and energy consumption). Our results in real-life multimedia applications show that the best implementations of DDTs can be obtained in an automated way in a few hours, while designers would typically require days to find a suitable implementation, achieving important savings in exploration time with respect to other state-of-the-art heuristics-based optimization methods for this task.
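
Selecting implementations against several criteria at once usually relies on Pareto dominance. The sketch below shows only that dominance filter, not the genetic algorithm from the paper. The candidate names and the metric values (memory accesses, footprint in bytes, energy in microjoules) are made up purely for illustration.

```python
def dominates(a, b):
    """True if a is no worse than b on every metric and better on at least one.

    All metrics are costs, so lower is better.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates."""
    return {
        name: metrics
        for name, metrics in candidates.items()
        if not any(dominates(other, metrics) for other in candidates.values())
    }

# Hypothetical DDT candidates: (memory accesses, footprint in bytes, energy in uJ).
candidates = {
    "array": (120, 4096, 30.0),
    "singly_linked_list": (300, 1024, 55.0),
    "doubly_linked_list": (310, 2048, 60.0),
    "chunked_list": (150, 2048, 34.0),
}
front = pareto_front(candidates)
print(sorted(front))
```

With these invented numbers, the doubly linked list is dropped because the singly linked list is at least as good on all three metrics; the other three remain as trade-offs for a designer or a search procedure to choose among.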

Citations: 11

Automatic partitioning and mapping of stream-based applications onto the Intel IXP Network processor

Software and Compilers for Embedded Systems Pub Date : 2007-04-20 DOI: 10.1145/1269843.1269847

Sjoerd Meijer, B. Kienhuis, J. Walters, David Snuijf

Citations: 14

Systematic intermediate sequence removal for reduced memory accesses

Software and Compilers for Embedded Systems Pub Date : 2007-04-20 DOI: 10.1145/1269843.1269851

C. Poucet, S. Mamagkakis, David Atienza Alonso, F. Catthoor

Abstract: Modern software applications are growing in complexity and demand very intensive use of data. Therefore, a wide variety of data structures are utilized to facilitate the storage and access to these vast amounts of computed information. Additionally, the need for reliable software design and the development of large applications following the object-oriented paradigm increase the amount of dynamic buffers and redundant accesses to the data stored in these buffers. In this paper, we propose a systematic design optimization methodology to remove these intermediate dynamic buffers, thereby reducing the memory accesses of the targeted applications without altering the input-output behaviour of the algorithms. The reduction is focused on sequences and is especially relevant for embedded systems, which have limited on-chip communication bandwidth and whose memory subsystem consumes considerable energy, due to the energy cost associated with each memory access. The effectiveness of the proposed methodology is assessed in a 3D reconstruction multimedia application and shows a significant reduction in memory accesses. In addition, the general trends for memory improvement and the scalability of our approach are supported as well by a parameterized benchmark set.
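
To make the notion of an intermediate sequence concrete, the following sketch contrasts a computation that materializes two temporary lists with a fused version that produces the same result without them. It is a hand-written toy, assumed for illustration; the paper's methodology performs this kind of removal systematically, and the functions `with_intermediates` and `fused` are invented here.

```python
def with_intermediates(values):
    """Each stage writes a full temporary sequence that the next stage reads back."""
    scaled = [v * 2 for v in values]          # intermediate buffer 1
    kept = [v for v in scaled if v % 3 == 0]  # intermediate buffer 2
    return sum(kept)

def fused(values):
    """Same input-output behaviour, with no intermediate sequences stored."""
    total = 0
    for v in values:
        s = v * 2
        if s % 3 == 0:
            total += s
    return total

data = list(range(10))
print(with_intermediates(data), fused(data))
```

Both functions return the same value for any input list, but the fused version never stores or re-reads the temporary sequences, which is where the saving in memory accesses comes from.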

Citations: 0

Interference graphs for procedures in static single information form are interval graphs

Software and Compilers for Embedded Systems Pub Date : 2007-04-20 DOI: 10.1145/1269843.1269858

P. Brisk, M. Sarrafzadeh

Citations: 9

Improvements to the Psi-SSA representation

Software and Compilers for Embedded Systems Pub Date : 2007-04-20 DOI: 10.1145/1269843.1269859

F. D. Ferrière

Abstract: Modern compiler implementations use the Static Single Assignment representation [5] as a way to efficiently implement optimizing algorithms. However, this representation is not well adapted to architectures with a predicated instruction set. The ψ-SSA representation was first introduced in [11] as an extension to the Static Single Assignment representation. The ψ-SSA representation extends the SSA representation such that standard SSA algorithms can be easily adapted to an architecture with a fully predicated instruction set. A new pseudo operation, the ψ operation, is introduced to merge several conditional definitions into a unique definition.

This paper presents an adaptation of the ψ-SSA representation to support architectures with a partially predicated instruction set. The definition of the ψ operation is extended to support the generation and the optimization of partially predicated code. In particular, a predicate promotion transformation is introduced to reduce the number of predicated operations, as well as the number of operations used to compute guard registers. An out of ψ-SSA algorithm is also described, which fixes and improves the algorithm described in [11]. This algorithm is derived from the out of SSA algorithm from Sreedhar et al. [10], where the definitions of liveness and interferences have been extended for the ψ operations. This algorithm inserts predicated copy operations to restore the correct semantics in the program in a non-SSA form.

The ψ-SSA representation is used in our production compilers, based on the Open64 technology, for the ST200 family processors. In this compiler, predicated code is generated by an if-conversion algorithm performed under the ψ-SSA representation [12, 1].
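
The way a ψ operation merges several conditional definitions into one can be mimicked in a few lines. The sketch assumes the usual reading of ψ, in which arguments are listed in definition order and the last argument whose guard is true supplies the value. The helper names `psi` and `select_example` and the constants are hypothetical, and this models only the evaluation rule, not the compiler representation itself.

```python
def psi(*guarded_args):
    """Evaluate x = psi(p1?a1, ..., pn?an).

    Assumed semantics: arguments appear in definition order, and the
    last argument whose guard is true provides the result.
    """
    result = None  # stands for "undefined" when no guard is true
    for guard, value in guarded_args:
        if guard:
            result = value
    return result

def select_example(p, q):
    # Predicated code being modelled (hypothetical):
    #   x1 = 10            unconditional definition
    #   p ? x2 = 20        defined only when p holds
    #   q ? x3 = 30        defined only when q holds
    #   x4 = psi(True?x1, p?x2, q?x3)
    return psi((True, 10), (p, 20), (q, 30))

for p in (False, True):
    for q in (False, True):
        print(p, q, select_example(p, q))
```

When both guards hold, the later definition wins, matching what sequential execution of the predicated instructions would leave in the variable.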

Citations: 4