Software and Compilers for Embedded Systems: Latest Publications

Minimizing the cost of synchronisations in the WCET of real-time parallel programs
Software and Compilers for Embedded Systems Pub Date : 2014-06-10 DOI: 10.1145/2609248.2609261
Haluk Ozaktas, Christine Rochange, P. Sainrat
Abstract: Designing time-predictable architectures to support the requirements of hard real-time systems is the goal of several research projects. In this paper we assume that such platforms exist and we focus on the timing analysis of parallel real-time applications. One of the main challenges is to determine how much the delays induced by software constructs such as synchronisations can impact the worst-case execution times (WCETs) of parallel threads. In this paper, we refine state-of-the-art analysis: first, we derive more accurate estimations of stalls at critical sections; second, we introduce new locking primitives that minimise stall times on the worst-case path. Experimental results show noticeable improvements on the WCETs of benchmarks.
Citations: 11
A parallel action language for embedded applications and its compilation flow
Software and Compilers for Embedded Systems Pub Date : 2014-06-10 DOI: 10.1145/2609248.2609257
Ivan Llopard, Albert Cohen, Christian Fabre, N. Hili
Abstract: The complexity of Embedded System (ES) development is increasing dramatically. This has several cumulative sources: the intricate combination of data-intensive, computational and control aspects; the ubiquity of parallelism and heterogeneity of modern architectures; and the diversity of target-specific, non-deterministic programming models (e.g., C++ with explicit message passing, OpenCL, VHDL). Model-Driven Engineering (MDE) proposes to manage complexity by raising the level of abstraction for designers and developers, and refining the implementation for a particular context and platform through model transformations. In such frameworks, behavior is often specified by means of Hierarchical State Machines (HSMs) equipped with an action language. However, although such models represent some level of control parallelism through objects and HSMs, data parallelism, compound data, and the exploitation and optimization thereof remain very limited.
In this paper, we propose an action language that seamlessly combines HSMs with data parallelism and operations on compound data. It preserves the expressivity of HSM and captures a layout-neutral description of data organisation. It also extends message-passing with an intuitive semantics for this additional parallelism and provides a strong foundation for array-based optimisation techniques.
We present this language together with a baseline code generation flow to enable the production of efficient, low-level imperative code.
Citations: 7
Single-rate approximations of cyclo-static synchronous dataflow graphs
Software and Compilers for Embedded Systems Pub Date : 2014-06-10 DOI: 10.1145/2609248.2609249
R. D. Groote, P. Hölzenspies, J. Kuper, G. Smit
Abstract: Exact analysis of synchronous dataflow (SDF) graphs is often considered too costly, because of the expensive transformation of the graph into a single-rate equivalent. As an alternative, several authors have proposed approximate analyses. Existing approaches to approximation are based on the operational semantics of an SDF graph.
We propose an approach to approximation that is based on functional semantics. This generalises earlier work done on multi-rate SDF graphs towards cyclo-static SDF (CSDF) graphs. We take, as a starting point, a mathematical characterisation, and derive two transformations of a CSDF graph into HSDF graphs. These HSDF graphs have the same size as the CSDF graph, and are approximations: their respective temporal behaviours are optimistic and pessimistic with respect to the temporal behaviour of the CSDF graph. Analysis results computed for these single-rate approximations give bounds on the analysis results of the CSDF graph. As an illustration, we show how these single-rate approximations may be used to compute bounds on the buffer sizes required to reach a given throughput.
Citations: 6
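To give a feel for why the exact single-rate transformation mentioned in the abstract above is considered costly, the following is a minimal sketch for plain multi-rate SDF graphs only (not CSDF, and not the approximation proposed in the paper). It solves the standard balance equations `q[src] * produce == q[dst] * consume` to obtain the repetition vector; in the classical expansion, the single-rate equivalent has one actor per firing, that is `sum(q.values())` actors. The function name, the channel tuple layout, and the example graph are assumptions made for illustration, and the graph is assumed to be connected.

```python
from fractions import Fraction
from math import lcm  # multi-argument lcm needs Python 3.9+


def repetition_vector(actors, channels):
    """Solve the SDF balance equations for a connected graph.

    channels: list of (src, dst, produce, consume) tuples (hypothetical layout).
    Returns the smallest positive integer firing counts per actor.
    """
    # Undirected adjacency with the firing-rate ratio along each direction.
    adj = {a: [] for a in actors}
    for src, dst, produce, consume in channels:
        adj[src].append((dst, Fraction(produce, consume)))
        adj[dst].append((src, Fraction(consume, produce)))

    # Propagate relative rates from the first actor.
    rate = {actors[0]: Fraction(1)}
    stack = [actors[0]]
    while stack:
        a = stack.pop()
        for b, ratio in adj[a]:
            r = rate[a] * ratio
            if b not in rate:
                rate[b] = r
                stack.append(b)
            elif rate[b] != r:
                raise ValueError("inconsistent rates: no repetition vector exists")

    # Scale the rational rates to integers.
    scale = lcm(*(f.denominator for f in rate.values()))
    return {a: int(f * scale) for a, f in rate.items()}


# Example: A produces 2 tokens, B consumes 3; B produces 1, C consumes 2.
q = repetition_vector(["A", "B", "C"], [("A", "B", 2, 3), ("B", "C", 1, 2)])
print(q, "single-rate actors:", sum(q.values()))
```

In this small example the three-actor graph already expands to six single-rate actors; with larger rate ratios the expansion grows quickly, which is the cost that same-size approximations avoid.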
A data parallel view on polyhedral process networks
Software and Compilers for Embedded Systems Pub Date : 2011-06-27 DOI: 10.1145/1988932.1988939
A. Balevic, B. Kienhuis
Abstract: Emerging architectures in the embedded space are expected to make use of a diverse mix of multicores, vector-based units, GPU cores and special function accelerators. In order to facilitate mapping onto diverse architectures, different models of computation have been considered. Polyhedral Process Networks (PPNs) have been extensively used in automatic generation of task and pipeline parallel programs for embedded architectures. However, the single program multiple data (SPMD) type of data parallelism has not been addressed in the PPN model. In this paper, we propose a Data Parallel View (DPV) on PPNs which introduces abstractions necessary for capturing and exploiting data parallelism on top of the PPN model. As a proof of concept, we demonstrate how a PPN can be mapped onto a modern GPU using the DPV. By complementing the native PPN support for task and pipeline parallelism with the DPV support for data parallelism, we expect to make the best use of different types of architectural components and types of parallelism on heterogeneous architectures.
Citations: 7
The case for application specific compilers
Software and Compilers for Embedded Systems Pub Date : 2011-06-27 DOI: 10.1145/1988932.1988943
M. Beemster
Abstract: We believe it makes sense to develop compilers that are specific for particular application domains. In many areas of embedded computing, processor architectures are designed specifically to run a narrow band of application code very well. These architectures are unlike any the world has seen before and to program them is a challenge to say the least. CoSy's flexible compiler technology thrives in this area. By viewing the compiler as a means and not as a goal, it is possible to achieve spectacular results in a very short time-frame.
Citations: 0
Decoupled graph-coloring register allocation with hierarchical aliasing
Software and Compilers for Embedded Systems Pub Date : 2011-06-27 DOI: 10.1145/1988932.1988934
A. Tavares, Quentin Colombet, Mariza Bigonha, C. Guillon, Fernando Magno Quintão Pereira, F. Rastello
Abstract: Recent results have shown how to do graph-coloring-based register allocation in a way that decouples spilling from register assignment. This decoupled approach has the main advantage of simplifying the implementation of register allocators. However, the decoupled model, as described in previous works, faces many problems when dealing with register aliasing, a phenomenon typical in architectures usually seen in embedded systems, such as ARM. In this paper we introduce the semi-elementary form, a program representation that brings decoupled register allocation to architectures with register aliasing. The semi-elementary form is much smaller than program representations used by previous decoupled solutions, thus leading to register allocators that perform better in terms of time and space. Furthermore, this representation reduces the number of copies that traditional allocators insert into assembly programs. We have empirically validated our results by showing how our representation improves two well-known graph-coloring-based allocators, namely the Iterated Register Coalescer (IRC) and Bouchez et al.'s brute force (BF) method, both augmented with Smith et al.'s extensions to handle aliasing. Running our techniques on SPEC CPU 2000, we have reduced the number of nodes in the interference graphs by a factor of 4 to 5, hence speeding up allocation time by a factor of 3 to 5. Additionally, the semi-elementary form reduces by 8% the number of copies that IRC leaves uncoalesced.
Citations: 4
SMT-based optimization for synchronous programs
Software and Compilers for Embedded Systems Pub Date : 2011-06-27 DOI: 10.1145/1988932.1988935
Yu Bai, J. Brandt, K. Schneider
Abstract: In this paper, we present several optimization techniques to improve the runtime and size of the code generated from synchronous programs. These optimizations work on extended finite state machines (EFSMs) that can be used as an intermediate representation for any synchronous system. Our optimizations consist of two phases: First, local optimization guides the EFSM generation and considers the states and edges separately. Second, global optimization is based on a dataflow analysis of the entire EFSM. For both phases, we employ an SMT (Satisfiability Modulo Theories) solver to verify the individual optimization steps. Our experiments show the potential of the presented optimizations: optimized programs generally have a smaller size and a better run-time performance.
Citations: 4
B2P2: bounds based procedure placement for instruction TLB power reduction in embedded systems
Software and Compilers for Embedded Systems Pub Date : 2010-06-28 DOI: 10.1145/1811212.1811215
Reiley Jeyapaul, Aviral Shrivastava
Abstract: High performance embedded processors are equipped with the Translation Look-aside Buffer (TLB), which forms the key ingredient to efficient and speedy virtual memory management. The TLB, though small, is frequently accessed, and therefore not only consumes significant energy, but also is one of the important thermal hot-spots in the processor. Among the many circuit and microarchitectural techniques proposed to reduce TLB power consumption, the Use-Last TLB is one very efficient technique in which power is consumed only when different pages are accessed in succession, i.e., when there is a page-switch [26]. Though the Use-Last technique is effective in reducing i-TLB power, there is scope to further improve its effectiveness by changing the relative code placement of the program. In this work, we formulate the code placement problem to minimize the page-switches in a program. We prove that this problem is NP-complete and propose an efficient Bounds Based Procedure Placement (B2P2) heuristic to efficiently reduce the program's page-switches. Our procedure placement technique delivers an average of 76% reduction in the instruction-TLB power with negligible (< 2%) impact on performance, over and above the reduction achieved by the Use-Last TLB architecture alone.
Citations: 1
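The objective described in the abstract above, the number of page-switches under a given code placement, can be made concrete with a small sketch. This is not the B2P2 heuristic itself; it only evaluates the quantity being minimized under a deliberately simplified model. The assumptions are mine: execution is observed at procedure granularity, procedures are laid out back to back in the given order, and each procedure is attributed to the page containing its start address. All names (`layout`, `page_switches`, the example sizes and trace) are hypothetical.

```python
from typing import Dict, Sequence


def layout(order: Sequence[str], sizes: Dict[str, int]) -> Dict[str, int]:
    """Place procedures back to back in the given order; return start addresses."""
    start, addr = {}, 0
    for name in order:
        start[name] = addr
        addr += sizes[name]
    return start


def page_switches(trace: Sequence[str], order: Sequence[str],
                  sizes: Dict[str, int], page_size: int) -> int:
    """Count consecutive trace entries that fall on different pages."""
    start = layout(order, sizes)
    pages = [start[p] // page_size for p in trace]
    return sum(1 for a, b in zip(pages, pages[1:]) if a != b)


sizes = {"main": 512, "f": 512, "g": 512, "h": 512}
trace = ["main", "f", "main", "f", "main", "g"]

# "main" and "f" on different pages: every call and return switches pages.
print(page_switches(trace, ["main", "g", "f", "h"], sizes, 1024))  # 4
# "main" and "f" share a page: only the final call to "g" switches pages.
print(page_switches(trace, ["main", "f", "g", "h"], sizes, 1024))  # 1
```

Trying every ordering with such an evaluator is only feasible for a handful of procedures, which is consistent with the abstract's point that the general problem is NP-complete and calls for a heuristic.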
System level MPSoC design: a bright future for compiler technology?
Software and Compilers for Embedded Systems Pub Date : 2010-06-28 DOI: 10.1145/1811212.1811225
R. Leupers
Abstract: Looking back at the SCOPES history, compiler research for embedded processors started out in the 1990s with two major ambitions: (1) more architecture aware code optimizations to better support specialized target machines such as DSPs, and (2) higher flexibility to enable compiler retargeting over a wide range of machines. These research efforts have led to numerous results, many of which are part of industrial products today. So, what is left to do in embedded compilers and who, in a world with "free" tools like GCC and LLVM, will pay for them? Naturally, the evolution of embedded processor architectures demands a never-ending stream of code optimization innovations. However, we argue that the current trend towards ESL design of embedded MPSoC platforms opens up the most promising new opportunities for compiler research, going far beyond the obvious problem of sequential code partitioning. Increasingly complex software stacks, consolidation of the MPSoC platform market, and higher design abstraction levels induce many interesting novel compiler technology use cases, some of which will be highlighted in this presentation.
Citations: 0
Parallel copy motion
Software and Compilers for Embedded Systems Pub Date : 2010-06-28 DOI: 10.1145/1811212.1811214
Florent Bouchez, Quentin Colombet, A. Darte, F. Rastello, C. Guillon
Abstract: Recent results on the static single assignment (SSA) form open promising directions for the design of register allocation heuristics for just-in-time (JIT) compilation. In particular, tree-scan allocators with two decoupled phases, one for spilling and one for splitting/coloring/coalescing, seem good candidates for designing fast, memory-friendly, and competitive register allocators. Linear-scan allocators, introduced earlier, are also well-suited for JIT compilation. All do live-range splitting (mostly on control-flow edges) to avoid spilling but most of them perform coalescing poorly, leading to many register-to-register copies inside basic blocks, but also, implicitly, on the control-flow graph edges, leading to edge splitting.
This paper presents parallel copy motion, a technique for optimizing register-allocated codes, which amounts to moving a group of parallel copy instructions from one program point to another. While the scheduling is shackled by data dependencies, a copy can "traverse" all instructions of a basic block, thanks to register renaming, except those with conflicting naming constraints. Also, with an adequate management of compensation code, parallel copies can be moved across edges. A first application is reducing the cost of copies by a better placement. A second application is moving copies out of critical edges, i.e., edges going from a block with multiple successors to a block with multiple predecessors. This is often beneficial compared to the alternative: splitting the edge. A direct use case is the handling of control-flow graphs with non-splittable edges, introduced by some compilers for specific architectural constraints, region boundaries, or exception handling code.
Experiments with the SPECint and our own benchmark suite show that an SSA-based register allocator can be applied broadly now, even for procedures with non-splittable edges: while those procedures could not be compiled before, with parallel copy motion, all moves could be pushed out of such edges. Even simple strategies for moving copies out of edges and inside basic blocks show some average improvement compared to the standard edge-splitting strategy (3% speedup), with a great reduction of the weighted number of copies (21% move cost reduction for SPECint). This leads us to believe that the approach is promising, and not only for improving coalescing in fast register allocators.
Citations: 9
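The abstract above defines a critical edge as an edge going from a block with multiple successors to a block with multiple predecessors. A minimal sketch of that definition follows; it only identifies such edges in a control-flow graph and does not implement parallel copy motion. The edge-list representation, the function name, and the example graph are assumptions made for illustration.

```python
from collections import defaultdict


def critical_edges(edges):
    """Return CFG edges whose source has several successors and whose
    destination has several predecessors."""
    succ, pred = defaultdict(set), defaultdict(set)
    for src, dst in edges:
        succ[src].add(dst)
        pred[dst].add(src)
    return sorted((s, d) for s, d in set(edges)
                  if len(succ[s]) > 1 and len(pred[d]) > 1)


# "entry" branches to A or directly to B; A falls through to B.
cfg = [("entry", "A"), ("entry", "B"), ("A", "B"), ("B", "exit")]
print(critical_edges(cfg))  # [('entry', 'B')]
```

A copy needed only on `entry -> B` cannot be placed at the end of `entry` (it would also run on the path to `A`) nor at the start of `B` (it would also run when coming from `A`), which is why such edges are normally split or, as the paper proposes, the copies are moved elsewhere.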