Software and Compilers for Embedded Systems: Latest Publications

Exploiting critical data regions to reduce data cache energy consumption
Software and Compilers for Embedded Systems Pub Date: 2014-06-10 DOI: 10.1145/2609248.2609253
K. Vardhan, Y. Srikant
Abstract: In this paper we propose an energy-aware optimization that exploits the latency tolerance of data regions in programs. We propose techniques to identify data regions and rate their criticality using a dynamic critical path model. We compare the latency tolerance of data regions to existing characteristics such as access frequency and size of data regions. We leverage previously proposed drowsy cache lines to design an optimization that can reduce energy consumption in a data cache. We target this optimization at a simplified single-threaded, single-core system with a private cache, which can be part of any type of multi-core processor. We compare this technique to existing optimizations that use drowsy caches. We experimentally show that this technique can yield total power savings close to 38% and leakage power savings of 20% in the data cache when compared to a baseline configuration, without any significant performance penalty.
Citations: 6
Minimizing the cost of synchronisations in the WCET of real-time parallel programs
Software and Compilers for Embedded Systems Pub Date: 2014-06-10 DOI: 10.1145/2609248.2609261
Haluk Ozaktas, Christine Rochange, P. Sainrat
Abstract: Designing time-predictable architectures to support the requirements of hard real-time systems is the goal of several research projects. In this paper we assume that such platforms exist and we focus on the timing analysis of parallel real-time applications. One of the main challenges is to determine how much the delays induced by software constructs such as synchronisations can impact the worst-case execution times (WCETs) of parallel threads. In this paper, we refine state-of-the-art analysis: first, we derive more accurate estimations of stalls at critical sections; second, we introduce new locking primitives that minimise stall times on the worst-case path. Experimental results show noticeable improvements on the WCETs of benchmarks.
Citations: 11
A parallel action language for embedded applications and its compilation flow
Software and Compilers for Embedded Systems Pub Date: 2014-06-10 DOI: 10.1145/2609248.2609257
Ivan Llopard, Albert Cohen, Christian Fabre, N. Hili
Abstract: The complexity of Embedded System (ES) development is increasing dramatically. This has several cumulative sources: the intricate combination of data-intensive, computational and control aspects; the ubiquity of parallelism and heterogeneity of modern architectures; and the diversity of target-specific, non-deterministic programming models (e.g., C++ with explicit message passing, OpenCL, VHDL). Model-Driven Engineering (MDE) proposes to manage complexity by raising the level of abstraction for designers and developers, and refining the implementation for a particular context and platform through model transformations. In such frameworks, behavior is often specified by means of Hierarchical State Machines (HSMs) equipped with an action language. However, although such models represent some level of control parallelism through objects and HSMs, data parallelism, compound data, and the exploitation and optimization thereof remain very limited.
In this paper, we propose an action language that seamlessly combines HSMs with data parallelism and operations on compound data. It preserves the expressivity of HSMs and captures a layout-neutral description of data organisation. It also extends message passing with an intuitive semantics for this additional parallelism and provides a strong foundation for array-based optimisation techniques.
We present this language together with a baseline code generation flow to enable the production of efficient, low-level imperative code.
Citations: 7
Single-rate approximations of cyclo-static synchronous dataflow graphs
Software and Compilers for Embedded Systems Pub Date: 2014-06-10 DOI: 10.1145/2609248.2609249
R. D. Groote, P. Hölzenspies, J. Kuper, G. Smit
Abstract: Exact analysis of synchronous dataflow (SDF) graphs is often considered too costly, because of the expensive transformation of the graph into a single-rate equivalent. As an alternative, several authors have proposed approximate analyses. Existing approaches to approximation are based on the operational semantics of an SDF graph.
We propose an approach to approximation that is based on functional semantics. This generalises earlier work done on multi-rate SDF graphs towards cyclo-static SDF (CSDF) graphs. We take, as a starting point, a mathematical characterisation, and derive two transformations of a CSDF graph into HSDF graphs. These HSDF graphs have the same size as the CSDF graph, and are approximations: their respective temporal behaviours are optimistic and pessimistic with respect to the temporal behaviour of the CSDF graph. Analysis results computed for these single-rate approximations give bounds on the analysis results of the CSDF graph. As an illustration, we show how these single-rate approximations may be used to compute bounds on the buffer sizes required to reach a given throughput.
Citations: 6
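Note on the entry "Single-rate approximations of cyclo-static synchronous dataflow graphs": the cost of the exact single-rate transformation mentioned in its abstract can be illustrated with the standard SDF balance equations. The sketch below is not the paper's method. It only computes the repetition vector of a plain multi-rate SDF graph, whose entries sum to the number of nodes in the exact single-rate (HSDF) equivalent. The function name `repetition_vector`, the edge tuple layout `(src, dst, prod, cons)`, and the example graphs are illustrative assumptions, and the graph is assumed to be connected.

```python
from fractions import Fraction
from math import lcm  # multi-argument lcm needs Python 3.9+


def repetition_vector(actors, edges):
    """Solve the balance equations q[src] * prod == q[dst] * cons.

    edges is a list of (src, dst, prod, cons) tuples (hypothetical layout).
    Assumes a connected graph; raises ValueError on inconsistent rates.
    """
    # Fix the first actor's rate to 1 and propagate relative rates.
    rate = {actors[0]: Fraction(1)}
    pending = list(edges)
    changed = True
    while pending and changed:
        changed = False
        for edge in list(pending):
            src, dst, prod, cons = edge
            if src in rate and dst not in rate:
                rate[dst] = rate[src] * prod / cons
            elif dst in rate and src not in rate:
                rate[src] = rate[dst] * cons / prod
            elif src in rate and dst in rate:
                if rate[src] * prod != rate[dst] * cons:
                    raise ValueError("inconsistent rates")
            else:
                continue  # neither endpoint reached yet; retry later
            pending.remove(edge)
            changed = True
    # Scale the fractional rates to the smallest integer solution.
    scale = lcm(*(r.denominator for r in rate.values()))
    return {a: int(r * scale) for a, r in rate.items()}


# Small graph: A produces 2 / B consumes 3, B produces 1 / C consumes 2.
small = repetition_vector(["A", "B", "C"], [("A", "B", 2, 3), ("B", "C", 1, 2)])
# Larger rates: the single-rate equivalent grows with the rates,
# not with the number of actors.
large = repetition_vector(["A", "B", "C"], [("A", "B", 100, 1), ("B", "C", 1, 100)])
print(small, sum(small.values()))
print(large, sum(large.values()))
```

Both example graphs have three actors, yet the second needs far more single-rate nodes than the first. This growth is what makes same-size approximations, such as those described in the abstract, attractive.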
A framework for dynamic parallelization of FPGA-accelerated applications
Software and Compilers for Embedded Systems Pub Date: 2014-06-10 DOI: 10.1145/2609248.2609256
J. Fowers, Jianye Liu, G. Stitt
Abstract: High-level synthesis and compiler studies have introduced many compile-time techniques for parallelizing applications. However, one fundamental limitation of compile-time optimization is the requirement for pessimistic dependence assumptions that can significantly restrict parallelism. To avoid this limitation, many compilers require a restrictive coding style that is not practical for many designers. We present a more transparent approach that aggressively parallelizes applications by dynamically analyzing actual runtime dependencies and scheduling functions onto multiple devices when dependencies allow. In addition, the approach applies FPGA-specific pipelining optimizations to exploit deep parallelism in chains of dependent functions. Experimental results show a speedup of 4.9x for a video-processing application compared to sequential software execution and a speedup of 5.6x compared to traditional FPGA execution, with a framework overhead of only 4%.
Citations: 1
Temporal analysis model extraction for optimizing modal multi-rate stream processing applications
Software and Compilers for Embedded Systems Pub Date: 2014-06-10 DOI: 10.1145/2609248.2609252
Stefan J. Geuns, J. Hausmans, M. Bekooij
Abstract: Modern real-time stream processing applications, such as Software Defined Radio (SDR) applications, typically have multiple modes and multi-rate behavior. Modes are often described using while-loops whereas multi-rate behavior is frequently described using arrays with pseudo-random indexing patterns. The temporal properties of these applications have to be analyzed in order to determine whether optimizations improve throughput. However, no method exists in which a temporal analysis model is derived from these applications that is suitable for temporal analysis and optimization.
In this paper an approach is presented in which a concurrency model for the temporal analysis and optimization of stream processing applications is automatically extracted from a parallelized sequential application. With this model it can be determined whether a program transformation improves the worst-case temporal behavior. The key feature of the presented approach is that arrays with arbitrary indexing patterns can be described, allowing the description of multi-rate behavior, while still supporting the description of modes using while-loops. In the model, an over-approximation of the synchronization dependencies is used in the case of arrays with pseudo-random indexing patterns. Despite the use of this approximation, we show that deadlock is only concluded from the model if there is also deadlock in the parallelized application.
The relevance and applicability of the presented approach are demonstrated using an Orthogonal Frequency-Division Multiplexing (OFDM) transmitter application.
Citations: 8
A data parallel view on polyhedral process networks
Software and Compilers for Embedded Systems Pub Date: 2011-06-27 DOI: 10.1145/1988932.1988939
A. Balevic, B. Kienhuis
Abstract: Emerging architectures in the embedded space are expected to make use of a diverse mix of multicores, vector-based units, GPU cores and special function accelerators. In order to facilitate mapping onto diverse architectures, different models of computation have been considered. Polyhedral Process Networks (PPNs) have been extensively used in automatic generation of task and pipeline parallel programs for embedded architectures. However, the single program multiple data (SPMD) type of data parallelism has not been addressed in the PPN model. In this paper, we propose a Data Parallel View (DPV) on PPNs which introduces abstractions necessary for capturing and exploiting data parallelism on top of the PPN model. As a proof of concept, we demonstrate how a PPN can be mapped onto a modern GPU using the DPV. By complementing the native PPN support for task and pipeline parallelism with the DPV support for data parallelism, we expect to make the best use of different types of architectural components and types of parallelism on heterogeneous architectures.
Citations: 7
Resource-aware programming and simulation of MPSoC architectures through extension of X10
Software and Compilers for Embedded Systems Pub Date: 2011-06-27 DOI: 10.1145/1988932.1988941
Frank Hannig, Sascha Roloff, G. Snelting, J. Teich, Andreas Zwinkau
Abstract: The efficient use of future MPSoCs with 1000 or more processor cores requires new means of resource-aware programming to deal with increasing imperfections such as process variation, fault rates, aging effects, and power as well as thermal problems. In this paper, we apply a new approach called invasive computing that enables an application programmer to spread computations to processors deliberately at certain points of the program. Such decisions can be made depending on the degree of application parallelism and the state of the underlying resources such as utilization, load, and temperature. The introduced programming constructs for resource-aware programming are embedded into the parallel computing language X10, as developed by IBM, using a library-based approach. Moreover, we show how individual heterogeneous MPSoC architectures may be modeled for subsequent functional simulation by defining compute resources such as processors themselves by lightweight threads that are executed in parallel together with the application threads by the X10 run-time system.
Thus, the state changes of each hardware resource may be simulated, including temperature, aging, and other useful monitor functionality, to provide a first high-level programming test-bed for invasive computing.
Citations: 38
Static run-time mode extraction by state partitioning in synchronous process networks
Software and Compilers for Embedded Systems Pub Date: 2011-06-27 DOI: 10.1145/1988932.1988938
M. Beyer, S. Glesner
Abstract: Process Networks (PNs) are used for modeling streaming-oriented applications with changing behavior, which must be mapped onto a concurrent architecture to meet the performance and energy constraints of embedded devices. Finding an optimal mapping of Process Networks to the constrained architecture presumes that the behavior of the PN is statically known. In this paper we present a static analysis for synchronous PNs that partitions the state space to extract run-time modes based on a Data Augmented Control Flow Automaton (DACFA). The result is a mode automaton whose nodes describe identified program modes and whose edges represent transitions among them. Optimizing back-ends mapping from PNs to concurrent architectures can be guided by these analysis results.
Citations: 0
The case for application specific compilers
Software and Compilers for Embedded Systems Pub Date: 2011-06-27 DOI: 10.1145/1988932.1988943
M. Beemster
Abstract: We believe it makes sense to develop compilers that are specific to particular application domains. In many areas of embedded computing, processor architectures are designed specifically to run a narrow band of application code very well. These architectures are unlike any the world has seen before, and to program them is a challenge, to say the least. CoSy's flexible compiler technology thrives in this area. By viewing the compiler as a means and not as a goal, it is possible to achieve spectacular results in a very short time-frame.
Citations: 0