Proceedings of the 19th International Workshop on Software and Compilers for Embedded Systems: Latest Articles

Compiler-Directed Data Locality Optimization in MATLAB

Proceedings of the 19th International Workshop on Software and Compilers for Embedded Systems Pub Date : 2016-05-23 DOI: 10.1145/2906363.2906378

Christakis Lezos, I. Latifis, G. Dimitroulakos, K. Masselos

Abstract: Array programming languages, such as MATLAB, are often used for algorithm development by scientists and engineers without taking into consideration implementation related issues and with limited emphasis on relevant optimizations. Application code optimization, especially in terms of data storage and transfer behavior, is still an important issue and heavily affects implementations' quality in terms of performance, power consumption etc. Efficient approaches for the optimization of high level application code are required to derive high quality implementations while still reducing development time and cost. This paper presents MemAssist, a software tool supporting application developers in detecting parts of the application code in MATLAB that do not exploit efficiently the targeted processor architecture and especially the memory hierarchy. Furthermore, the proposed tool guides application developers in applying code transformations in MATLAB for the optimization of the algorithm's temporal data locality. An image processing algorithm has been optimized using MemAssist as a practical usage scenario. Experimental results prove that the use of MemAssist can heavily reduce cache misses (up to 40%) and improve execution time (up to 30% speedup) on two different processor architectures. Thus, MemAssist can be used for optimized application code development that can lead to efficient implementations while still reducing development time and cost.
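Temporal data locality, as mentioned in the abstract above, means reusing a value soon after it was last touched, while it is likely still in cache. Loop fusion is one well-known transformation of this kind. The sketch below only illustrates the shape of such a rewrite; the abstract does not say which transformations MemAssist suggests, and the function names and sample data are hypothetical.

```python
def two_pass(pixels, gain):
    """Two separate loops: the temporary list is fully written, then re-read later."""
    doubled = [p + p for p in pixels]      # pass 1 produces every intermediate value
    return [d * gain for d in doubled]     # pass 2 reads them again, long after they were written


def fused_pass(pixels, gain):
    """One fused loop: each intermediate value is consumed right after it is produced."""
    return [(p + p) * gain for p in pixels]


# Hypothetical pixel data, used only to show both versions agree.
sample = [1, 2, 3, 4]
two_pass_result = two_pass(sample, 3)
fused_result = fused_pass(sample, 3)
```

Both functions compute the same result; the fused version simply shortens the distance between producing and consuming each value. Python lists do not expose cache behavior directly, so this shows the structure of the transformation rather than its measured effect.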

Citations: 3

Studying the Impact of Bit Switching on CPU Energy

Proceedings of the 19th International Workshop on Software and Compilers for Embedded Systems Pub Date : 2016-05-23 DOI: 10.1145/2906363.2906382

Ghassan Shobaki, Najm Eldeen Abu Rmaileh, J. Jamal

Abstract: It has been proposed in previous work that compiler instruction scheduling may reduce energy consumption by reordering instructions to minimize bit switching. Multiple algorithms have been proposed in the literature for performing this form of instruction scheduling. However, the impact of these algorithms on actual energy consumption has not been quantified using real hardware measurements; only simulation results have been reported. In this paper, we study the impact of bit switching on the CPU energy consumption using direct hardware measurements on a modern ARM processor. The measurements are performed using an energy probe provided by ARM. The experimental results show that the switching energy is significant and measurable, thus negating the hypothesis that compiling for performance is equivalent to compiling for energy. Yet, our experimental evaluation of multiple bit-switching-aware algorithms suggests that developing a compiler scheduling algorithm for reducing energy consumption by minimizing bit switching is quite challenging, because bit switching may conflict with execution time. An instruction order that minimizes bit switching but increases execution time may result in an overall increase in CPU energy, because the execution time has a higher impact on CPU energy than bit switching. In conclusion, our experimental results show that although performance is a primary factor that affects energy, it is not the only factor; switching energy is another significant factor.
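As a rough illustration of the quantity discussed in the abstract above, bit switching between consecutively issued instructions can be approximated as the Hamming distance between their binary encodings. The sketch below is not the authors' scheduling algorithm or measurement setup; the 32-bit words and the function names are hypothetical, and the reordering ignores data dependencies.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two encodings differ."""
    return bin(a ^ b).count("1")


def total_bit_switching(encodings: list[int]) -> int:
    """Sum of bit toggles between each pair of adjacent instruction words."""
    return sum(hamming_distance(x, y) for x, y in zip(encodings, encodings[1:]))


# Hypothetical 32-bit instruction words, not taken from any real program.
original_order = [0x0000FFFF, 0xFFFF0000, 0x0000FFFE]
reordered = [0x0000FFFF, 0x0000FFFE, 0xFFFF0000]

original_toggles = total_bit_switching(original_order)   # 32 + 31 = 63
reordered_toggles = total_bit_switching(reordered)       # 1 + 31 = 32
```

Placing the two similar words next to each other lowers the toggle count from 63 to 32. As the abstract cautions, an order with fewer toggles can still cost more energy overall if it lengthens execution time.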

Citations: 1

A Task-Level Monitoring Framework for Multi-Processor Platforms

Proceedings of the 19th International Workshop on Software and Compilers for Embedded Systems Pub Date : 2016-05-23 DOI: 10.1145/2906363.2906373

Philipp Ittershagen, Kim Grüttner, W. Nebel

Citations: 1

Machine Learning Approach to Generate Pareto Front for List-scheduling Algorithms

Proceedings of the 19th International Workshop on Software and Compilers for Embedded Systems Pub Date : 2016-05-23 DOI: 10.1145/2906363.2906380

Pham Nam Khanh, Akash Kumar, Khin Mi Mi Aung

Abstract: List Scheduling is one of the most widely used techniques for scheduling due to its simplicity and efficiency. In traditional list-based schedulers, a cost/priority function is used to compute the priority of tasks/jobs and put them in an ordered list. The cost function has been becoming more and more complex to cover increasing number of constraints in the system design. However, most of the existing list-based schedulers implement a static priority function that usually provides only one schedule for each task graph input. Therefore, they may not be able to satisfy the desire of system designers, who want to examine the trade-off between a number of design requirements (performance, power, energy, reliability ...). To address this problem, we propose a framework to utilize the Genetic Algorithm (GA) for exploring the design space and obtaining Pareto-optimal design points. Furthermore, multiple regression techniques are used to build predictive models for the Pareto fronts to limit the execution time of GA. The models are built using training task graph datasets and applied on incoming task graphs. The Pareto fronts for incoming task graphs are produced in time 2 orders of magnitude faster than the traditional GA, with only 4% degradation in the quality.
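The notion of a Pareto front used in the abstract above can be illustrated with a small dominance filter. This is only a sketch of the definition, not the paper's genetic algorithm or regression models; the two objectives (makespan and energy, both minimized) and the sample values are hypothetical.

```python
def dominates(p, q):
    """True if p is no worse than q in every objective and strictly better in at least one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def pareto_front(points):
    """Keep only the points that no other point dominates (all objectives minimized)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (makespan, energy) pairs for five candidate schedules.
schedules = [(10, 5.0), (8, 7.0), (12, 4.0), (9, 7.5), (11, 5.0)]
front = pareto_front(schedules)
```

Here `(9, 7.5)` is dropped because `(8, 7.0)` is better in both objectives, and `(11, 5.0)` is dropped because `(10, 5.0)` is faster at equal energy. The three remaining points are the trade-offs a designer would actually compare.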

Citations: 3

A Design Framework for Mapping Vectorized Synchronous Dataflow Graphs onto CPU-GPU Platforms

Proceedings of the 19th International Workshop on Software and Compilers for Embedded Systems Pub Date : 2016-05-23 DOI: 10.1145/2906363.2906374

Shuoxin Lin, Yanzhou Liu, W. Plishker, S. Bhattacharyya

Abstract: Heterogeneous computing platforms with multicore central processing units (CPUs) and graphics processing units (GPUs) are of increasing interest to designers of embedded signal processing systems since they offer the potential for significant performance boost while maintaining the flexibility of software-based design flows. Developing optimized implementations for CPU-GPU platforms is challenging due to complex, inter-related design issues, including task scheduling, interprocessor communication, memory management, and modeling and exploitation of different forms of parallelism. In this paper, we present an automated, dataflow based, design framework called DIF-GPU for application mapping and software synthesis on heterogeneous CPU-GPU platforms. DIF-GPU is based on novel extensions to the dataflow interchange format (DIF) package, which is a software environment for developing and experimenting with dataflow-based design methods and synthesis techniques for embedded signal processing systems. DIF-GPU exploits multiple forms of parallelism by deeply incorporating efficient vectorization and scheduling techniques for synchronous dataflow specifications, and incorporating techniques for streamlining interprocessor communication. DIF-GPU also provides software synthesis capabilities to help accelerate the process of moving from high-level application models to optimized implementations.

Citations: 10

Exploring Single Source Shortest Path Parallelization on Shared Memory Accelerators

Proceedings of the 19th International Workshop on Software and Compilers for Embedded Systems Pub Date : 2016-05-23 DOI: 10.1145/2906363.2915925

D. Palossi, A. Marongiu

Citations: 1

In-Place Update in a Dataflow Synchronous Language: A Retiming-Enabled Language Experiment

Proceedings of the 19th International Workshop on Software and Compilers for Embedded Systems Pub Date : 2016-05-23 DOI: 10.1145/2906363.2906379

Ulysse Beaugnon, Albert Cohen, Marc Pouzet

Citations: 1

An Extensible Platform Description Language Supporting Retargetable Toolchains and Adaptive Execution

Proceedings of the 19th International Workshop on Software and Compilers for Embedded Systems Pub Date : 2016-05-23 DOI: 10.1145/2906363.2906366

C. Kessler, Lu Li, A. Atalar, A. Dobre

Citations: 2

Introducing MoC Drivers for the Integration of Sensor-Actuator Behaviors in Model-Based Design Flows of Embedded Systems

Proceedings of the 19th International Workshop on Software and Compilers for Embedded Systems Pub Date : 2016-05-23 DOI: 10.1145/2906363.2906368

Omair Rafique, K. Schneider

Abstract: Model-based design flows for embedded systems have been introduced to allow late design changes while still keeping tight time-to-market deadlines. In general, these design flows start with abstract models and refine these to a final implementation maintaining already implemented properties. However, essentially all of these design flows suffer from a deployment gap in the sense that the finally generated files are general program files which assume a particular model of computation (MoC) that may not be provided by the chosen target architecture. For this reason, the final deployment is usually a non-trivial manual design step that can break all correctness-by-construction guarantees of the previous model-based design. In this paper, we therefore introduce the idea of MoC drivers which wraps the real sensor and actuator interaction in a shell that provides the MoC of the generated software. As a particular example, we discuss in this paper how MoC drivers bridge the deployment gap between automatically generated dataflow programs and event-driven behaviors of the target architecture. The approach is illustrated with a Speedometer application on a distributed automotive embedded platform.

Citations: 1

Practical Challenges of ILP-based SPM Allocation Optimizations

Proceedings of the 19th International Workshop on Software and Compilers for Embedded Systems Pub Date : 2016-05-23 DOI: 10.1145/2906363.2906371

Dominic Oehlert, Arno Luppold, H. Falk

Citations: 9