Automatic parallelization of embedded software using hierarchical task graphs and integer linear programming

2010 IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS) Pub Date : 2010-10-24 DOI:10.1145/1878961.1879009

D. Cordes, P. Marwedel, A. Mallik

{"title":"Automatic parallelization of embedded software using hierarchical task graphs and integer linear programming","authors":"D. Cordes, P. Marwedel, A. Mallik","doi":"10.1145/1878961.1879009","DOIUrl":null,"url":null,"abstract":"The last years have shown that there is no way to disregard the advantages provided by multiprocessor System-on-Chip (MPSoC) architectures in the embedded systems domain. Using multiple cores in a single system enables to close the gap between energy consumption, problems concerning heat dissipation, and computational power. Nevertheless, these benefits do not come for free. New challenges arise, if existing applications have to be ported to these multiprocessor platforms. One of the most ambitious tasks is to extract efficient parallelism from these existing sequential applications. Hence, many parallelization tools have been developed, most of them are extracting as much parallelism as possible, which is in general not the best choice for embedded systems with their limitations in hardware and software support. In contrast to previous approaches, we present a new automatic parallelization tool, tailored to the particular requirements of the resource constrained embedded systems. Therefore, this paper presents an algorithm which automatically steers the granularity of the generated tasks, with respect to architectural requirements and the overall execution time reduction. For this purpose, we exploit hierarchical task graphs to simplify a new integer linear programming based approach in order to split up sequential programs in an efficient way. Results on real-life benchmarks have shown that the presented approach is able to speed sequential applications up by a factor of up to 3.7 on a four core MPSoC architecture.","PeriodicalId":118816,"journal":{"name":"2010 IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"56","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2010 IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1878961.1879009","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 56

Abstract

The last years have shown that there is no way to disregard the advantages provided by multiprocessor System-on-Chip (MPSoC) architectures in the embedded systems domain. Using multiple cores in a single system enables to close the gap between energy consumption, problems concerning heat dissipation, and computational power. Nevertheless, these benefits do not come for free. New challenges arise, if existing applications have to be ported to these multiprocessor platforms. One of the most ambitious tasks is to extract efficient parallelism from these existing sequential applications. Hence, many parallelization tools have been developed, most of them are extracting as much parallelism as possible, which is in general not the best choice for embedded systems with their limitations in hardware and software support. In contrast to previous approaches, we present a new automatic parallelization tool, tailored to the particular requirements of the resource constrained embedded systems. Therefore, this paper presents an algorithm which automatically steers the granularity of the generated tasks, with respect to architectural requirements and the overall execution time reduction. For this purpose, we exploit hierarchical task graphs to simplify a new integer linear programming based approach in order to split up sequential programs in an efficient way. Results on real-life benchmarks have shown that the presented approach is able to speed sequential applications up by a factor of up to 3.7 on a four core MPSoC architecture.

查看原文本刊更多论文

使用分层任务图和整数线性规划的嵌入式软件自动并行化

过去的几年已经表明，在嵌入式系统领域，没有办法忽视多处理器片上系统(MPSoC)架构所提供的优势。在单个系统中使用多个核心可以缩小能耗、散热问题和计算能力之间的差距。然而，这些好处并不是免费的。如果现有的应用程序必须移植到这些多处理器平台上，就会出现新的挑战。最具挑战性的任务之一是从这些现有的顺序应用程序中提取有效的并行性。因此，已经开发了许多并行化工具，其中大多数都是尽可能多地提取并行性，这通常不是嵌入式系统在硬件和软件支持方面的最佳选择。与以前的方法相比，我们提出了一种新的自动并行化工具，针对资源受限的嵌入式系统的特定需求量身定制。因此，本文提出了一种算法，根据体系结构需求和总体执行时间的减少，自动控制生成任务的粒度。为此，我们利用分层任务图来简化一种新的基于整数线性规划的方法，以便有效地分割顺序程序。实际基准测试的结果表明，所提出的方法能够在四核MPSoC架构上将顺序应用程序的速度提高3.7倍。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

2010 IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)

自引率

0.00%

发文量