Exploiting just-enough parallelism when mapping streaming applications in hard real-time systems

2013 50th ACM/EDAC/IEEE Design Automation Conference (DAC). Pub Date: 2013-05-29. DOI: 10.1145/2463209.2488944

J. Zhai, M. Bamakhrama, T. Stefanov

Citations: 29

Abstract

Embedded streaming applications specified using parallel Models of Computation (MoC) often contain ample parallelism, which can be exploited using Multi-Processor System-on-Chip (MPSoC) platforms. It has been shown that the various forms of parallelism in an application should be explored to achieve the maximum system performance. However, if more parallelism is revealed than needed, it will overload the underlying MPSoC platform. At the same time, the revealed parallelism should be sufficient for the MPSoC platform to be fully utilized. Therefore, the amount of revealed and exploited parallelism has to be just enough with respect to the platform constraints. In this paper, we study the problem of exploiting just-enough parallelism by application task unfolding when mapping streaming applications modeled using the Synchronous Data Flow (SDF) MoC onto MPSoC platforms in hard real-time systems. We show that the problem of simultaneously unfolding and allocating tasks under hard real-time scheduling has a bounded solution space, and we derive its upper bounds. Subsequently, we devise an efficient algorithm that solves the problem such that the obtained solution meets a pre-specified quality. Experiments on a set of real-life streaming applications demonstrate that our algorithm produces, within a reasonable amount of time, a system specification with a large performance gain. Finally, we show that our proposed algorithm is on average 100 times faster than one of the state-of-the-art meta-heuristics, namely the NSGA-II genetic algorithm, while achieving the same quality of solutions.
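The idea of unfolding tasks only as far as the platform requires can be illustrated with a small sketch. This is not the algorithm from the paper: it is a toy model under stated assumptions. Each task is assumed to be a periodic task with a worst-case execution time and a period; unfolding a task by a factor f is assumed to create f replicas, each with a period f times longer; allocation uses first-fit decreasing; and schedulability is checked with the partitioned EDF utilization test for implicit deadlines (per-processor utilization at most 1). The names `just_enough_unfolding`, `first_fit_decreasing`, and `max_factor` are hypothetical.

```python
from math import ceil

EPS = 1e-9  # tolerance for floating-point comparisons


def first_fit_decreasing(utils, num_procs):
    """Try to place utilizations on processors so that each load is <= 1.

    Returns the list of per-processor loads, or None if some item does not fit.
    """
    loads = [0.0] * num_procs
    for u in sorted(utils, reverse=True):
        for p in range(num_procs):
            if loads[p] + u <= 1.0 + EPS:
                loads[p] += u
                break
        else:
            return None  # no processor can host this replica
    return loads


def just_enough_unfolding(tasks, num_procs, max_factor=8):
    """Find small unfolding factors that make the task set allocatable.

    tasks: dict mapping task name -> (wcet, period).
    Returns a dict name -> unfolding factor, or None if none is found.
    Assumption: unfolding by f yields f replicas of utilization u / f.
    """
    util = {name: c / t for name, (c, t) in tasks.items()}
    # Unfolding never lowers total utilization, so this check is final.
    if sum(util.values()) > num_procs + EPS:
        return None
    # Start with the smallest factor that lets one replica fit on one processor.
    factors = {name: max(1, ceil(u - EPS)) for name, u in util.items()}
    while True:
        replicas = [util[n] / f for n, f in factors.items() for _ in range(f)]
        if first_fit_decreasing(replicas, num_procs) is not None:
            return factors
        # Unfold the task with the heaviest replica one step further.
        candidates = [n for n in factors if factors[n] < max_factor]
        if not candidates:
            return None  # hypothetical cap on the search reached
        heaviest = max(candidates, key=lambda n: util[n] / factors[n])
        factors[heaviest] += 1


if __name__ == "__main__":
    tasks = {"src": (2, 10), "filter": (15, 10), "sink": (3, 10)}
    print(just_enough_unfolding(tasks, num_procs=3))
    print(just_enough_unfolding(tasks, num_procs=2))
```

In this toy setting, the overloaded `filter` task is unfolded twice when three processors are available and three times when only two are available, while the other tasks are left untouched. The cap `max_factor` stands in for the kind of upper bound on the solution space that the abstract mentions; the paper's actual bounds are not reproduced here.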
