Fast Crown Scheduling Heuristics for Energy-Efficient Mapping and Scaling of Moldable Streaming Tasks on Many-Core Systems

Proceedings of the 18th International Workshop on Software and Compilers for Embedded Systems
Pub Date: 2015-01-09
DOI: 10.1145/2764967.2764975

Nicolas Melot, C. Kessler, J. Keller, Patrick Eitschberger


Citations: 12

Abstract

Effectively exploiting massively parallel architectures is a major challenge that stream programming can help address. We investigate the problem of generating energy-optimal code for a collection of streaming tasks that includes parallelizable, or moldable, tasks on a generic manycore processor with dynamic discrete frequency scaling. In this paper we consider crown scheduling, a novel technique for the combined optimization of resource allocation, mapping and discrete voltage/frequency scaling for moldable streaming task collections, with the goal of optimizing energy efficiency under a throughput constraint. We present optimal off-line algorithms for separate and integrated crown scheduling based on integer linear programming (ILP), as well as heuristics that compute solutions faster and for bigger problems. We make no restrictive assumptions about speedup behavior. Our experimental evaluation of the ILP models for a generic manycore architecture shows that, at least for small and medium-sized streaming task collections, even the integrated variant of crown scheduling can be solved to optimality by a state-of-the-art ILP solver within a few seconds. Our heuristics produce makespan and energy consumption close to optimality within the limits of the phase-separated crown scheduling technique and the crown structure. Their optimization time is longer than that of the other algorithms we test, but our heuristics consistently produce better solutions. This is an extended abstract of Melot et al., ACM Trans. Arch. Code Opt. 11(4) 2015.
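To make the optimization problem in the abstract concrete, the following is a minimal sketch for a single moldable task. It is not the crown scheduling algorithm, the ILP model, or any of the paper's heuristics. It only enumerates every (core count, frequency level) pair by brute force and keeps the cheapest one that meets a deadline. The frequency levels, core counts, work amount, deadline, efficiency table, and the cubic power model are all hypothetical values chosen for illustration; none of them comes from the paper.

```python
from itertools import product

# All constants are hypothetical and serve only to show the problem shape.
FREQS = [1.0, 2.0, 3.0]      # discrete frequency levels (arbitrary units)
CORE_COUNTS = [1, 2, 4]      # allocations allowed for the moldable task
WORK = 12.0                  # task work, in time units on one core at frequency 1.0
DEADLINE = 4.0               # throughput constraint: the task must finish within this time

# Hypothetical parallel efficiency per core count. A lookup table is used on
# purpose: no particular speedup shape is assumed.
EFFICIENCY = {1: 1.0, 2: 0.9, 4: 0.6}


def run_time(cores, freq):
    """Execution time of the task on `cores` cores running at `freq`."""
    return WORK / (cores * EFFICIENCY[cores] * freq)


def energy(cores, freq):
    """Energy under an assumed model: power per core proportional to freq**3."""
    return cores * freq ** 3 * run_time(cores, freq)


def best_configuration():
    """Return (energy, cores, freq) of the cheapest feasible pair, or None."""
    feasible = [
        (energy(c, f), c, f)
        for c, f in product(CORE_COUNTS, FREQS)
        if run_time(c, f) <= DEADLINE
    ]
    return min(feasible) if feasible else None


if __name__ == "__main__":
    print(best_configuration())
```

With these made-up numbers, the cheapest feasible choice is two cores at the middle frequency: one core only meets the deadline at the highest frequency, and four cores lose too much to the assumed efficiency drop. For a whole task collection the search space grows combinatorially, which is why exact ILP formulations and faster heuristics are both of interest.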

