Force-directed scheduling for Data Flow Graph mapping on Coarse-Grained Reconfigurable Architectures

2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14). Pub Date: 2014-12-01. DOI: 10.1109/ReConFig.2014.7032519

Alexander Fell, Z. Rákossy, A. Chattopadhyay


Citations: 22

Abstract

In terms of energy and flexibility, Coarse-Grained Reconfigurable Architectures (CGRAs) have proven advantageous over fine-grained architectures, massively parallel GPUs and generic CPUs. However, the key challenge of programmability is preventing widespread adoption. To exploit the instruction-level parallelism inherent to such architectures, optimal scheduling and mapping of algorithmic kernels is essential. Transforming an input algorithm in the form of a Data Flow Graph (DFG) into a CGRA schedule and mapping configuration is very challenging, due to the need to consider architectural details such as memory bandwidth requirements, communication patterns, pipelining and heterogeneity in order to extract maximum performance. In this paper, an algorithm is proposed that employs Force-Directed Scheduling concepts to solve such scheduling and resource minimization problems. Our heuristic extensions are flexible enough for generic heterogeneous CGRAs, making it possible to estimate the execution time of an algorithm under different configurations while maximizing the utilization of available hardware. In addition to our own experiments, we also evaluate the CGRA configurations used by state-of-the-art mapping algorithms such as EPIMap; on these, our schedule achieves optimal resource utilization and reduces the overall DFG execution time by 39% on average.
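To make the term "Force-Directed Scheduling" concrete, the following is a minimal sketch of the classic formulation known from high-level synthesis, not the paper's extended heuristic. It assumes unit-latency operations and a single resource type, and it ignores the memory bandwidth, communication and heterogeneity aspects mentioned in the abstract. All function names and the example graph are hypothetical and chosen for illustration only.

```python
def topo_order(preds):
    """Return nodes so that every predecessor precedes its successors."""
    order, seen = [], set()

    def visit(node):
        if node in seen:
            return
        seen.add(node)
        for p in preds[node]:
            visit(p)
        order.append(node)

    for node in preds:
        visit(node)
    return order


def time_frames(preds, num_steps):
    """ASAP/ALAP step per node (unit latency, steps numbered from 0).

    Assumes num_steps is at least the critical-path length.
    """
    order = topo_order(preds)
    succs = {n: [] for n in preds}
    for n, ps in preds.items():
        for p in ps:
            succs[p].append(n)
    asap = {}
    for n in order:
        asap[n] = max((asap[p] + 1 for p in preds[n]), default=0)
    alap = {}
    for n in reversed(order):
        alap[n] = min((alap[s] - 1 for s in succs[n]), default=num_steps - 1)
    return asap, alap


def distribution_graph(asap, alap, num_steps):
    """Expected number of operations per step, assuming each operation is
    equally likely to land on any step inside its [ASAP, ALAP] frame."""
    dg = [0.0] * num_steps
    for n in asap:
        width = alap[n] - asap[n] + 1
        for t in range(asap[n], alap[n] + 1):
            dg[t] += 1.0 / width
    return dg


def self_force(node, step, asap, alap, dg):
    """Self force of fixing `node` at `step`: the load at that step minus
    the average load over the node's frame. Lower values are preferable."""
    frame = range(asap[node], alap[node] + 1)
    return dg[step] - sum(dg[t] for t in frame) / len(frame)


# Hypothetical DFG: a critical chain a -> b -> c plus a shorter chain d -> e.
preds = {"a": [], "b": ["a"], "c": ["b"], "d": [], "e": ["d"]}
asap, alap = time_frames(preds, num_steps=3)
dg = distribution_graph(asap, alap, num_steps=3)
forces_d = {t: self_force("d", t, asap, alap, dg)
            for t in range(asap["d"], alap["d"] + 1)}
```

In this example the distribution graph peaks at the middle step, so `d` has a negative self force at step 0 and a positive one at step 1. A force-directed scheduler would therefore favor step 0 for `d`, which evens out resource demand across steps. A full scheduler would also account for the forces this choice induces on predecessors and successors, which the sketch omits.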

