{"title":"Operation scheduling and architecture co-synthesis for energy-efficient dataflow computations on FPGAs (abstract only)","authors":"C. Y. Lin, N. Wong, Hayden Kwok-Hay So","doi":"10.1145/2145694.2145757","DOIUrl":null,"url":null,"abstract":"Compiling high-level user applications for execution on FPGAs often involves synthesizing dataflow graphs beyond the size of the available on-chip computational resources. One way to address this is by folding the execution of the given dataflow graphs onto an array of directly connected simple configurable processing elements (CPEs). Under this scenario, the performance and energy-efficiency of the resulting system depends not only on the mapping schedule of the compute operations on the CPEs, but also on the topology of the interconnect array that connects the CPEs. This paper presents a framework in which the operation scheduler and the underlying CPE interconnect network topology are co-optimized on a per-application basis for energy-efficient FPGA computation. Given the same application, more than 2.5x difference in energy-efficiency was achievable by the use of different common regular array topologies to connect the CPEs. Moreover, by using irregular application-specific interconnect topologies derived from a genetic algorithm, up to 50% improvement in energy-delay-product was achievable when compared to the use of even the best regular topology. The use of such framework is anticipated to serve as part of a rapid high-level FPGA application compiler since minimum hardware place-and-route is needed to generate the optimal schedule and topology.","PeriodicalId":87257,"journal":{"name":"FPGA. ACM International Symposium on Field-Programmable Gate Arrays","volume":"38 8","pages":"270"},"PeriodicalIF":0.0000,"publicationDate":"2012-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"FPGA. ACM International Symposium on Field-Programmable Gate Arrays","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2145694.2145757","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1
Abstract
Compiling high-level user applications for execution on FPGAs often involves synthesizing dataflow graphs beyond the size of the available on-chip computational resources. One way to address this is by folding the execution of the given dataflow graphs onto an array of directly connected simple configurable processing elements (CPEs). Under this scenario, the performance and energy-efficiency of the resulting system depends not only on the mapping schedule of the compute operations on the CPEs, but also on the topology of the interconnect array that connects the CPEs. This paper presents a framework in which the operation scheduler and the underlying CPE interconnect network topology are co-optimized on a per-application basis for energy-efficient FPGA computation. Given the same application, more than 2.5x difference in energy-efficiency was achievable by the use of different common regular array topologies to connect the CPEs. Moreover, by using irregular application-specific interconnect topologies derived from a genetic algorithm, up to 50% improvement in energy-delay-product was achievable when compared to the use of even the best regular topology. The use of such framework is anticipated to serve as part of a rapid high-level FPGA application compiler since minimum hardware place-and-route is needed to generate the optimal schedule and topology.