Lina: Timing-Constrained High-Level Synthesis Performance Estimator for Fast DSE
A. B. Perina, J. Becker, Vanderlei Bonato
2019 International Conference on Field-Programmable Technology (ICFPT), 2019-12-01
DOI: https://doi.org/10.1109/ICFPT47387.2019.00063
Citations: 4
Abstract
The adoption of Field-Programmable Gate Arrays (FPGAs) for general use in High-Performance Computing has been limited by the complex development flow required to obtain optimised designs, coupled with time-consuming compilation. High-Level Synthesis (HLS) tools are adopted to improve programmability; however, the developer must still iterate over several optimisation schemes to achieve reasonable performance, which is tedious and non-trivial. Several works apply Design Space Exploration (DSE) over the available optimisation options, coupled with fast performance estimators that avoid the prohibitive compilation times. This paper presents Lina, an extension of the Lin-Analyzer fast performance estimator for C/C++ HLS that adds timing-constrained scheduling and an extended analysis of nested loops. Results on the PolyBench benchmark show that the average relative error dropped from 8.85% to 3.02% when loop unrolling and pipelining directives were considered. As a result, Lina is a better estimator for non-perfect loop nests and for different timing constraints, and the timing constraint can be adopted as an additional design space exploration knob.
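For readers unfamiliar with the terms, the following is a minimal sketch of a non-perfect loop nest annotated with unrolling and pipelining directives. It is not taken from the paper or from PolyBench. The function name `matvec`, the size `N`, and the Vivado-HLS-style `#pragma HLS` syntax are assumptions made for illustration, since the abstract does not name a specific HLS tool. An ordinary C compiler ignores the pragmas, so the code also runs as plain software.

```c
#define N 8  /* illustrative problem size (assumption) */

/* Matrix-vector product y = A * x, written as a non-perfect loop nest:
 * the accumulator reset and the store to y[] sit in the outer loop body,
 * outside the inner loop, so not every statement is in the innermost loop. */
void matvec(int A[N][N], int x[N], int y[N]) {
    for (int i = 0; i < N; i++) {
        int acc = 0;                 /* executed once per outer iteration */
        for (int j = 0; j < N; j++) {
            /* Assumed Vivado-HLS-style directives: partially unroll the
             * inner loop by 2 and pipeline it with initiation interval 1. */
#pragma HLS unroll factor=2
#pragma HLS pipeline II=1
            acc += A[i][j] * x[j];
        }
        y[i] = acc;                  /* executed once per outer iteration */
    }
}
```

The unroll factor, the initiation interval, and (per the abstract) the timing constraint are the kind of knobs a DSE loop would vary. The statements between the two loop levels are what makes the nest non-perfect. An estimate based only on the innermost body would not account for them.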