Optimal Binding and Port Assignment for Loop Pipelining in High-Level Synthesis

Nicolai Fiege, Patrick Sittel, P. Zipf
Published in: 2022 32nd International Conference on Field-Programmable Logic and Applications (FPL), 2022-08-01
DOI: 10.1109/FPL57034.2022.00047 (https://doi.org/10.1109/FPL57034.2022.00047)
Citations: 2

Abstract

To provide high throughput for custom hardware implementations, academic and commercial high-level synthesis (HLS) tools use loop pipelining by modulo scheduling. Given a resource allocation and a schedule, a binding algorithm can be used to reduce the number of required lifetime registers (LRs) and multiplexers (MUXs). In contrast to non-modulo schedules, no optimal solutions have been published for the binding problem of implementing modulo schedules with a minimal number of LRs and MUXs. To address this gap, we propose a novel optimal binding algorithm that uses Integer Linear Programming to simultaneously minimize MUX and LR costs for loop pipelining. We evaluated our algorithm on a set of commonly used benchmark instances from digital signal processing and report that all encountered problems could be solved, with 36.53% of the solutions being optimal within a time limit of only five minutes. Compared to worst-case evaluations, we report MUX and LR savings of up to 42.74% and 26.62%, respectively. To evaluate the impact on the resulting circuit after place and route, we studied FPGA implementations of several benchmark instances and recorded look-up table and flip-flop reductions of up to 13.70% and 5.24%, respectively, compared to previous work and, where state-of-the-art algorithms fail to find a feasible solution, to an extensive set of randomly generated bindings.
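The binding problem named in the abstract can be illustrated on a toy instance. The sketch below is not the authors' ILP formulation: it simply enumerates every binding of a tiny, made-up modulo schedule and keeps the cheapest feasible one. The operation names, start times, allocation, and the cost model (the number of distinct unit-to-port connections, used here as a rough proxy for MUX inputs) are all assumptions for illustration. Lifetime registers and port swapping for commutative operations are ignored.

```python
from itertools import product

# Hypothetical toy instance: initiation interval, operations, and data edges.
II = 2
# op name -> (resource type, scheduled start time)
OPS = {
    "m0": ("mul", 0), "m1": ("mul", 1), "m2": ("mul", 2), "m3": ("mul", 3),
    "a0": ("add", 4), "a1": ("add", 5),
}
# (producer op, consumer op, consumer input port)
EDGES = [("m0", "a0", 0), ("m1", "a0", 1), ("m2", "a1", 1), ("m3", "a1", 0)]
# resource type -> number of allocated functional units
ALLOCATION = {"mul": 2, "add": 1}


def is_feasible(binding, ops, ii):
    """Two ops may share a unit only if their start times differ modulo II."""
    used = set()
    for name, (rtype, start) in ops.items():
        slot = (rtype, binding[name], start % ii)
        if slot in used:
            return False
        used.add(slot)
    return True


def connection_cost(binding, ops, edges):
    """Count distinct (source unit, destination unit, port) connections."""
    conns = set()
    for src, dst, port in edges:
        src_unit = (ops[src][0], binding[src])
        dst_unit = (ops[dst][0], binding[dst])
        conns.add((src_unit, dst_unit, port))
    return len(conns)


def best_binding(ops, edges, allocation, ii):
    """Exhaustively search all bindings; only practical for tiny instances."""
    names = sorted(ops)
    choices = [range(allocation[ops[n][0]]) for n in names]
    best = None
    for combo in product(*choices):
        binding = dict(zip(names, combo))
        if not is_feasible(binding, ops, ii):
            continue
        cost = connection_cost(binding, ops, edges)
        if best is None or cost < best[0]:
            best = (cost, binding)
    return best


cost, binding = best_binding(OPS, EDGES, ALLOCATION, II)
print(cost, binding)
```

In this instance the feasible bindings need either two or four connections into the single adder, so the choice of binding alone halves the interconnect in the best case. Exhaustive search grows exponentially with the number of operations, which is why an exact method such as an ILP solver is needed for realistic benchmark sizes.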