Area-Efficient Memory Scheduling for Dynamically Scheduled High-Level Synthesis

2022 International Conference on Field-Programmable Technology (ICFPT). Pub Date: 2022-12-05. DOI: 10.1109/ICFPT56656.2022.9974262

Xue-Xin He, Jianyi Cheng, G. Constantinides


Abstract

In high-level synthesis (HLS), scheduling maps operations to clock cycles. It can be done either at compile time (statically) or at run time (dynamically). There has been recent interest in dynamic scheduling because it can potentially achieve better performance. The state-of-the-art dynamically scheduled HLS tool Dynamatic generates dataflow-style hardware as a netlist of pre-defined components connected by handshake signals. Memory operations are executed by a component called a load-store queue (LSQ), which supports run-time out-of-order memory accesses for high performance. However, the additional logic for the LSQ leads to significant area overhead compared to static scheduling. In this paper, we propose an area-efficient approach for scheduling memory operations at run time. We approximate the memory dependence distance by its minimal value and efficiently parallelise memory accesses in dynamically scheduled hardware. Over several benchmarks from related work, our results show that our approach achieves on average $0.2\times$ the area-delay product of the original designs using LSQs.
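To make the term "memory dependence distance" concrete, here is a minimal Python sketch. It is not the paper's method: it brute-forces the smallest read-after-write distance by replaying a loop's address pattern in software. The function name `min_dependence_distance` and the parameters `store_index`, `load_index`, and `trip_count` are hypothetical. It also assumes that each iteration performs one load followed by one store to the same array.

```python
from typing import Callable, Dict, Optional


def min_dependence_distance(
    store_index: Callable[[int], int],
    load_index: Callable[[int], int],
    trip_count: int,
) -> Optional[int]:
    """Return the smallest d >= 1 such that the load in iteration i + d
    reads an address written by the store in iteration i, or None if no
    load ever reads an address stored by an earlier iteration.

    Assumption (illustrative only): within one iteration the load runs
    before the store, as in `a[f(i)] = a[g(i)] + 1`.
    """
    last_store: Dict[int, int] = {}  # address -> latest iteration that stored to it
    best: Optional[int] = None
    for i in range(trip_count):
        addr = load_index(i)
        if addr in last_store:
            # The most recent store to this address gives the shortest
            # distance for this particular load.
            d = i - last_store[addr]
            if best is None or d < best:
                best = d
        last_store[store_index(i)] = i
    return best


# for i in range(10): a[i + 3] = a[i] + 1
affine = min_dependence_distance(lambda i: i + 3, lambda i: i, 10)

# Stores hit even addresses and loads hit odd ones, so they never alias.
disjoint = min_dependence_distance(lambda i: 2 * i, lambda i: 2 * i + 1, 10)

# Histogram-style update hist[x[i]] += 1 with a data-dependent index.
x = [5, 1, 5, 1, 1]
histogram = min_dependence_distance(lambda i: x[i], lambda i: x[i], len(x))
```

Under these assumptions, a loop whose minimal distance is d has no read-after-write conflict between iterations fewer than d apart. That is the general intuition behind using a minimal distance as a conservative bound. How the paper approximates this value and turns it into hardware is not described in the abstract.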
