Type-directed scheduling of streaming accelerators

David Durst, Matthew Feldman, Dillon Huff, David Akeley, Ross G. Daly, G. Bernstein, Marco Patrignani, K. Fatahalian, P. Hanrahan
{"title":"Type-directed scheduling of streaming accelerators","authors":"David Durst, Matthew Feldman, Dillon Huff, David Akeley, Ross G. Daly, G. Bernstein, Marco Patrignani, K. Fatahalian, P. Hanrahan","doi":"10.1145/3385412.3385983","DOIUrl":null,"url":null,"abstract":"Designing efficient, application-specialized hardware accelerators requires assessing trade-offs between a hardware module’s performance and resource requirements. To facilitate hardware design space exploration, we describe Aetherling, a system for automatically compiling data-parallel programs into statically scheduled, streaming hardware circuits. Aetherling contributes a space- and time-aware intermediate language featuring data-parallel operators that represent parallel or sequential hardware modules, and sequence data types that encode a module’s throughput by specifying when sequence elements are produced or consumed. As a result, well-typed operator composition in the space-time language corresponds to connecting hardware modules via statically scheduled, streaming interfaces. We provide rules for transforming programs written in a standard data-parallel language (that carries no information about hardware implementation) into equivalent space-time language programs. We then provide a scheduling algorithm that searches over the space of transformations to quickly generate area-efficient hardware designs that achieve a programmer-specified throughput. Using benchmarks from the image processing domain, we demonstrate that Aetherling enables rapid exploration of hardware designs with different throughput and area characteristics, and yields results that require 1.8-7.9× fewer FPGA slices than those of prior hardware generation systems.","PeriodicalId":20580,"journal":{"name":"Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2020-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"35","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3385412.3385983","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 35

Abstract

Designing efficient, application-specialized hardware accelerators requires assessing trade-offs between a hardware module’s performance and resource requirements. To facilitate hardware design space exploration, we describe Aetherling, a system for automatically compiling data-parallel programs into statically scheduled, streaming hardware circuits. Aetherling contributes a space- and time-aware intermediate language featuring data-parallel operators that represent parallel or sequential hardware modules, and sequence data types that encode a module’s throughput by specifying when sequence elements are produced or consumed. As a result, well-typed operator composition in the space-time language corresponds to connecting hardware modules via statically scheduled, streaming interfaces. We provide rules for transforming programs written in a standard data-parallel language (that carries no information about hardware implementation) into equivalent space-time language programs. We then provide a scheduling algorithm that searches over the space of transformations to quickly generate area-efficient hardware designs that achieve a programmer-specified throughput. Using benchmarks from the image processing domain, we demonstrate that Aetherling enables rapid exploration of hardware designs with different throughput and area characteristics, and yields results that require 1.8-7.9× fewer FPGA slices than those of prior hardware generation systems.
流式加速器的类型导向调度
设计高效的专用于应用程序的硬件加速器需要评估硬件模块的性能和资源需求之间的权衡。为了方便硬件设计空间的探索,我们描述了Aetherling,一个自动将数据并行程序编译成静态调度的流硬件电路的系统。Aetherling提供了一种具有空间和时间感知的中间语言,其特点是数据并行运算符(表示并行或顺序硬件模块)和序列数据类型(通过指定何时产生或使用序列元素来编码模块的吞吐量)。因此,在时空语言中,类型良好的操作符组合对应于通过静态调度的流接口连接硬件模块。我们提供了将用标准数据并行语言(不携带有关硬件实现的信息)编写的程序转换为等效时空语言程序的规则。然后,我们提供了一种调度算法,该算法搜索转换空间,以快速生成区域高效的硬件设计,从而实现程序员指定的吞吐量。使用来自图像处理领域的基准测试,我们证明了Aetherling能够快速探索具有不同吞吐量和面积特性的硬件设计,并且产生的结果比先前的硬件生成系统需要1.8-7.9倍的FPGA切片。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
文献相关原料
公司名称 产品信息 采购帮参考价格
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信