Automatic latency-optimal design of FPGA-based systolic arrays

Proceedings. 10th Annual IEEE Symposium on Field-Programmable Custom Computing Machines. Pub Date: 2002-09-22. DOI: 10.1109/FPGA.2002.1106692

J. Nash


Citations: 2

Abstract

"Systolic" algorithms have been shown to be suitable for a very large range of structured problems (e.g., linear algebra, graph theory, computational geometry, number-theoretic algorithms, string matching, sorting/searching, dynamic programming, discrete mathematics). Use of this class of architecture has not been widespread in the past, in part because programmable hardware that supported this computing paradigm was not cost-effective to build and no design tools existed. However, suitable hardware has begun to appear. Complex FPGAs now provide an adequate level of speed, density, and programmability in the form of reconfigurable computers, boards, and chips with embedded computational support. Such hardware could allow rapid implementation and modification of systolic algorithms, leading to inexpensive "programmable" systolic array hardware. Furthermore, the architectural characteristics of much FPGA hardware match those required by systolic processing, because this technology is constructed by tiling identical memory and logic blocks along with supporting mesh interconnection networks. The symbolic parallel algorithm development environment (SPADE) described here is being developed to allow a designer to easily and rapidly explore the design space of various systolic algorithm implementations so that FPGA system tradeoffs can be efficiently analyzed. The intention is to allow a user to specify an algorithm in traditional high-level code, set some architectural constraints, and then view the results in a meaningful graphical format.

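To make the terms "systolic array" and "latency" concrete, the following is a minimal sketch of a cycle-by-cycle software simulation of an output-stationary systolic array computing a matrix product. It is not SPADE and not the author's method; the function name `systolic_matmul`, the register layout, and the input skewing scheme are assumptions chosen for illustration. Each processing element (PE) talks only to its left and upper neighbors, which mirrors the tiled, mesh-connected structure the abstract attributes to FPGA hardware.

```python
def systolic_matmul(a, b):
    """Simulate an n x p grid of PEs computing a (n x m) times b (m x p).

    Illustrative assumption: values of `a` flow left to right, values of `b`
    flow top to bottom, and each PE accumulates one output element in place.
    Returns the product and the number of clock cycles the simulation ran.
    """
    n, m, p = len(a), len(b), len(b[0])
    acc = [[0] * p for _ in range(n)]        # one accumulator per PE
    a_reg = [[None] * p for _ in range(n)]   # operand moving rightwards
    b_reg = [[None] * p for _ in range(n)]   # operand moving downwards
    cycles = n + m + p - 2                   # cycle in which the last PE finishes

    for t in range(cycles):
        new_a = [[None] * p for _ in range(n)]
        new_b = [[None] * p for _ in range(n)]
        for i in range(n):
            for j in range(p):
                if j == 0:
                    # Left edge: row i of `a` enters skewed by i cycles.
                    k = t - i
                    new_a[i][j] = a[i][k] if 0 <= k < m else None
                else:
                    # Interior: take the value the left neighbor held last cycle.
                    new_a[i][j] = a_reg[i][j - 1]
                if i == 0:
                    # Top edge: column j of `b` enters skewed by j cycles.
                    k = t - j
                    new_b[i][j] = b[k][j] if 0 <= k < m else None
                else:
                    # Interior: take the value the upper neighbor held last cycle.
                    new_b[i][j] = b_reg[i - 1][j]
        a_reg, b_reg = new_a, new_b

        # Every PE performs one multiply-accumulate when both operands are valid.
        for i in range(n):
            for j in range(p):
                if a_reg[i][j] is not None and b_reg[i][j] is not None:
                    acc[i][j] += a_reg[i][j] * b_reg[i][j]

    return acc, cycles
```

For example, `systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])` returns `([[19, 22], [43, 50]], 4)`. In this particular mapping the latency is `n + m + p - 2` cycles; other mappings of the same algorithm onto an array trade latency against PE count and interconnect, which is the kind of design-space tradeoff the abstract says SPADE is meant to explore.
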


