Automatic Sliding Window Operation Optimization for FPGA-Based

2006 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines Pub Date : 2006-04-24 DOI:10.1109/FCCM.2006.29

Haiqian Yu, M. Leeser


Citations: 34

Abstract

FPGA-based computing boards are frequently used as hardware accelerators for image processing algorithms based on sliding window operations (SWOs). SWOs are both computationally intensive and data intensive and benefit from hardware acceleration with FPGAs, especially for delay-sensitive applications. The current design process requires that, for each specific application using SWOs with a different window size, image size, etc., a detailed design must be completed before a realistic estimate of the achievable speedup can be obtained. We present an automated tool, sliding window operation optimization (SWOOP), that generates the estimate of speedup for a high-performance design before detailed implementation is complete. The achievable speedup is determined by the area of the FPGA or, more often, by the memory bandwidth to the processing elements. The memory bandwidth to each processing element is a combination of bandwidth to the FPGA and the efficient use of on-chip RAM as a data cache. SWOOP uses analytic techniques to automatically determine the number of parallel processing elements to implement on the FPGA, the assignment of input and output data to on-board memory, and the organization of data in on-chip memory to most effectively keep the processing elements busy. The result is a block layout of the final design, its memory architecture, and a measure of the achievable speedup. The results, compared to manual designs, show that the estimates obtained using SWOOP are very accurate.
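
The trade-off the abstract describes, where the number of parallel processing elements is capped either by FPGA area or by memory bandwidth, can be illustrated with a small back-of-the-envelope model. The sketch below is not SWOOP's algorithm. Every name and number in it (`Board`, `Swo`, `max_pes`, the resource units, the assumption that each processing element produces one output pixel per cycle, and the simple "row buffer" cache model) is a hypothetical simplification introduced here only to show why on-chip caching can move a design from bandwidth-bound toward area-bound.

```python
from dataclasses import dataclass


@dataclass
class Board:
    """Hypothetical board description (units are arbitrary)."""
    logic_units: int          # total FPGA logic available
    mem_bytes_per_cycle: int  # off-chip bytes deliverable to the FPGA per cycle


@dataclass
class Swo:
    """Hypothetical sliding window operation description."""
    window: int           # window is window x window pixels
    bytes_per_pixel: int  # storage size of one pixel
    pe_logic_units: int   # logic cost of one processing element (PE)


def bytes_per_output(swo: Swo, row_buffered: bool) -> int:
    """Off-chip bytes fetched for each output pixel.

    Assumed model: with no on-chip cache, the whole window is re-read
    for every output. With on-chip row buffers holding the rows already
    seen, only one new pixel is fetched per output.
    """
    pixels = 1 if row_buffered else swo.window * swo.window
    return pixels * swo.bytes_per_pixel


def max_pes(board: Board, swo: Swo, row_buffered: bool) -> int:
    """Number of PEs that can be kept busy, assuming 1 output/PE/cycle."""
    area_limit = board.logic_units // swo.pe_logic_units
    bandwidth_limit = board.mem_bytes_per_cycle // bytes_per_output(swo, row_buffered)
    # The tighter of the two constraints decides the design.
    return min(area_limit, bandwidth_limit)


board = Board(logic_units=10_000, mem_bytes_per_cycle=16)
swo = Swo(window=3, bytes_per_pixel=1, pe_logic_units=500)

print(max_pes(board, swo, row_buffered=False))  # bandwidth-bound case
print(max_pes(board, swo, row_buffered=True))   # caching relaxes the bandwidth bound
```

With these made-up figures the area limit is 20 PEs in both cases, but the bandwidth limit rises from 1 PE (9 bytes fetched per output) to 16 PEs (1 byte fetched per output) once rows are cached on chip, which is the kind of effect the abstract attributes to using on-chip RAM as a data cache.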




Source journal

2006 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines

Self-citation rate

0.00%

Number of publications