2018 International Conference on Field-Programmable Technology (FPT): Latest Papers, Page 9

R3SGM: Real-Time Raster-Respecting Semi-Global Matching for Power-Constrained Systems

2018 International Conference on Field-Programmable Technology (FPT) Pub Date : 2018-10-30 DOI: 10.1109/FPT.2018.00025

Oscar Rahnama, Tommaso Cavallari, S. Golodetz, Simon Walker, Philip H. S. Torr

Abstract: Stereo depth estimation is used for many computer vision applications. Though many popular methods strive solely for depth quality, for real-time mobile applications (e.g. prosthetic glasses or micro-UAVs), speed and power efficiency are equally, if not more, important. Many real-world systems rely on Semi-Global Matching (SGM) to achieve a good accuracy vs. speed balance, but power efficiency is hard to achieve with conventional hardware, making the use of embedded devices such as FPGAs attractive for low-power applications. However, the full SGM algorithm is ill-suited to deployment on FPGAs, and so most FPGA variants of it are partial, at the expense of accuracy. In a non-FPGA context, the accuracy of SGM has been improved by More Global Matching (MGM), which also helps tackle the streaking artifacts that afflict SGM. In this paper, we propose a novel, resource-efficient method that is inspired by MGM's techniques for improving depth quality, but which can be implemented to run in real time on a low-power FPGA. Through evaluation on multiple datasets (KITTI and Middlebury), we show that in comparison to other real-time capable stereo approaches, we can achieve a state-of-the-art balance between accuracy, power efficiency and speed, making our approach highly desirable for use in real-time systems with limited power.

Citations: 27

A Scalable Pipelined Dataflow Accelerator for Object Region Proposals on FPGA Platform

2018 International Conference on Field-Programmable Technology (FPT) Pub Date : 2018-10-26 DOI: 10.1109/FPT.2018.00070

Wenzhi Fu, Jianlei Yang, Pengcheng Dai, Yiran Chen, Weisheng Zhao

Abstract: Region proposal is critical for object detection, but it usually poses a bottleneck in improving the computation efficiency on traditional control-flow architectures. We have observed that region proposal tasks are potentially suitable for pipelined parallelism by exploiting dataflow-driven acceleration. In this paper, a scalable pipelined dataflow accelerator is proposed for efficient region proposals on an FPGA platform. The accelerator processes image data in a streaming manner with three sequential stages: resizing, kernel computing and sorting. First, a Ping-Pong cache strategy is adopted for rotation loading in the resize module to guarantee continuous output streaming. Then, a multiple-pipeline architecture with tiered memory is utilized in the kernel computing module to complete the main computation tasks. Finally, a bubble-pushing heap sort method is exploited in the sorting module to find the top-k largest candidates efficiently. Our design is implemented with high-level synthesis on FPGA platforms, and experimental results on VOC2007 datasets show that it could achieve about a 3.67X speedup over a traditional desktop CPU platform and a >250X energy efficiency improvement over an embedded ARM platform.

Citations: 1

Live Migration for OpenCL FPGA Accelerators

2018 International Conference on Field-Programmable Technology (FPT) Pub Date : 2018-09-17 DOI: 10.1109/FPT.2018.00017

Anuj Vaishnav, K. Pham, Dirk Koch

Citations: 7

Large Utility Sorting on FPGAs

2018 International Conference on Field-Programmable Technology (FPT) Pub Date : 2018-09-16 DOI: 10.1109/FPT.2018.00067

Kristiyan Manev, Dirk Koch

Citations: 12