2018 International Conference on Field-Programmable Technology (FPT): Latest Papers

R3SGM: Real-Time Raster-Respecting Semi-Global Matching for Power-Constrained Systems
2018 International Conference on Field-Programmable Technology (FPT) Pub Date : 2018-10-30 DOI: 10.1109/FPT.2018.00025
Oscar Rahnama, Tommaso Cavallari, S. Golodetz, Simon Walker, Philip H. S. Torr
Abstract: Stereo depth estimation is used for many computer vision applications. Though many popular methods strive solely for depth quality, for real-time mobile applications (e.g. prosthetic glasses or micro-UAVs), speed and power efficiency are equally, if not more, important. Many real-world systems rely on Semi-Global Matching (SGM) to achieve a good accuracy vs. speed balance, but power efficiency is hard to achieve with conventional hardware, making the use of embedded devices such as FPGAs attractive for low-power applications. However, the full SGM algorithm is ill-suited to deployment on FPGAs, and so most FPGA variants of it are partial, at the expense of accuracy. In a non-FPGA context, the accuracy of SGM has been improved by More Global Matching (MGM), which also helps tackle the streaking artifacts that afflict SGM. In this paper, we propose a novel, resource-efficient method that is inspired by MGM's techniques for improving depth quality, but which can be implemented to run in real time on a low-power FPGA. Through evaluation on multiple datasets (KITTI and Middlebury), we show that in comparison to other real-time capable stereo approaches, we can achieve a state-of-the-art balance between accuracy, power efficiency and speed, making our approach highly desirable for use in real-time systems with limited power.
Citations: 27
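For readers unfamiliar with the SGM baseline that the abstract above refers to, the following is a minimal sketch of the standard single-path cost aggregation recurrence used in Semi-Global Matching. It is not the R3SGM or MGM method from the paper. The function name `aggregate_scanline` and the penalty parameters `p1` (small disparity change) and `p2` (large disparity change) are illustrative assumptions.

```python
from typing import List


def aggregate_scanline(costs: List[List[float]], p1: float, p2: float) -> List[List[float]]:
    """Aggregate matching costs along one left-to-right path (standard SGM recurrence).

    costs[x][d] is the matching cost of pixel x at disparity d.
    This is a plain-SGM illustration, not the paper's FPGA design.
    """
    n_disp = len(costs[0])
    prev = list(costs[0])          # the first pixel has no predecessor on the path
    aggregated = [prev]
    for x in range(1, len(costs)):
        prev_min = min(prev)       # best aggregated cost at the previous pixel
        cur = []
        for d in range(n_disp):
            best = prev[d]                               # same disparity: no penalty
            if d > 0:
                best = min(best, prev[d - 1] + p1)       # disparity changes by 1
            if d < n_disp - 1:
                best = min(best, prev[d + 1] + p1)
            best = min(best, prev_min + p2)              # larger disparity jump
            # Subtracting prev_min keeps the values from growing along the path.
            cur.append(costs[x][d] + best - prev_min)
        aggregated.append(cur)
        prev = cur
    return aggregated


if __name__ == "__main__":
    print(aggregate_scanline([[0, 5, 5], [5, 0, 5]], p1=1, p2=3))
```

Full SGM sums such path costs over several directions; paths that run against the raster scan order need pixels that have not yet been streamed in, which is one reason the abstract describes the full algorithm as ill-suited to FPGAs.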
A Scalable Pipelined Dataflow Accelerator for Object Region Proposals on FPGA Platform
2018 International Conference on Field-Programmable Technology (FPT) Pub Date : 2018-10-26 DOI: 10.1109/FPT.2018.00070
Wenzhi Fu, Jianlei Yang, Pengcheng Dai, Yiran Chen, Weisheng Zhao
Abstract: Region proposal is critical for object detection, but it usually poses a bottleneck in improving computational efficiency on traditional control-flow architectures. We have observed that region proposal tasks are potentially suitable for pipelined parallelism by exploiting dataflow-driven acceleration. In this paper, a scalable pipelined dataflow accelerator is proposed for efficient region proposals on an FPGA platform. The accelerator processes image data in a streaming manner with three sequential stages: resizing, kernel computing and sorting. First, a ping-pong cache strategy is adopted for rotation loading in the resize module to guarantee continuous output streaming. Then, a multiple-pipeline architecture with tiered memory is utilized in the kernel computing module to complete the main computation tasks. Finally, a bubble-pushing heap sort method is exploited in the sorting module to find the top-k largest candidates efficiently. Our design is implemented with high-level synthesis on FPGA platforms, and experimental results on the VOC2007 dataset show that it achieves a speedup of about 3.67X over a traditional desktop CPU platform and an energy-efficiency improvement of more than 250X over an embedded ARM platform.
Citations: 1
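The sorting stage in the abstract above selects the top-k largest candidates from a stream of scores. The paper's "bubble-pushing heap sort" is a hardware design whose details are not given here; as a rough software analogue, the sketch below uses a bounded min-heap from Python's standard `heapq` module instead. The function name `top_k_streaming` and the sample scores are hypothetical.

```python
import heapq
from typing import Iterable, List


def top_k_streaming(scores: Iterable[float], k: int) -> List[float]:
    """Return the k largest scores seen in a stream, largest first.

    Keeps a min-heap of at most k items, so each incoming score costs
    O(log k) work. A software stand-in, not the paper's sorting module.
    """
    if k <= 0:
        return []
    heap: List[float] = []
    for s in scores:
        if len(heap) < k:
            heapq.heappush(heap, s)
        elif s > heap[0]:
            # The root is the smallest of the current top k; replace it.
            heapq.heapreplace(heap, s)
    return sorted(heap, reverse=True)


if __name__ == "__main__":
    print(top_k_streaming([0.3, 0.9, 0.1, 0.7, 0.5], k=3))
```

Because only k items are held at any time, memory use is independent of the stream length, which is the property that makes this style of selection attractive for streaming pipelines.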
Live Migration for OpenCL FPGA Accelerators
2018 International Conference on Field-Programmable Technology (FPT) Pub Date : 2018-09-17 DOI: 10.1109/FPT.2018.00017
Anuj Vaishnav, K. Pham, Dirk Koch
Abstract: FPGAs are currently being deployed at a large scale across data-centres for various applications because of their performance and power benefits. In particular, cloud service operators are now offering FPGAs as a Service. However, to completely integrate FPGAs in a data-centre environment like standard software systems, support for fault tolerance and task migration is essential. In this paper, we propose a live migration technique for FPGA accelerators to provide support for fault tolerance, system maintenance, and resource management. Our technique allows migration of OpenCL accelerators not only within a single FPGA but also across FPGAs with zero downtime. It achieves this by overlapping the computation with data movements, transparently to the user, for OpenCL kernels. Moreover, distributed check-pointing mechanisms can be employed to recover from unknown faults with minimal loss of completed work. Altogether, it enables system updates such as changing the static FPGA configuration or upgrading the OS without an interruption of service.
Citations: 7
Large Utility Sorting on FPGAs
2018 International Conference on Field-Programmable Technology (FPT) Pub Date : 2018-09-16 DOI: 10.1109/FPT.2018.00067
Kristiyan Manev, Dirk Koch
Abstract: This paper presents a merge sorter capable of merging thousands of streams in a single run, where the logic cost scales logarithmically with the number of streams merged. Moreover, we apply several performance tuning techniques, including speculative execution, deep pipelining and optimized communication schemes between processing elements. An end-to-end case study utilizing a Xilinx VC709 board merges 2048 sequences of the Graysort benchmark between two DRAMs at 9.5 GB/s, or 1024 sequences at 10.3 GB/s effective throughput.
Citations: 12
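To illustrate what merging many sorted streams in a single run means, here is a minimal software sketch that arranges two-input mergers in a binary tree, so that each element passes through roughly log2(k) mergers for k input streams. This is only an illustration of multiway merging in general; it does not reproduce the paper's hardware architecture or its logic-cost scaling. The names `merge_two` and `merge_tree` are hypothetical.

```python
from typing import Iterable, Iterator, List

_END = object()  # sentinel marking an exhausted stream


def merge_two(a: Iterable, b: Iterable) -> Iterator:
    """Lazily merge two sorted streams into one sorted stream."""
    a, b = iter(a), iter(b)
    x, y = next(a, _END), next(b, _END)
    while x is not _END and y is not _END:
        if y < x:
            yield y
            y = next(b, _END)
        else:
            yield x
            x = next(a, _END)
    while x is not _END:   # drain whatever is left in a
        yield x
        x = next(a, _END)
    while y is not _END:   # drain whatever is left in b
        yield y
        y = next(b, _END)


def merge_tree(streams: Iterable[Iterable]) -> Iterator:
    """Merge k sorted streams using a binary tree of two-input mergers."""
    level: List[Iterable] = list(streams)
    if not level:
        return iter(())
    while len(level) > 1:
        # Pair up neighbouring streams; an odd one out moves up unchanged.
        nxt: List[Iterable] = [
            merge_two(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)
        ]
        if len(level) % 2:
            nxt.append(level[-1])
        level = nxt
    return iter(level[0])


if __name__ == "__main__":
    print(list(merge_tree([[1, 4, 7], [2, 5], [0, 9], [3]])))
```

In everyday Python code, the standard-library function `heapq.merge` produces the same merged order with a single heap.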