A 39pJ/label 1920x1080 165.7 FPS Block PatchMatch Based Stereo Matching Processor on FPGA

Hongyu Wang, Weiti Zhou, Xiangyu Zhang, Xin Lou
{"title":"A 39pJ/label 1920x1080 165.7 FPS Block PatchMatch Based Stereo Matching Processor on FPGA","authors":"Hongyu Wang, Weiti Zhou, Xiangyu Zhang, Xin Lou","doi":"10.1109/CICC53496.2022.9772830","DOIUrl":null,"url":null,"abstract":"Depth is a fundamental information for lots of computer vision applications. Stereo matching is a commonly used depth estimation method that mimics the human binocular vision system. Most stereo matching systems suffer when attempt to strike a balance between accuracy and computational complexity because of two reasons: 1) It is usually assumed that all the surfaces are fronto-parallel, meaning that neighbors share the same disparity. But a lot of real-world situations are slant surfaces like roads and walls. 2) Conventional methods usually utilize winner takes all (WTA) strategy [1]–[4], where aggregated costs in all disparity levels (usually 128 or 256) must be calculated. But there is only one true guess for each pixel position, such that most of the calculated costs are meaningless. To solve the above issues, in this work, a block PatchMatch-based FPGA accelerator for stereo matching is proposed. PatchMatch introduces random search strategy and slant label, where rectangle superpixels called blocks are used as the basic computing element. Main improvements of this work are: 1) Utilized random search strategy and block level computation can save massive computation. 2) Closer-to-reality slant label improves accuracy. Moreover, plane slant is also helpful for following tasks like 3D reconstruction [5], but none of the existing hardware accelerators can provide this information. 3) Algorithm-hardware co-optimized 6-points Census feature and multi-scale propagation (MSP) are proposed. 4) Based on the testing results on industrial-level KITTI dataset, the real-time performance and energy efficiency of the proposed design outperform state-of-the-art FPGA and ASIC designs, with comparable accuracy.","PeriodicalId":415990,"journal":{"name":"2022 IEEE Custom Integrated Circuits Conference (CICC)","volume":"205 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE Custom Integrated Circuits Conference (CICC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CICC53496.2022.9772830","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

Depth is a fundamental information for lots of computer vision applications. Stereo matching is a commonly used depth estimation method that mimics the human binocular vision system. Most stereo matching systems suffer when attempt to strike a balance between accuracy and computational complexity because of two reasons: 1) It is usually assumed that all the surfaces are fronto-parallel, meaning that neighbors share the same disparity. But a lot of real-world situations are slant surfaces like roads and walls. 2) Conventional methods usually utilize winner takes all (WTA) strategy [1]–[4], where aggregated costs in all disparity levels (usually 128 or 256) must be calculated. But there is only one true guess for each pixel position, such that most of the calculated costs are meaningless. To solve the above issues, in this work, a block PatchMatch-based FPGA accelerator for stereo matching is proposed. PatchMatch introduces random search strategy and slant label, where rectangle superpixels called blocks are used as the basic computing element. Main improvements of this work are: 1) Utilized random search strategy and block level computation can save massive computation. 2) Closer-to-reality slant label improves accuracy. Moreover, plane slant is also helpful for following tasks like 3D reconstruction [5], but none of the existing hardware accelerators can provide this information. 3) Algorithm-hardware co-optimized 6-points Census feature and multi-scale propagation (MSP) are proposed. 4) Based on the testing results on industrial-level KITTI dataset, the real-time performance and energy efficiency of the proposed design outperform state-of-the-art FPGA and ASIC designs, with comparable accuracy.
基于FPGA的39pJ/label 1920x1080 165.7 FPS块PatchMatch立体匹配处理器
深度是许多计算机视觉应用的基础信息。立体匹配是一种模拟人类双眼视觉系统的常用深度估计方法。大多数立体匹配系统在试图在精度和计算复杂性之间取得平衡时都会受到影响,原因有两个:1)通常假设所有的表面都是正面平行的,这意味着相邻的表面具有相同的差异。但现实世界中很多情况都是倾斜的表面,比如道路和墙壁。2)传统方法通常采用赢家通吃(WTA)策略[1]-[4],必须计算所有视差水平(通常为128或256)的总成本。但是对于每个像素位置只有一个正确的猜测,因此大多数计算的成本都是无意义的。为了解决上述问题,本文提出了一种基于块patchmatch的FPGA立体匹配加速器。PatchMatch引入了随机搜索策略和倾斜标签,其中使用矩形超像素块作为基本计算元素。本工作的主要改进在于:1)利用随机搜索策略和块级计算节省了大量的计算量。2)更接近现实的倾斜标签提高了准确性。此外,平面倾斜也有助于后续的任务,如3D重建[5],但现有的硬件加速器都不能提供这一信息。3)提出了算法硬件协同优化的6点普查特征和多尺度传播(MSP)。4)基于工业级KITTI数据集的测试结果,所提设计的实时性和能效优于当前最先进的FPGA和ASIC设计,且精度相当。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信