Multi-level pyramid fusion for efficient stereo matching

IF 3.5 3区 计算机科学 Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS
Jiaqi Zhu, Bin Li, Xinhua Zhao
{"title":"Multi-level pyramid fusion for efficient stereo matching","authors":"Jiaqi Zhu, Bin Li, Xinhua Zhao","doi":"10.1007/s00530-024-01419-4","DOIUrl":null,"url":null,"abstract":"<p>Stereo matching is a key technology for many autonomous driving and robotics applications. Recently, methods based on Convolutional Neural Network have achieved huge progress. However, it is still difficult to find accurate matching points in inherently ill-posed regions such as areas with weak texture and reflective surfaces. In this paper, we propose a multi-level pyramid fusion volume (MPFV-Stereo) which contains two prominent components: multi-scale cost volume (MSCV) and multi-level cost volume (MLCV). We also design a low-parameter Gaussian attention module to excite cost volume. Our MPFV-Stereo ranks 2nd on KITTI 2012 (Reflective) among all published methods. In addition, MPFV-Stereo has competitive results on both Scene Flow and KITTI datasets and requires less training to achieve strong cross-dataset generalization on Middlebury and ETH3D benchmark.</p>","PeriodicalId":51138,"journal":{"name":"Multimedia Systems","volume":"56 1","pages":""},"PeriodicalIF":3.5000,"publicationDate":"2024-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Multimedia Systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s00530-024-01419-4","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0

Abstract

Stereo matching is a key technology for many autonomous driving and robotics applications. Recently, methods based on Convolutional Neural Network have achieved huge progress. However, it is still difficult to find accurate matching points in inherently ill-posed regions such as areas with weak texture and reflective surfaces. In this paper, we propose a multi-level pyramid fusion volume (MPFV-Stereo) which contains two prominent components: multi-scale cost volume (MSCV) and multi-level cost volume (MLCV). We also design a low-parameter Gaussian attention module to excite cost volume. Our MPFV-Stereo ranks 2nd on KITTI 2012 (Reflective) among all published methods. In addition, MPFV-Stereo has competitive results on both Scene Flow and KITTI datasets and requires less training to achieve strong cross-dataset generalization on Middlebury and ETH3D benchmark.

Abstract Image

多级金字塔融合实现高效立体匹配
立体匹配是许多自动驾驶和机器人应用的关键技术。最近,基于卷积神经网络的方法取得了巨大进步。然而,在纹理薄弱区域和反光表面等固有问题区域,仍然很难找到精确的匹配点。在本文中,我们提出了一种多层次金字塔融合体(MPFV-Stereo),它包含两个重要组成部分:多尺度成本体(MSCV)和多层次成本体(MLCV)。我们还设计了一个低参数高斯注意模块来激发成本体积。我们的 MPFV-Stereo 在 2012 年 KITTI(反思)上,在所有已发布的方法中排名第二。此外,MPFV-Stereo 在 Scene Flow 和 KITTI 数据集上的结果也很有竞争力,而且在 Middlebury 和 ETH3D 基准上,只需较少的训练即可实现较强的跨数据集泛化。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
Multimedia Systems
Multimedia Systems 工程技术-计算机:理论方法
CiteScore
5.40
自引率
7.70%
发文量
148
审稿时长
4.5 months
期刊介绍: This journal details innovative research ideas, emerging technologies, state-of-the-art methods and tools in all aspects of multimedia computing, communication, storage, and applications. It features theoretical, experimental, and survey articles.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信