High frequency domain enhancement and channel attention module for multi-view stereo

Impact factor 4.0 · Zone 3, Computer Science · JCR Q1 (COMPUTER SCIENCE, HARDWARE & ARCHITECTURE)
Yongjuan Yang, Jie Cao, Hong Zhao, Zhaobin Chang, Weijie Wang
Journal: Computers & Electrical Engineering, volume 121, Article 109855
Published: 2024-11-20
DOI: 10.1016/j.compeleceng.2024.109855
URL: https://www.sciencedirect.com/science/article/pii/S0045790624007821
Citations: 0

Abstract

Deep-learning-based multi-view stereo is an increasingly popular method for 3D reconstruction. Existing methods have made significant advances in pixel-level depth estimation. However, occlusions and non-Lambertian surfaces in images hinder accurate confidence estimation, and cost volume regularization often over-smooths object boundaries. To tackle these challenges, we propose integrating a High Frequency Information Compensator and a 3D Channel Attention Module into a multi-view stereo network, termed HFCA-MVS. First, in the feature volume aggregation stage, we introduce a high-frequency information compensator module to enhance the correlation between 2D semantics and 3D space. Next, in the cost volume regularization stage, a 3D channel attention module enhances the representation of channel features by capturing relationships among different channels. Finally, the 3D CNN employs the GELU activation function to boost the activation response and mitigate excessive smoothing of object boundaries. HFCA-MVS demonstrates competitive 3D reconstruction performance on three benchmark datasets: DTU, BlendMVS, and Tanks&Temples. In particular, on the DTU benchmark, HFCA-MVS achieves relative improvements in completeness of 33%, 6.5%, and 0.4% over CasMVSNet, MVSTER, and Geo-MVSNet, respectively, and improvements in overall performance of 15% and 4.2% over CasMVSNet and MVSTER. Furthermore, our model yields reconstruction results comparable to existing models on the Tanks&Temples dataset.
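The abstract does not specify how the 3D channel attention module is built, so the following is only a minimal sketch of the general idea, assuming a squeeze-and-excitation-style design: pool each channel of the cost volume to a single value, pass the pooled vector through a small bottleneck MLP with GELU, and rescale every channel by the resulting gate. The function names, the reduction ratio `R`, the random weights, and the flattened volume layout are all hypothetical choices for illustration, not the authors' implementation.

```python
import math
import random


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention_3d(volume, w1, w2):
    """Reweight the channels of a cost volume (hypothetical sketch).

    volume: list of C channels, each a flat list of D*H*W floats.
    w1:     (C // R) x C weight matrix of the bottleneck layer.
    w2:     C x (C // R) weight matrix of the expansion layer.
    Returns the rescaled volume and the per-channel gates.
    """
    # Squeeze: one global average per channel over all voxels.
    pooled = [sum(ch) / len(ch) for ch in volume]
    # Excite: bottleneck MLP with GELU, then sigmoid gates in (0, 1).
    hidden = [gelu(sum(w * p for w, p in zip(row, pooled))) for row in w1]
    gates = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w2]
    # Scale: multiply every voxel of a channel by that channel's gate.
    scaled = [[g * v for v in ch] for g, ch in zip(gates, volume)]
    return scaled, gates


# Toy example: 8 channels, reduction ratio 4, a 2 x 3 x 3 volume per channel.
random.seed(0)
C, R, VOXELS = 8, 4, 2 * 3 * 3
volume = [[random.gauss(0, 1) for _ in range(VOXELS)] for _ in range(C)]
w1 = [[random.gauss(0, 0.5) for _ in range(C)] for _ in range(C // R)]
w2 = [[random.gauss(0, 0.5) for _ in range(C // R)] for _ in range(C)]

scaled, gates = channel_attention_3d(volume, w1, w2)
print([round(g, 3) for g in gates])
```

Each gate depends on the pooled statistics of all channels, which is one simple way to capture relationships among channels. In a trained network the weights would be learned rather than sampled at random.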
Source journal
Computers & Electrical Engineering (Engineering & Technology; Engineering: Electrical & Electronic)
CiteScore: 9.20
Self-citation rate: 7.00%
Articles published: 661
Review time: 47 days
About the journal: The impact of computers has nowhere been more revolutionary than in electrical engineering. The design, analysis, and operation of electrical and electronic systems are now dominated by computers, a transformation that has been motivated by the natural ease of interface between computers and electrical systems, and the promise of spectacular improvements in speed and efficiency. Published since 1973, Computers & Electrical Engineering provides rapid publication of topical research into the integration of computer technology and computational techniques with electrical and electronic systems. The journal publishes papers featuring novel implementations of computers and computational techniques in areas like signal and image processing, high-performance computing, parallel processing, and communications. Special attention will be paid to papers describing innovative architectures, algorithms, and software tools.