Multi-view stereo network with point attention

IF 3.4 | Zone 2 (Computer Science) | JCR Q2 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)

Applied Intelligence Pub Date : 2023-08-26 DOI:10.1007/s10489-023-04806-y

Rong Zhao, Zhuoer Gu, Xie Han, Ligang He, Fusheng Sun, Shichao Jiao

Applied Intelligence, volume 53, issue 22, pages 26622 - 26636. Journal Article. https://link.springer.com/article/10.1007/s10489-023-04806-y

Citations: 0

Abstract


Multi-view stereo network with point attention

In recent years, learning-based multi-view stereo (MVS) reconstruction has gained an advantage over traditional methods. In this paper, we introduce a novel point-attention network that applies an attention mechanism to the point cloud structure. During reconstruction, the attention mechanism guides the network to pay more attention to complex areas such as thin structures and low-texture surfaces. We first infer a coarse depth map using a modified classical deep MVS framework and convert it into the corresponding point cloud. Then, we add the high-frequency features of the raw images, together with features at different resolutions, to the point cloud. Finally, our network uses the attention mechanism to guide the weight distribution of points in different dimensions and iteratively computes the depth displacement of each point as the depth residual, which is added to the coarse depth prediction to obtain the final high-resolution depth map. Experimental results on the DTU dataset and the Tanks and Temples dataset show that the proposed point-attention architecture achieves a significant improvement in some scenes without reasonable geometrical assumptions, suggesting that the method has strong generalization ability.
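The abstract gives no implementation details, so the following is only a minimal NumPy sketch of the two generic steps it names: converting a coarse depth map into a point cloud, and iteratively adding a per-point depth residual. It is not the authors' network. The pinhole intrinsics `K`, the candidate `offsets`, and the `score_fn` callable (a stand-in for whatever a learned attention module would output) are all assumptions made for illustration.

```python
import numpy as np


def depth_to_points(depth, K):
    """Unproject an (H, W) depth map into an (H*W, 3) camera-frame point cloud.

    Assumes a pinhole camera with 3x3 intrinsics K (hypothetical input).
    """
    h, w = depth.shape
    # Pixel grid: u indexes columns (x), v indexes rows (y).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pixels = np.stack([u, v, np.ones_like(u)], axis=-1)
    pixels = pixels.reshape(-1, 3).astype(np.float64)
    # Back-project each pixel to a ray with z = 1, then scale by its depth.
    rays = pixels @ np.linalg.inv(K).T
    return rays * depth.reshape(-1, 1)


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention_residual(scores, offsets):
    """Softmax-weighted (expected) depth displacement per point.

    scores:  (N, M) unnormalised scores, one per candidate offset.
    offsets: (M,) candidate depth displacements.
    """
    return softmax(scores, axis=1) @ offsets


def refine_depth(coarse_depth, score_fn, offsets, iterations=2):
    """Add a depth residual to the coarse depth map, `iterations` times."""
    depth = coarse_depth.astype(np.float64).copy()
    for _ in range(iterations):
        scores = score_fn(depth.reshape(-1))            # (N, M)
        residual = attention_residual(scores, offsets)  # (N,)
        depth += residual.reshape(depth.shape)
    return depth


# Toy usage: a flat coarse depth map of 2.0 refined towards a true depth of 2.5.
K = np.array([[100.0, 0.0, 2.0],
              [0.0, 100.0, 1.0],
              [0.0, 0.0, 1.0]])
coarse = np.full((3, 4), 2.0)
points = depth_to_points(coarse, K)

offsets = np.array([-0.5, 0.0, 0.5])
target = 2.5  # hypothetical ground-truth depth, used only by the toy scorer


def toy_score_fn(d):
    # Scores peak at the offset closest to the remaining depth error.
    return -((target - d)[:, None] - offsets[None, :]) ** 2 / 0.01


refined = refine_depth(coarse, toy_score_fn, offsets, iterations=2)
```

In this sketch the residual is an expectation over a fixed set of candidate displacements. That is one common way to make the update differentiable, chosen here for simplicity and not taken from the paper.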


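Source journal: Applied Intelligence (Engineering Technology - Computer Science: Artificial Intelligence)

CiteScore: 6.60

Self-citation rate: 20.80%

Articles published: 1361

Review time: 5.9 months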

Journal description: With a focus on research in artificial intelligence and neural networks, this journal addresses issues involving solutions of real-life manufacturing, defense, management, government and industrial problems that are too complex to be solved through conventional approaches and that require the simulation of intelligent thought processes, heuristics, applications of knowledge, and distributed and parallel processing. The integration of these multiple approaches in solving complex problems is of particular importance. The journal presents new and original research and technological developments, addressing real and complex issues applicable to difficult problems. It provides a medium for exchanging scientific research and technological achievements accomplished by the international community.