A Multi-view Matching Method Based on PatchmatchNet with Sparse Point Information

Proceedings of the 4th World Symposium on Software Engineering. Pub Date: 2022-09-28. DOI: 10.1145/3568364.3568366

Y. Yang, Huarong Xu, Lifen Weng



Abstract

Learning-based multi-view stereo (MVS) has become an active research topic in 3D reconstruction. Deep learning extracts more robust semantic image features and adapts better to weakly textured scenes and scenes with non-diffuse reflection. However, current deep learning methods focus mainly on reconstruction quality, and we believe that reducing depth estimation time and GPU memory consumption is equally important. This paper therefore proposes S-PatchmatchNet, which achieves higher accuracy and efficiency. First, in the initial stage of depth estimation, we use COLMAP to obtain sparse points and generate initial depth information through triangulation and interpolation. This replaces the random initialization of PatchmatchNet, which removes the time spent generating random depths and improves computational efficiency. Second, we design an effective data augmentation mechanism: a mask one third the size of the image is used to randomly erase image data. Minimizing the error between the predictions on the augmented data and the ground truth regularizes the predictions, makes the model more robust, and improves the accuracy of depth prediction. We evaluate our method on the DTU dataset. Compared with PatchmatchNet, our approach improves both efficiency and accuracy to varying degrees. We also obtain competitive results on the challenging Tanks and Temples dataset.
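The two ideas in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation, and every name in it (`project_sparse_depth`, `init_depth_map`, `random_erase`, `divisor`) is hypothetical. It assumes the sparse points are already available as 3D coordinates, for example exported from COLMAP, together with a pinhole camera `K`, `R`, `t`. The abstract does not say which interpolation is used, so inverse-distance weighting stands in for it here. The mask is assumed to have sides one third of the image sides, since the abstract does not say whether "1/3 of the size" refers to side length or area.

```python
import numpy as np


def project_sparse_depth(points_world, K, R, t):
    """Project 3D points (N, 3) into one view.

    Returns pixel coordinates (M, 2) as (x, y) and depths (M,),
    keeping only points in front of the camera.
    """
    cam = points_world @ R.T + t           # world frame -> camera frame
    keep = cam[:, 2] > 0                   # drop points behind the camera
    cam = cam[keep]
    uvw = cam @ K.T                        # pinhole projection
    uv = uvw[:, :2] / uvw[:, 2:3]
    return uv, cam[:, 2]


def init_depth_map(uv, depth, height, width, power=2.0, eps=1e-6):
    """Dense initial depth from sparse samples.

    Assumption: inverse-distance weighting is used as a simple stand-in
    for the interpolation step, which the abstract does not specify.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    grid = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(np.float64)
    # Distance from every pixel to every sparse sample: shape (H*W, M).
    dist = np.linalg.norm(grid[:, None, :] - uv[None, :, :], axis=2)
    weights = 1.0 / (dist + eps) ** power
    dense = (weights * depth[None, :]).sum(axis=1) / weights.sum(axis=1)
    return dense.reshape(height, width)


def random_erase(image, rng, divisor=3):
    """Zero out a randomly placed rectangle.

    Assumption: each side of the mask is 1/divisor of the image side.
    Returns the erased copy and the box as (top, left, mask_h, mask_w).
    """
    h, w = image.shape[:2]
    mask_h, mask_w = max(1, h // divisor), max(1, w // divisor)
    top = int(rng.integers(0, h - mask_h + 1))
    left = int(rng.integers(0, w - mask_w + 1))
    out = image.copy()                     # leave the input untouched
    out[top:top + mask_h, left:left + mask_w] = 0
    return out, (top, left, mask_h, mask_w)
```

In a training loop, the erased image would be fed to the network and its prediction compared against the ground-truth depth, while the interpolated depth map would take the place of a randomly sampled initial depth hypothesis. The all-pairs distance matrix in `init_depth_map` is only practical for small images and few points; a real pipeline would need a more scalable interpolation.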

