A Multi-view Matching Method Based on PatchmatchNet with Sparse Point Information
Y. Yang, Huarong Xu, Lifen Weng
Proceedings of the 4th World Symposium on Software Engineering, 2022-09-28
DOI: 10.1145/3568364.3568366 (https://doi.org/10.1145/3568364.3568366)
Citations: 0
Abstract
Learning-based multi-view stereo (MVS) has become a research hotspot in 3D reconstruction. Deep learning can extract more robust semantic features from images and adapts better to scenes with weak texture and non-diffuse reflection. However, current deep learning methods focus mainly on improving reconstruction quality, and we believe that reducing depth estimation time and GPU memory consumption is equally important. This paper therefore proposes S-PatchmatchNet, which achieves higher accuracy and efficiency. First, in the initial stage of depth estimation, we use COLMAP to obtain sparse points and generate initial depth information through triangulation and interpolation. This replaces the random initialization of PatchmatchNet, which reduces the time spent on random depth initialization and improves computational efficiency. Second, we design an effective data augmentation mechanism: a mask 1/3 of the size of the image is used to randomly erase data from the image. By minimizing the error between the prediction on the augmented data and the ground truth, the predictions are regularized, the robustness of the model is enhanced, and the accuracy of depth prediction is improved. To validate our method, we conducted tests on the DTU dataset. Compared to PatchmatchNet, both the efficiency and the accuracy of our approach improve to varying degrees. We also obtain competitive results on the challenging Tanks and Temples dataset.
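The random-erasing augmentation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `random_erase`, the fill value, and the nested-list image layout are hypothetical. The abstract does not say whether "1/3 of the size" refers to area or side length, so the sketch assumes the mask covers about one third of the image area while keeping the image's aspect ratio.

```python
import math
import random


def random_erase(image, area_fraction=1.0 / 3.0, fill=0.0, rng=None):
    """Return a copy of `image` with one random rectangle set to `fill`.

    `image` is a list of rows (H x W) of numbers. The mask keeps the
    image's aspect ratio and covers roughly `area_fraction` of its area
    (an assumption; the abstract only says "1/3 of the size").
    """
    rng = rng or random.Random()
    height, width = len(image), len(image[0])

    # Scale both sides by sqrt(fraction) so the area scales by `fraction`.
    scale = math.sqrt(area_fraction)
    mask_h = max(1, round(height * scale))
    mask_w = max(1, round(width * scale))

    # Pick a top-left corner so the mask lies fully inside the image.
    top = rng.randint(0, height - mask_h)
    left = rng.randint(0, width - mask_w)

    # Copy the rows so the input image is left untouched.
    erased = [row[:] for row in image]
    for r in range(top, top + mask_h):
        for c in range(left, left + mask_w):
            erased[r][c] = fill
    return erased, (top, left, mask_h, mask_w)


# Usage: erase part of a 9 x 12 image filled with ones.
image = [[1.0] * 12 for _ in range(9)]
erased, box = random_erase(image, rng=random.Random(0))
```

During training, the erased image would be fed to the network and the predicted depth compared against the ground truth of the original view, so the model learns to infer depth in the occluded region from context. A real pipeline would more likely apply the mask to image tensors than to nested lists.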