Self-Supervised Depth Estimation Based on the Consistency of Synthetic-real Image Prediction
Authors: Wei Tong, Yubing Gao, E. Wu, Limin Zhu
Published in: 2023 International Conference on Advanced Robotics and Mechatronics (ICARM), 2023-07-08
DOI: 10.1109/ICARM58088.2023.10218857 (https://doi.org/10.1109/ICARM58088.2023.10218857)
Citations: 0
Abstract
Learning-based multi-view stereo (MVS) aims to reconstruct a real scene from multiple images with overlapping regions. Mainstream self-supervised MVS methods train the model on the assumption that a spatial point has the same color when observed from different viewpoints. To further suppress interference from specular reflection and illumination noise, this work proposes a self-supervised MVS network based on the consistency of synthetic-real image prediction. The network first refines the depth map gradually in a coarse-to-fine manner, and the source images are projected to the reference view to generate a synthesized reference image. The synthesized image and the real source images are then fed back into the network to form a cyclic network, and a consistency constraint between the predictions of the two passes is introduced to improve the robustness of the self-supervised MVS model to color interference. Comprehensive experiments on a public dataset show that the proposed method further improves the reconstruction performance of the benchmark model, which verifies its effectiveness.
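The two ingredients named in the abstract, view synthesis from a predicted depth map and a consistency term between two predictions, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the shared intrinsics matrix `K`, the nearest-neighbour sampling, and the L1 form of both losses are assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def synthesize_reference(src_img, depth_ref, K, R, t):
    """Warp a source image into the reference view (hypothetical helper).

    Assumptions: both cameras share the intrinsics K; (R, t) maps
    reference-camera coordinates to source-camera coordinates; sampling is
    nearest-neighbour for brevity (training would need a differentiable
    sampler). Returns the synthesized image and a boolean validity mask.
    """
    h, w = depth_ref.shape
    v, u = np.mgrid[0:h, 0:w]
    pix = np.stack([u, v, np.ones_like(u)], axis=0).reshape(3, -1).astype(np.float64)
    # Back-project reference pixels to 3D, then move them into the source frame.
    pts_ref = (np.linalg.inv(K) @ pix) * depth_ref.reshape(1, -1)
    pts_src = R @ pts_ref + t.reshape(3, 1)
    # Project into the source image plane.
    proj = K @ pts_src
    z = proj[2]
    safe_z = np.where(z > 1e-8, z, 1.0)
    us = np.rint(proj[0] / safe_z).astype(int)
    vs = np.rint(proj[1] / safe_z).astype(int)
    valid = (z > 1e-8) & (us >= 0) & (us < w) & (vs >= 0) & (vs < h)
    # Pixels that project outside the source image stay zero and are masked out.
    out = np.zeros((h * w,) + src_img.shape[2:], dtype=src_img.dtype)
    out[valid] = src_img[vs[valid], us[valid]]
    return out.reshape((h, w) + src_img.shape[2:]), valid.reshape(h, w)

def photometric_loss(ref_img, synth_img, mask):
    """Masked mean absolute color difference (assumed L1 form)."""
    diff = np.abs(ref_img.astype(np.float64) - synth_img.astype(np.float64))
    if diff.ndim == 3:
        diff = diff.mean(axis=2)
    return float((diff * mask).sum() / max(mask.sum(), 1))

def depth_consistency_loss(depth_real, depth_synth, mask):
    """Masked mean absolute difference between the depth predicted from the
    real reference image and the depth predicted from the synthesized one
    (assumed L1 form of the consistency constraint)."""
    diff = np.abs(depth_real - depth_synth)
    return float((diff * mask).sum() / max(mask.sum(), 1))
```

In a training loop, the first pass would predict a depth map from the real images, `synthesize_reference` would build the synthesized reference image, and a second pass on that image together with the real source images would produce the second depth map passed to `depth_consistency_loss`. How the two terms are weighted against each other is a further assumption the abstract leaves open.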