{"title":"改进的自监督深度预测点变换方法","authors":"Ziwen Chen, Zixuan Guo, Jerod J. Weinman","doi":"10.1109/CRV52889.2021.00023","DOIUrl":null,"url":null,"abstract":"Given stereo or egomotion image pairs, a popular and successful method for unsupervised learning of monocular depth estimation is to measure the quality of image reconstructions resulting from the learned depth predictions. Continued research has improved the overall approach in recent years, yet the common framework still suffers from several important limitations, particularly when dealing with points occluded after transformation to a novel viewpoint. While prior work has addressed the problem heuristically, this paper introduces a z-buffering algorithm that correctly and efficiently handles occluded points. Because our algorithm is implemented with operators typical of machine learning libraries, it can be incorporated into any existing unsupervised depth learning framework with automatic support for differentiation. Additionally, because points having negative depth after transformation often signify erroneously shallow depth predictions, we introduce a loss function to explicitly penalize this undesirable behavior. Experimental results on the KITTI data set show that the z-buffer and negative depth loss both improve the performance of a state of the art depth-prediction network. The code is available at https://github.com/arthurhero/ZbuffDepth and archived at https://hdl.handle.net/11084/10450.","PeriodicalId":413697,"journal":{"name":"2021 18th Conference on Robots and Vision (CRV)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Improved Point Transformation Methods For Self-Supervised Depth Prediction\",\"authors\":\"Ziwen Chen, Zixuan Guo, Jerod J. Weinman\",\"doi\":\"10.1109/CRV52889.2021.00023\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Given stereo or egomotion image pairs, a popular and successful method for unsupervised learning of monocular depth estimation is to measure the quality of image reconstructions resulting from the learned depth predictions. Continued research has improved the overall approach in recent years, yet the common framework still suffers from several important limitations, particularly when dealing with points occluded after transformation to a novel viewpoint. While prior work has addressed the problem heuristically, this paper introduces a z-buffering algorithm that correctly and efficiently handles occluded points. Because our algorithm is implemented with operators typical of machine learning libraries, it can be incorporated into any existing unsupervised depth learning framework with automatic support for differentiation. Additionally, because points having negative depth after transformation often signify erroneously shallow depth predictions, we introduce a loss function to explicitly penalize this undesirable behavior. Experimental results on the KITTI data set show that the z-buffer and negative depth loss both improve the performance of a state of the art depth-prediction network. The code is available at https://github.com/arthurhero/ZbuffDepth and archived at https://hdl.handle.net/11084/10450.\",\"PeriodicalId\":413697,\"journal\":{\"name\":\"2021 18th Conference on Robots and Vision (CRV)\",\"volume\":\"12 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-02-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 18th Conference on Robots and Vision (CRV)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CRV52889.2021.00023\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 18th Conference on Robots and Vision (CRV)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CRV52889.2021.00023","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Improved Point Transformation Methods For Self-Supervised Depth Prediction
Given stereo or egomotion image pairs, a popular and successful method for unsupervised learning of monocular depth estimation is to measure the quality of image reconstructions resulting from the learned depth predictions. Continued research has improved the overall approach in recent years, yet the common framework still suffers from several important limitations, particularly when dealing with points occluded after transformation to a novel viewpoint. While prior work has addressed the problem heuristically, this paper introduces a z-buffering algorithm that correctly and efficiently handles occluded points. Because our algorithm is implemented with operators typical of machine learning libraries, it can be incorporated into any existing unsupervised depth learning framework with automatic support for differentiation. Additionally, because points having negative depth after transformation often signify erroneously shallow depth predictions, we introduce a loss function to explicitly penalize this undesirable behavior. Experimental results on the KITTI data set show that the z-buffer and negative depth loss both improve the performance of a state of the art depth-prediction network. The code is available at https://github.com/arthurhero/ZbuffDepth and archived at https://hdl.handle.net/11084/10450.