UnOS: Unified Unsupervised Optical-Flow and Stereo-Depth Estimation by Watching Videos

2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8063-8073. Pub Date: 2019-06-01. DOI: 10.1109/CVPR.2019.00826

Yang Wang, Peng Wang, Zhenheng Yang, Chenxu Luo, Yezhou Yang, W. Xu


Citations: 134

Abstract

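In this paper, we propose UnOS, a unified system for unsupervised optical flow and stereo depth estimation using convolutional neural networks (CNNs). It takes advantage of the inherent geometric consistency between the two tasks under the rigid-scene assumption. UnOS significantly outperforms other state-of-the-art (SOTA) unsupervised approaches that treat the two tasks independently. Specifically, given two consecutive stereo image pairs from a video, UnOS estimates per-pixel stereo depth, camera ego-motion, and optical flow with three parallel CNNs. From these quantities, UnOS computes the rigid optical flow and compares it against the optical flow estimated by the FlowNet, which identifies the pixels that satisfy the rigid-scene assumption. We then encourage geometric consistency between the two estimated flows within rigid regions, from which we derive a rigid-aware direct visual odometry (RDVO) module. We also propose rigid-aware and occlusion-aware flow-consistency losses for training UnOS. We evaluate our results on the popular KITTI dataset over four related tasks: stereo depth, optical flow, visual odometry, and motion segmentation.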
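The rigid-flow comparison described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `rigid_flow` and `rigid_mask`, the pinhole intrinsics `K`, the pose convention (`R`, `t` map frame-1 camera coordinates into frame 2), and the pixel threshold `thresh` are all assumptions made for illustration. The abstract does not state how the comparison is thresholded.

```python
import numpy as np

def rigid_flow(depth, K, R, t):
    """Optical flow induced by camera motion alone, assuming a static scene.

    depth: (H, W) depth map of frame 1 (assumed strictly positive).
    K:     (3, 3) pinhole intrinsics (assumed known).
    R, t:  rotation (3, 3) and translation (3,) taking frame-1 camera
           coordinates to frame-2 camera coordinates (assumed convention).
    Returns a (2, H, W) array of (du, dv) displacements in pixels.
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W, dtype=np.float64),
                       np.arange(H, dtype=np.float64))
    pix = np.stack([u, v, np.ones_like(u)], axis=0).reshape(3, -1)

    # Back-project every pixel to a 3D point in the frame-1 camera.
    pts1 = (np.linalg.inv(K) @ pix) * depth.reshape(1, -1)
    # Apply the ego-motion and re-project into frame 2.
    pts2 = R @ pts1 + t.reshape(3, 1)
    proj = K @ pts2
    uv2 = proj[:2] / proj[2:3]

    return (uv2 - pix[:2]).reshape(2, H, W)

def rigid_mask(flow_est, flow_rigid, thresh=3.0):
    """Mark pixels whose estimated flow agrees with the rigid flow.

    thresh is a hypothetical end-point-error threshold in pixels.
    """
    err = np.linalg.norm(flow_est - flow_rigid, axis=0)
    return err < thresh
```

As a sanity check on the geometry: for a fronto-parallel scene at depth 10, a focal length of 100 px, identity rotation, and a translation of 1 unit along the camera x-axis, `rigid_flow` yields a uniform horizontal flow of 100 × 1 / 10 = 10 px. Pixels where a separately estimated flow deviates from this by more than `thresh` would be treated as non-rigid, for example independently moving objects.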



