Generative Adversarial Networks for Depth Map Estimation from RGB Video
Kin Gwn Lore, K. Reddy, M. Giering, Edgar A. Bernal
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), June 2018
DOI: 10.1109/CVPRW.2018.00163
Citations: 38
Abstract
Depth cues are essential to achieving high-level scene understanding, and in particular to determining geometric relations between objects. The ability to reason about depth information in scene analysis tasks can often result in improved decision-making capabilities. Unfortunately, depth-capable sensors are not as ubiquitous as traditional RGB cameras, which limits the availability of depth-related cues. In this work, we investigate data-driven approaches for depth estimation from images or videos captured with monocular cameras. We propose three different approaches and demonstrate their efficacy through extensive experimental validation. The proposed methods rely on processing of (i) a single 3-channel RGB image frame, (ii) a sequence of RGB frames, and (iii) a single RGB frame plus the optical flow field computed between the frame and a neighboring frame in the video stream, and map the respective inputs to an estimated depth map representation. In contrast to existing literature, the input-output mapping is not directly regressed; rather, it is learned through adversarial techniques that leverage conditional generative adversarial networks (cGANs).
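
For context on the adversarial formulation the abstract refers to, conditional GANs of this kind are typically trained with the following minimax objective (this is the standard cGAN formulation, e.g. as used in pix2pix-style image-to-image translation; the paper's exact loss, including any auxiliary reconstruction terms, may differ):

$$\min_G \max_D \; \mathcal{L}_{\mathrm{cGAN}}(G, D) = \mathbb{E}_{x,y}\big[\log D(x, y)\big] + \mathbb{E}_{x}\big[\log\big(1 - D(x, G(x))\big)\big]$$

Here $x$ is the conditioning input (a single RGB frame, a sequence of RGB frames, or a frame plus its optical flow field, per the three proposed variants) and $y$ is the ground-truth depth map. In pix2pix-style setups it is common to add a weighted reconstruction term such as $\lambda\,\mathbb{E}_{x,y}\big[\lVert y - G(x)\rVert_1\big]$ so that the generator is anchored to the target depth while the discriminator enforces realism.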