Stereo-augmented Depth Completion from a Single RGB-LiDAR image

Keunhoon Choi, Somi Jeong, Youngjung Kim, K. Sohn
{"title":"Stereo-augmented Depth Completion from a Single RGB-LiDAR image","authors":"Keunhoon Choi, Somi Jeong, Youngjung Kim, K. Sohn","doi":"10.1109/ICRA48506.2021.9561557","DOIUrl":null,"url":null,"abstract":"Depth completion is an important task in computer vision and robotics applications, which aims at predicting accurate dense depth from a single RGB-LiDAR image. Convolutional neural networks (CNNs) have been widely used for depth completion to learn a mapping function from sparse to dense depth. However, recent methods do not exploit any 3D geometric cues during the inference stage and mainly rely on sophisticated CNN architectures. In this paper, we present a cascade and geometrically inspired learning framework for depth completion, consisting of three stages: view extrapolation, stereo matching, and depth refinement. The first stage extrapolates a virtual (right) view using a single RGB (left) and its LiDAR data. We then mimic the binocular stereo-matching, and as a result, explicitly encode geometric constraints during depth completion. This stage augments the final refinement process by providing additional geometric reasoning. We also introduce a distillation framework based on teacher-student strategy to effectively train our network. Knowledge from a teacher model privileged with real stereo pairs is transferred to the student through feature distillation. Experimental results on KITTI depth completion benchmark demonstrate that the proposed method is superior to state-of-the-art methods.","PeriodicalId":108312,"journal":{"name":"2021 IEEE International Conference on Robotics and Automation (ICRA)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE International Conference on Robotics and Automation (ICRA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICRA48506.2021.9561557","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

Abstract

Depth completion, which aims to predict accurate dense depth from a single RGB-LiDAR image, is an important task in computer vision and robotics applications. Convolutional neural networks (CNNs) have been widely used for depth completion to learn a mapping function from sparse to dense depth. However, recent methods do not exploit any 3D geometric cues during the inference stage and mainly rely on sophisticated CNN architectures. In this paper, we present a cascaded, geometrically inspired learning framework for depth completion, consisting of three stages: view extrapolation, stereo matching, and depth refinement. The first stage extrapolates a virtual (right) view using a single RGB (left) image and its LiDAR data. We then mimic binocular stereo matching and, as a result, explicitly encode geometric constraints during depth completion. This stage augments the final refinement process by providing additional geometric reasoning. We also introduce a distillation framework based on a teacher-student strategy to effectively train our network. Knowledge from a teacher model privileged with real stereo pairs is transferred to the student through feature distillation. Experimental results on the KITTI depth completion benchmark demonstrate that the proposed method is superior to state-of-the-art methods.
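To make the cascade concrete, below is a minimal PyTorch-style sketch of the three stages described in the abstract (view extrapolation, stereo matching, depth refinement) together with a feature-distillation term against a teacher privileged with real stereo pairs. All module names, channel sizes, and the use of plain conv blocks are illustrative assumptions, not the authors' actual architecture.

```python
# Hypothetical sketch of the cascaded pipeline; names and layer choices are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


def conv_block(in_ch, out_ch):
    """Small conv block used by every hypothetical sub-network."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
    )


class CascadeDepthCompletion(nn.Module):
    """Stage 1 extrapolates a virtual right view from the left RGB + sparse LiDAR,
    stage 2 mimics stereo matching between the real left and virtual right views,
    and stage 3 refines the coarse result into a dense depth map."""

    def __init__(self, feat_ch=32):
        super().__init__()
        # Stage 1: view extrapolation (RGB 3ch + sparse depth 1ch -> virtual right RGB).
        self.view_extrapolator = nn.Sequential(conv_block(4, feat_ch),
                                               nn.Conv2d(feat_ch, 3, 1))
        # Stage 2: stereo matching between the left and virtual right images.
        self.stereo_matcher = nn.Sequential(conv_block(6, feat_ch),
                                            nn.Conv2d(feat_ch, 1, 1))
        # Stage 3: refinement fusing the coarse depth with the RGB and raw LiDAR inputs.
        self.refiner = nn.Sequential(conv_block(5, feat_ch),
                                     nn.Conv2d(feat_ch, 1, 1))

    def forward(self, rgb_left, sparse_depth):
        # Stage 1: hallucinate the virtual right view.
        virtual_right = self.view_extrapolator(torch.cat([rgb_left, sparse_depth], dim=1))
        # Stage 2: geometry-aware coarse depth from the (real left, virtual right) pair.
        coarse_depth = self.stereo_matcher(torch.cat([rgb_left, virtual_right], dim=1))
        # Stage 3: refine using the coarse depth, the RGB image, and the sparse LiDAR map.
        dense_depth = self.refiner(torch.cat([coarse_depth, rgb_left, sparse_depth], dim=1))
        return dense_depth, coarse_depth, virtual_right


def feature_distillation_loss(student_feat, teacher_feat):
    """L2 distillation between student features and features from a frozen teacher
    that had access to the real right image during training."""
    return F.mse_loss(student_feat, teacher_feat.detach())


if __name__ == "__main__":
    model = CascadeDepthCompletion()
    rgb = torch.rand(1, 3, 64, 208)      # left RGB image
    lidar = torch.rand(1, 1, 64, 208)    # sparse depth (zeros where no LiDAR return)
    dense, coarse, virt_right = model(rgb, lidar)
    print(dense.shape)                   # torch.Size([1, 1, 64, 208])
```

In this reading, the stereo-matching stage is what injects explicit geometric constraints at inference time, while the distillation term only shapes training: the student never sees a real right image, but its virtual-view features are pulled toward those of a teacher that does.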