RADepthNet: Reflectance-aware monocular depth estimation

JCR quartile: Q1 (Computer Science)

Virtual Reality & Intelligent Hardware. Pub Date: 2022-10-01. DOI: 10.1016/j.vrih.2022.08.005

Chuxuan Li, Ran Yi, Saba Ghazanfar Ali, Lizhuang Ma, Enhua Wu, Jihong Wang, Lijuan Mao, Bin Sheng

Volume 4, Issue 5, Pages 418-431. Publication type: Journal Article. Full text: https://www.sciencedirect.com/science/article/pii/S2096579622000808

Citations: 0

Abstract

Background

Monocular depth estimation aims to predict a dense depth map from a single RGB image and has important applications in 3D reconstruction, autonomous driving, and augmented reality. However, existing methods feed the original RGB image directly into the model to extract depth features, without suppressing the depth-irrelevant information that interferes with depth estimation, which degrades accuracy.

Methods

To remove the influence of depth-irrelevant information and improve depth-prediction accuracy, we propose RADepthNet, a novel reflectance-guided network that fuses boundary features. Our method predicts depth maps in three steps:

(1) Intrinsic image decomposition. We propose a reflectance extraction module with an encoder-decoder structure that extracts the depth-related reflectance. An ablation study demonstrates that this module reduces the influence of illumination on depth estimation.

(2) Boundary detection. We propose a boundary extraction module, consisting of an encoder, a refinement block, and an upsample block, that uses gradient constraints to better predict depth at object boundaries.

(3) Depth prediction. Using an encoder different from the one in step (2), we obtain depth features from the reflectance map and fuse them with the boundary features to predict depth.

In addition, we propose FIFADataset, a depth-estimation dataset for soccer scenarios.
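As a rough illustration of two ideas named in the steps above, the following toy sketch uses plain Python lists in place of learned networks. It assumes the classical intrinsic-image model, in which each pixel is the product of reflectance and shading, and it assumes an L1 penalty on finite-difference depth gradients as one possible form of a "gradient constraint". The abstract specifies neither formulation, and the function names `reflectance_from_shading`, `gradients`, and `gradient_l1_loss` are hypothetical, not taken from the paper.

```python
# Toy stand-ins only: RADepthNet's modules are learned networks whose exact
# formulation is not given in the abstract. Everything below is an assumption.
from typing import List, Tuple

Grid = List[List[float]]


def reflectance_from_shading(image: Grid, shading: Grid, eps: float = 1e-6) -> Grid:
    """Recover reflectance R under the assumed per-pixel model I = R * S."""
    return [
        [i / max(s, eps) for i, s in zip(irow, srow)]
        for irow, srow in zip(image, shading)
    ]


def gradients(depth: Grid) -> Tuple[Grid, Grid]:
    """Forward finite differences along x (columns) and y (rows)."""
    gx = [[row[c + 1] - row[c] for c in range(len(row) - 1)] for row in depth]
    gy = [
        [depth[r + 1][c] - depth[r][c] for c in range(len(depth[0]))]
        for r in range(len(depth) - 1)
    ]
    return gx, gy


def gradient_l1_loss(pred: Grid, target: Grid) -> float:
    """Mean absolute difference between predicted and target depth gradients."""
    pgx, pgy = gradients(pred)
    tgx, tgy = gradients(target)
    diffs = [
        abs(p - t)
        for pred_g, tgt_g in ((pgx, tgx), (pgy, tgy))
        for prow, trow in zip(pred_g, tgt_g)
        for p, t in zip(prow, trow)
    ]
    # A 1x1 map has no gradients; treat its loss as zero.
    return sum(diffs) / len(diffs) if diffs else 0.0
```

Under these assumptions, dividing out the shading yields the same reflectance map whether the scene is dimly or brightly lit, which is the sense in which reflectance is less sensitive to illumination than raw RGB. The gradient loss is zero when predicted and target depth share the same edges and grows when a sharp depth discontinuity is blurred.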

Results

Extensive experiments on a public dataset and our proposed FIFADataset show that our method achieves state-of-the-art performance.
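The abstract does not say which metrics underlie this comparison. For orientation only, the sketch below computes three measures that are commonly reported in the monocular depth estimation literature: absolute relative error, root-mean-square error, and threshold accuracy at 1.25. That the paper reports these particular metrics is an assumption, and `depth_metrics` is a hypothetical helper name.

```python
# Commonly used depth-estimation metrics; whether the paper uses exactly
# these is an assumption, since the abstract does not list them.
import math
from typing import Dict, Sequence


def depth_metrics(pred: Sequence[float], gt: Sequence[float]) -> Dict[str, float]:
    """Compute AbsRel, RMSE, and delta < 1.25 over pixels with valid depth."""
    # Keep only pixels where both ground truth and prediction are positive.
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0 and p > 0]
    if not pairs:
        raise ValueError("no valid pixels to evaluate")
    n = len(pairs)
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / n
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)
    delta1 = sum(1 for p, g in pairs if max(p / g, g / p) < 1.25) / n
    return {"abs_rel": abs_rel, "rmse": rmse, "delta1": delta1}
```

Lower is better for `abs_rel` and `rmse`, while `delta1` is the fraction of pixels whose predicted depth is within a factor of 1.25 of the ground truth, so higher is better.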

Source journal