RADepthNet: Reflectance-aware monocular depth estimation

JCR: Q1, Computer Science
Chuxuan Li, Ran Yi, Saba Ghazanfar Ali, Lizhuang Ma, Enhua Wu, Jihong Wang, Lijuan Mao, Bin Sheng
Virtual Reality & Intelligent Hardware, vol. 4, no. 5, pp. 418-431. Published 2022-10-01. Journal article.
DOI: 10.1016/j.vrih.2022.08.005
URL: https://www.sciencedirect.com/science/article/pii/S2096579622000808
Citations: 0

Abstract

Background

Monocular depth estimation aims to predict a dense depth map from a single RGB image and has important applications in 3D reconstruction, autonomous driving, and augmented reality. However, existing methods feed the original RGB image directly into the model to extract depth features and do nothing to suppress depth-irrelevant information, which interferes with depth estimation and lowers accuracy.

Methods

To remove the influence of depth-irrelevant information and improve depth-prediction accuracy, we propose RADepthNet, a novel reflectance-guided network that fuses boundary features. Our method predicts depth maps in three steps: (1) Intrinsic image decomposition. A reflectance extraction module with an encoder-decoder structure extracts the depth-related reflectance. An ablation study demonstrates that this module reduces the influence of illumination on depth estimation. (2) Boundary detection. A boundary extraction module, consisting of an encoder, a refinement block, and an upsample block, uses gradient constraints to better predict depth at object boundaries. (3) Depth prediction. An encoder separate from the one in (2) obtains depth features from the reflectance map, and these are fused with the boundary features to predict depth. In addition, we propose FIFADataset, a depth-estimation dataset for soccer scenarios.
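
As a rough illustration of steps (1) and (2), the sketch below assumes the common multiplicative intrinsic-image model, in which each pixel is the product of reflectance and shading (I = R * S), and an L1 penalty on finite-difference depth gradients as one possible form of gradient constraint. The abstract does not confirm either as the authors' formulation: RADepthNet learns reflectance with an encoder-decoder rather than dividing by a known shading map, and the names `reflectance_from_shading`, `forward_gradients`, and `gradient_l1_loss` are hypothetical.

```python
from typing import List, Tuple

Grid = List[List[float]]  # single-channel image or depth map, row-major


def reflectance_from_shading(image: Grid, shading: Grid, eps: float = 1e-6) -> Grid:
    """Recover R from I = R * S, assuming the shading map S is known.

    This is an illustrative assumption: a learned module would have to
    estimate the reflectance from the image alone.
    """
    return [
        [i / max(s, eps) for i, s in zip(img_row, shd_row)]
        for img_row, shd_row in zip(image, shading)
    ]


def forward_gradients(depth: Grid) -> Tuple[Grid, Grid]:
    """Forward differences along x and y (last column / last row dropped)."""
    h, w = len(depth), len(depth[0])
    gx = [[depth[y][x + 1] - depth[y][x] for x in range(w - 1)] for y in range(h)]
    gy = [[depth[y + 1][x] - depth[y][x] for x in range(w)] for y in range(h - 1)]
    return gx, gy


def gradient_l1_loss(pred: Grid, target: Grid) -> float:
    """Mean absolute difference between predicted and target depth gradients."""
    pgx, pgy = forward_gradients(pred)
    tgx, tgy = forward_gradients(target)
    diffs = [abs(p - t) for pr, tr in zip(pgx, tgx) for p, t in zip(pr, tr)]
    diffs += [abs(p - t) for pr, tr in zip(pgy, tgy) for p, t in zip(pr, tr)]
    # A 1x1 map has no neighbouring pixels, hence no gradient terms.
    return sum(diffs) / len(diffs) if diffs else 0.0


if __name__ == "__main__":
    # Toy values only: same reflectance, lower half of the scene in shadow.
    reflectance = [[0.2, 0.8], [0.2, 0.8]]
    shading = [[1.0, 1.0], [0.5, 0.5]]
    image = [[r * s for r, s in zip(rr, sr)] for rr, sr in zip(reflectance, shading)]
    print(reflectance_from_shading(image, shading))

    sharp_edge = [[1.0, 1.0, 5.0], [1.0, 1.0, 5.0]]
    blurred_edge = [[1.0, 3.0, 5.0], [1.0, 3.0, 5.0]]
    print(gradient_l1_loss(blurred_edge, sharp_edge))
```

Dividing out the shading gives the same reflectance for the lit and the shadowed rows, which is the sense in which reflectance is less sensitive to illumination than raw RGB. The gradient loss penalizes the blurred edge but is zero for a prediction that differs from the target by a constant offset, so a term like this would normally be combined with a direct depth loss.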

Results

Extensive experiments on a public dataset and our proposed FIFADataset show that our method achieves state-of-the-art performance.
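
The abstract does not say which metrics support this claim. As a hedged sketch, the code below computes three measures that are commonly reported in monocular depth estimation: absolute relative error, root-mean-square error, and the threshold accuracy δ < 1.25. The function name `depth_metrics` and the input values in the usage example are hypothetical and are not results from the paper.

```python
import math
from typing import Dict, Sequence


def depth_metrics(pred: Sequence[float], gt: Sequence[float]) -> Dict[str, float]:
    """Compute common depth-estimation metrics over flattened depth maps.

    Pixels with non-positive ground truth or prediction are skipped,
    since the ratio-based metrics are undefined there.
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if p > 0 and g > 0]
    if not pairs:
        raise ValueError("no valid pixels to evaluate")
    n = len(pairs)
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / n
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)
    # Fraction of pixels whose prediction is within a factor of 1.25 of the truth.
    delta1 = sum(1 for p, g in pairs if max(p / g, g / p) < 1.25) / n
    return {"abs_rel": abs_rel, "rmse": rmse, "delta1": delta1}


if __name__ == "__main__":
    # Toy depths in metres; the last pixel has no ground truth and is ignored.
    predicted = [2.0, 4.4, 1.0, 3.0]
    ground_truth = [2.0, 4.0, 2.0, 0.0]
    print(depth_metrics(predicted, ground_truth))
```

Lower is better for `abs_rel` and `rmse`, and higher is better for `delta1`.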

Source journal: Virtual Reality & Intelligent Hardware (Computer Science: Computer Graphics and Computer-Aided Design)
CiteScore: 6.40
Self-citation rate: 0.00%
Articles published: 35
Review time: 12 weeks