Chuxuan Li, Ran Yi, Saba Ghazanfar Ali, Lizhuang Ma, Enhua Wu, Jihong Wang, Lijuan Mao, Bin Sheng
RADepthNet: Reflectance-aware monocular depth estimation
Virtual Reality Intelligent Hardware, 4(5), pages 418-431. Published 2022-10-01. Journal Article.
DOI: 10.1016/j.vrih.2022.08.005
https://www.sciencedirect.com/science/article/pii/S2096579622000808
Citations: 0
Abstract
Background
Monocular depth estimation aims to predict a dense depth map from a single RGB image and has important applications in 3D reconstruction, autonomous driving, and augmented reality. However, existing methods feed the original RGB image directly into the model to extract depth features, without suppressing depth-irrelevant information that interferes with depth estimation, which degrades accuracy.
Methods
To remove the influence of depth-irrelevant information and improve depth-prediction accuracy, we propose RADepthNet, a novel reflectance-guided network that fuses boundary features. Our method predicts depth maps in three steps: (1) Intrinsic Image Decomposition. We propose a reflectance extraction module with an encoder-decoder structure that extracts the depth-related reflectance. An ablation study demonstrates that this module reduces the influence of illumination on depth estimation. (2) Boundary Detection. We propose a boundary extraction module, consisting of an encoder, a refinement block, and an upsample block, that uses gradient constraints to better predict depth at object boundaries. (3) Depth Prediction. We use an encoder separate from the one in step (2) to obtain depth features from the reflectance map, and fuse them with the boundary features to predict depth. In addition, we propose FIFADataset, a depth-estimation dataset for soccer scenarios.
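Two ideas behind steps (1) and (2) can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the RADepthNet implementation: it uses nested Python lists instead of image tensors, assumes the common intrinsic-image model in which a pixel is reflectance multiplied by shading, and uses a generic L1 penalty on finite-difference gradients as a stand-in for the gradient constraints, whose exact form the abstract does not give. The names `shade`, `gradients`, and `gradient_l1_loss` are hypothetical.

```python
# Illustrative sketch only: toy stand-ins on nested lists, not the RADepthNet code.

def shade(reflectance, shading):
    """Assumed intrinsic-image model: pixel = reflectance * shading."""
    return [[r * s for r, s in zip(r_row, s_row)]
            for r_row, s_row in zip(reflectance, shading)]


def gradients(depth):
    """Forward differences of a 2D depth map along x and along y."""
    h, w = len(depth), len(depth[0])
    gx = [[depth[y][x + 1] - depth[y][x] for x in range(w - 1)] for y in range(h)]
    gy = [[depth[y + 1][x] - depth[y][x] for x in range(w)] for y in range(h - 1)]
    return gx, gy


def gradient_l1_loss(pred, target):
    """Mean absolute difference between predicted and target depth gradients."""
    diffs = []
    for pred_g, target_g in zip(gradients(pred), gradients(target)):
        for pred_row, target_row in zip(pred_g, target_g):
            diffs.extend(abs(p - t) for p, t in zip(pred_row, target_row))
    return sum(diffs) / len(diffs)


# Step (1) motivation: one surface, two lighting conditions, two different images.
reflectance = [[0.2, 0.8]]
bright = shade(reflectance, [[1.0, 1.0]])
dim = shade(reflectance, [[0.5, 0.5]])
print(bright != dim)  # True: the input image changes although the scene does not

# Step (2) motivation: a gradient term penalizes blurred object boundaries.
target = [[1.0, 1.0, 5.0, 5.0], [1.0, 1.0, 5.0, 5.0]]
blurred = [[1.0, 2.0, 4.0, 5.0], [1.0, 2.0, 4.0, 5.0]]
print(gradient_l1_loss(target, target))   # 0.0
print(gradient_l1_loss(blurred, target))  # 0.8
```

In the toy case, the same reflectance produces different pixel values under different shading, which is why predicting depth from reflectance rather than raw RGB can reduce sensitivity to illumination. The blurred prediction is penalized only at the depth step, where its gradients differ from the target's.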
Results
Extensive experiments on a public dataset and our proposed FIFADataset show that our method achieves state-of-the-art performance.
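The abstract does not say which metrics support this claim. As a point of reference only, the sketch below computes three measures widely used in the monocular depth estimation literature: absolute relative error, root mean squared error, and the threshold accuracy δ < 1.25. That the authors report these particular metrics is an assumption, and the function name `depth_metrics` and the sample values are hypothetical.

```python
import math


def depth_metrics(pred, gt):
    """Common depth-estimation metrics for flat lists of positive depths."""
    n = len(gt)
    # Absolute relative error: mean of |pred - gt| / gt.
    abs_rel = sum(abs(p - g) / g for p, g in zip(pred, gt)) / n
    # Root mean squared error in depth units.
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in zip(pred, gt)) / n)
    # Threshold accuracy: fraction of pixels with max(pred/gt, gt/pred) < 1.25.
    delta1 = sum(max(p / g, g / p) < 1.25 for p, g in zip(pred, gt)) / n
    return {"abs_rel": abs_rel, "rmse": rmse, "delta1": delta1}


# Made-up values purely to exercise the function.
ground_truth = [2.0, 4.0, 8.0, 10.0]
prediction = [2.0, 5.0, 8.0, 8.0]
metrics = depth_metrics(prediction, ground_truth)
print(metrics)
```

Lower is better for the two error measures, and higher is better for the threshold accuracy.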