A Novel 3D-Unet Deep Learning Framework Based on High-Dimensional Bilateral Grid for Edge Consistent Single Image Depth Estimation

Mansi Sharma, Abheesht Sharma, Kadvekar Rohit Tushar, Avinash Panneer
{"title":"A Novel 3D-Unet Deep Learning Framework Based on High-Dimensional Bilateral Grid for Edge Consistent Single Image Depth Estimation","authors":"Mansi Sharma, Abheesht Sharma, Kadvekar Rohit Tushar, Avinash Panneer","doi":"10.1109/IC3D51119.2020.9376327","DOIUrl":null,"url":null,"abstract":"The task of predicting smooth and edge-consistent depth maps is notoriously difficult for single image depth estimation. This paper proposes a novel Bilateral Grid based 3D convolutional neural network, dubbed as 3DBG-UNet, that parameterize high dimensional feature space by encoding compact 3D bilateral grids with UNets and infers sharp geometric layout of the scene. Further, an another novel 3DBGES-UNet model is introduced that integrate 3DBG-UNet for inferring an accurate depth map given a single color view. The 3DBGES-UNet concatenate 3DBG-UNet geometry map with the inception network edge accentuation map and a spatial object's boundary map obtained by leveraging semantic segmentation and train the UNet model with ResNet backbone. Both models are designed with a particular attention to explicitly account for edges or minute details. Preserving sharp discontinuities at depth edges is critical for many applications such as realistic integration of virtual objects in AR video or occlusion-aware view synthesis for 3D display applications. The proposed depth prediction network achieves state-of-the-art performance in both qualitative and quantitative evaluations on the challenging NYUv2-Depth data. The code and corresponding pre-trained weights will be made publicly available.","PeriodicalId":159318,"journal":{"name":"2020 International Conference on 3D Immersion (IC3D)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 International Conference on 3D Immersion (IC3D)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IC3D51119.2020.9376327","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 5

Abstract

The task of predicting smooth and edge-consistent depth maps is notoriously difficult for single image depth estimation. This paper proposes a novel Bilateral Grid based 3D convolutional neural network, dubbed as 3DBG-UNet, that parameterize high dimensional feature space by encoding compact 3D bilateral grids with UNets and infers sharp geometric layout of the scene. Further, an another novel 3DBGES-UNet model is introduced that integrate 3DBG-UNet for inferring an accurate depth map given a single color view. The 3DBGES-UNet concatenate 3DBG-UNet geometry map with the inception network edge accentuation map and a spatial object's boundary map obtained by leveraging semantic segmentation and train the UNet model with ResNet backbone. Both models are designed with a particular attention to explicitly account for edges or minute details. Preserving sharp discontinuities at depth edges is critical for many applications such as realistic integration of virtual objects in AR video or occlusion-aware view synthesis for 3D display applications. The proposed depth prediction network achieves state-of-the-art performance in both qualitative and quantitative evaluations on the challenging NYUv2-Depth data. The code and corresponding pre-trained weights will be made publicly available.
基于高维双边网格的3D-Unet深度学习框架边缘一致单幅图像深度估计
对于单图像深度估计来说,预测平滑和边缘一致的深度图是出了名的困难。本文提出了一种新的基于双边网格的三维卷积神经网络3DBG-UNet,该网络通过对紧凑的三维双边网格进行unet编码来参数化高维特征空间,并推断出场景的尖锐几何布局。此外,介绍了另一种新颖的3dbgs - unet模型,该模型集成了3dbgs - unet,用于在给定的单一颜色视图下推断准确的深度图。3dbgs -UNet将3dbgs -UNet几何图与初始网络边缘强化图和利用语义分割得到的空间对象边界图进行拼接,并利用ResNet骨干网训练UNet模型。这两种模型的设计都特别注意明确地说明边缘或微小的细节。在深度边缘保留明显的不连续对于许多应用至关重要,例如AR视频中虚拟物体的现实集成或3D显示应用的闭塞感知视图合成。提出的深度预测网络在具有挑战性的NYUv2-Depth数据的定性和定量评估方面都达到了最先进的性能。代码和相应的预训练权重将公开提供。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信