用于单目深度估计的深度有序回归网络。

Proceedings. IEEE Computer Society Conference on Computer Vision and Pattern Recognition Pub Date : 2018-06-01 Epub Date: 2018-12-17 DOI:10.1109/CVPR.2018.00214

Huan Fu, Mingming Gong, Chaohui Wang, Kayhan Batmanghelich, Dacheng Tao

{"title":"用于单目深度估计的深度有序回归网络。","authors":"Huan Fu, Mingming Gong, Chaohui Wang, Kayhan Batmanghelich, Dacheng Tao","doi":"10.1109/CVPR.2018.00214","DOIUrl":null,"url":null,"abstract":"Monocular depth estimation, which plays a crucial role in understanding 3D scene geometry, is an ill-posed problem. Recent methods have gained significant improvement by exploring image-level information and hierarchical features from deep convolutional neural networks (DCNNs). These methods model depth estimation as a regression problem and train the regression networks by minimizing mean squared error, which suffers from slow convergence and unsatisfactory local solutions. Besides, existing depth estimation networks employ repeated spatial pooling operations, resulting in undesirable low-resolution feature maps. To obtain high-resolution depth maps, skip-connections or multilayer deconvolution networks are required, which complicates network training and consumes much more computations. To eliminate or at least largely reduce these problems, we introduce a spacing-increasing discretization (SID) strategy to discretize depth and recast depth network learning as an ordinal regression problem. By training the network using an ordinary regression loss, our method achieves much higher accuracy and faster convergence in synch. Furthermore, we adopt a multi-scale network structure which avoids unnecessary spatial pooling and captures multi-scale information in parallel. The proposed deep ordinal regression network (DORN) achieves state-of-the-art results on three challenging benchmarks, i.e., KITTI [16], Make3D [49], and NYU Depth v2 [41], and outperforms existing methods by a large margin.","PeriodicalId":74560,"journal":{"name":"Proceedings. IEEE Computer Society Conference on Computer Vision and Pattern Recognition","volume":"2018 ","pages":"2002-2011"},"PeriodicalIF":0.0000,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/CVPR.2018.00214","citationCount":"1334","resultStr":"{\"title\":\"Deep Ordinal Regression Network for Monocular Depth Estimation.\",\"authors\":\"Huan Fu, Mingming Gong, Chaohui Wang, Kayhan Batmanghelich, Dacheng Tao\",\"doi\":\"10.1109/CVPR.2018.00214\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Monocular depth estimation, which plays a crucial role in understanding 3D scene geometry, is an ill-posed problem. Recent methods have gained significant improvement by exploring image-level information and hierarchical features from deep convolutional neural networks (DCNNs). These methods model depth estimation as a regression problem and train the regression networks by minimizing mean squared error, which suffers from slow convergence and unsatisfactory local solutions. Besides, existing depth estimation networks employ repeated spatial pooling operations, resulting in undesirable low-resolution feature maps. To obtain high-resolution depth maps, skip-connections or multilayer deconvolution networks are required, which complicates network training and consumes much more computations. To eliminate or at least largely reduce these problems, we introduce a spacing-increasing discretization (SID) strategy to discretize depth and recast depth network learning as an ordinal regression problem. By training the network using an ordinary regression loss, our method achieves much higher accuracy and faster convergence in synch. Furthermore, we adopt a multi-scale network structure which avoids unnecessary spatial pooling and captures multi-scale information in parallel. The proposed deep ordinal regression network (DORN) achieves state-of-the-art results on three challenging benchmarks, i.e., KITTI [16], Make3D [49], and NYU Depth v2 [41], and outperforms existing methods by a large margin.\",\"PeriodicalId\":74560,\"journal\":{\"name\":\"Proceedings. IEEE Computer Society Conference on Computer Vision and Pattern Recognition\",\"volume\":\"2018 \",\"pages\":\"2002-2011\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-06-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://sci-hub-pdf.com/10.1109/CVPR.2018.00214\",\"citationCount\":\"1334\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings. IEEE Computer Society Conference on Computer Vision and Pattern Recognition\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CVPR.2018.00214\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2018/12/17 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings. IEEE Computer Society Conference on Computer Vision and Pattern Recognition","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CVPR.2018.00214","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2018/12/17 0:00:00","PubModel":"Epub","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 1334

摘要

单目深度估计是一个不适定问题，在理解三维场景几何中起着至关重要的作用。最近的方法通过从深度卷积神经网络（DCNN）中探索图像级信息和层次特征获得了显著的改进。这些方法将深度估计建模为一个回归问题，并通过最小化均方误差来训练回归网络，该均方误差存在收敛缓慢和局部解不令人满意的问题。此外，现有的深度估计网络采用重复的空间池化操作，导致不希望的低分辨率特征图。为了获得高分辨率的深度图，需要跳跃连接或多层反褶积网络，这使网络训练复杂化并消耗更多的计算。为了消除或至少在很大程度上减少这些问题，我们引入了一种间距增加离散化（SID）策略来离散深度，并将深度网络学习重新定义为有序回归问题。通过使用普通回归损失训练网络，我们的方法实现了更高的精度和更快的同步收敛。此外，我们采用了多尺度网络结构，避免了不必要的空间池，并并行捕获了多尺度信息。所提出的深度有序回归网络（DORN）在三个具有挑战性的基准上取得了最先进的结果，即KITTI[16]、Make3D[49]和NYU Depth v2[41]，并在很大程度上优于现有方法。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

Deep Ordinal Regression Network for Monocular Depth Estimation.

查看原文本刊更多论文

Deep Ordinal Regression Network for Monocular Depth Estimation.

Monocular depth estimation, which plays a crucial role in understanding 3D scene geometry, is an ill-posed problem. Recent methods have gained significant improvement by exploring image-level information and hierarchical features from deep convolutional neural networks (DCNNs). These methods model depth estimation as a regression problem and train the regression networks by minimizing mean squared error, which suffers from slow convergence and unsatisfactory local solutions. Besides, existing depth estimation networks employ repeated spatial pooling operations, resulting in undesirable low-resolution feature maps. To obtain high-resolution depth maps, skip-connections or multilayer deconvolution networks are required, which complicates network training and consumes much more computations. To eliminate or at least largely reduce these problems, we introduce a spacing-increasing discretization (SID) strategy to discretize depth and recast depth network learning as an ordinal regression problem. By training the network using an ordinary regression loss, our method achieves much higher accuracy and faster convergence in synch. Furthermore, we adopt a multi-scale network structure which avoids unnecessary spatial pooling and captures multi-scale information in parallel. The proposed deep ordinal regression network (DORN) achieves state-of-the-art results on three challenging benchmarks, i.e., KITTI [16], Make3D [49], and NYU Depth v2 [41], and outperforms existing methods by a large margin.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings. IEEE Computer Society Conference on Computer Vision and Pattern Recognition

CiteScore

43.50

自引率

0.00%

发文量