深度卷积网络联合语义分割和深度估计

Arsalan Mousavian, H. Pirsiavash, J. Kosecka
{"title":"深度卷积网络联合语义分割和深度估计","authors":"Arsalan Mousavian, H. Pirsiavash, J. Kosecka","doi":"10.1109/3DV.2016.69","DOIUrl":null,"url":null,"abstract":"Multi-scale deep CNNs have been used successfully for problems mapping each pixel to a label, such as depth estimation and semantic segmentation. It has also been shown that such architectures are reusable and can be used for multiple tasks. These networks are typically trained independently for each task by varying the output layer(s) and training objective. In this work we present a new model for simultaneous depth estimation and semantic segmentation from a single RGB image. Our approach demonstrates the feasibility of training parts of the model for each task and then fine tuning the full, combined model on both tasks simultaneously using a single loss function. Furthermore we couple the deep CNN with fully connected CRF, which captures the contextual relationships and interactions between the semantic and depth cues improving the accuracy of the final results. The proposed model is trained and evaluated on NYUDepth V2 dataset [23] outperforming the state of the art methods on semantic segmentation and achieving comparable results on the task of depth estimation.","PeriodicalId":425304,"journal":{"name":"2016 Fourth International Conference on 3D Vision (3DV)","volume":"184 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"126","resultStr":"{\"title\":\"Joint Semantic Segmentation and Depth Estimation with Deep Convolutional Networks\",\"authors\":\"Arsalan Mousavian, H. Pirsiavash, J. Kosecka\",\"doi\":\"10.1109/3DV.2016.69\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Multi-scale deep CNNs have been used successfully for problems mapping each pixel to a label, such as depth estimation and semantic segmentation. It has also been shown that such architectures are reusable and can be used for multiple tasks. These networks are typically trained independently for each task by varying the output layer(s) and training objective. In this work we present a new model for simultaneous depth estimation and semantic segmentation from a single RGB image. Our approach demonstrates the feasibility of training parts of the model for each task and then fine tuning the full, combined model on both tasks simultaneously using a single loss function. Furthermore we couple the deep CNN with fully connected CRF, which captures the contextual relationships and interactions between the semantic and depth cues improving the accuracy of the final results. The proposed model is trained and evaluated on NYUDepth V2 dataset [23] outperforming the state of the art methods on semantic segmentation and achieving comparable results on the task of depth estimation.\",\"PeriodicalId\":425304,\"journal\":{\"name\":\"2016 Fourth International Conference on 3D Vision (3DV)\",\"volume\":\"184 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-04-25\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"126\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2016 Fourth International Conference on 3D Vision (3DV)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/3DV.2016.69\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 Fourth International Conference on 3D Vision (3DV)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/3DV.2016.69","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 126

摘要

多尺度深度cnn已经成功地用于将每个像素映射到标签的问题,如深度估计和语义分割。研究还表明,这样的体系结构是可重用的,可以用于多个任务。这些网络通常通过改变输出层和训练目标来独立训练每个任务。在这项工作中,我们提出了一个新的模型,同时深度估计和语义分割从单一的RGB图像。我们的方法证明了为每个任务训练部分模型的可行性,然后使用单个损失函数同时微调两个任务上的完整组合模型。此外,我们将深度CNN与完全连接的CRF相结合,后者捕获语义和深度线索之间的上下文关系和交互,从而提高最终结果的准确性。所提出的模型在NYUDepth V2数据集上进行了训练和评估[23],在语义分割方面优于目前最先进的方法,在深度估计任务上也取得了相当的结果。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Joint Semantic Segmentation and Depth Estimation with Deep Convolutional Networks
Multi-scale deep CNNs have been used successfully for problems mapping each pixel to a label, such as depth estimation and semantic segmentation. It has also been shown that such architectures are reusable and can be used for multiple tasks. These networks are typically trained independently for each task by varying the output layer(s) and training objective. In this work we present a new model for simultaneous depth estimation and semantic segmentation from a single RGB image. Our approach demonstrates the feasibility of training parts of the model for each task and then fine tuning the full, combined model on both tasks simultaneously using a single loss function. Furthermore we couple the deep CNN with fully connected CRF, which captures the contextual relationships and interactions between the semantic and depth cues improving the accuracy of the final results. The proposed model is trained and evaluated on NYUDepth V2 dataset [23] outperforming the state of the art methods on semantic segmentation and achieving comparable results on the task of depth estimation.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信