Multi-Scale Deep Convolutional Nets with Attention Model and Conditional Random Fields for Semantic Image Segmentation

International Conference on Signal Processing and Machine Learning Pub Date : 2019-11-27 DOI:10.1145/3372806.3372811

Ming Liu, Caiming Zhang, Zhao Zhang

{"title":"Multi-Scale Deep Convolutional Nets with Attention Model and Conditional Random Fields for Semantic Image Segmentation","authors":"Ming Liu, Caiming Zhang, Zhao Zhang","doi":"10.1145/3372806.3372811","DOIUrl":null,"url":null,"abstract":"Although Convolutional Neural Networks are effective visual models that generate hierarchies of features, there still exist some shortcomings in the application of Deep Convolutional Neural Networks to semantic image segmentation. In this work, our algorithm incorporates multi-scale atrous convolution, attention model and Conditional Random Fields to tackle this problem. Firstly, our method replaces deconvolutional layers with atrous convolutional layers to avoid reducing feature resolution when the Deep Convolutional Neural Networks is employed in a fully convolutional fashion. Secondly, multi-scale architecture and attention model are used to extract the existence of features at multiple scales. Thirdly, we use Conditional Random Fields to prevent the built-in invariance of Deep Convolutional Neural Networks reducing localization accuracy. Moreover, our network completely integrates Conditional Random Fields modelling with Deep Convolutional Neural Networks, making it possible to train the deep network end-to-end. In this paper, our method is used to the matters of semantic image segmentation and is demonstrated the effectiveness of our model with experiments on PASCAL VOC 2012.","PeriodicalId":340004,"journal":{"name":"International Conference on Signal Processing and Machine Learning","volume":"73 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Conference on Signal Processing and Machine Learning","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3372806.3372811","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 2

Abstract

Although Convolutional Neural Networks are effective visual models that generate hierarchies of features, there still exist some shortcomings in the application of Deep Convolutional Neural Networks to semantic image segmentation. In this work, our algorithm incorporates multi-scale atrous convolution, attention model and Conditional Random Fields to tackle this problem. Firstly, our method replaces deconvolutional layers with atrous convolutional layers to avoid reducing feature resolution when the Deep Convolutional Neural Networks is employed in a fully convolutional fashion. Secondly, multi-scale architecture and attention model are used to extract the existence of features at multiple scales. Thirdly, we use Conditional Random Fields to prevent the built-in invariance of Deep Convolutional Neural Networks reducing localization accuracy. Moreover, our network completely integrates Conditional Random Fields modelling with Deep Convolutional Neural Networks, making it possible to train the deep network end-to-end. In this paper, our method is used to the matters of semantic image segmentation and is demonstrated the effectiveness of our model with experiments on PASCAL VOC 2012.

查看原文本刊更多论文

基于注意模型和条件随机场的多尺度深度卷积网络语义图像分割

虽然卷积神经网络是一种有效的生成特征层次的视觉模型，但是深度卷积神经网络在语义图像分割中的应用还存在一些不足。在这项工作中，我们的算法结合了多尺度亚特鲁斯卷积、注意模型和条件随机场来解决这个问题。首先，我们的方法用反卷积层代替反卷积层，以避免在以全卷积方式使用深度卷积神经网络时降低特征分辨率。其次，采用多尺度结构和注意模型提取多尺度特征的存在性;第三，我们使用条件随机场来防止深度卷积神经网络的内置不变性降低定位精度。此外，我们的网络完全集成了条件随机场建模和深度卷积神经网络，使得端到端训练深度网络成为可能。本文将该方法应用于语义图像分割问题，并在PASCAL VOC 2012上进行了实验，验证了该方法的有效性。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

International Conference on Signal Processing and Machine Learning

自引率

0.00%

发文量