Semantic segmentation of road scene based on multi-scale feature extraction and deep supervision
Longfei Wang, Chunman Yan
International Conference on Digital Image Processing, published 2022-10-12
DOI: 10.1117/12.2644695
Citations: 2
Abstract
The traditional U-Net model suffers from inaccurate segmentation edges, poor adaptability to multi-scale road targets, and a tendency toward false and missed segmentation when road targets are subject to varied and changing occlusions. To address these problems, a semantic segmentation model for road scenes based on multi-scale feature extraction and a deep supervision module is proposed. First, a dual attention module is embedded in the U-Net encoder, enabling the model to capture global context information along both the channel and spatial dimensions and to enhance road features. Second, before upsampling, the feature map containing high-level semantic information is fed into an ASPP (atrous spatial pyramid pooling) module to obtain road features at different scales. Finally, a deep supervision module is introduced into the upsampling path to learn feature representations at different levels and retain more road detail. Experiments are carried out on the CamVid and Cityscapes datasets. The results show that the proposed network effectively segments road targets of different scales and produces more complete and clearer road contours, improving semantic segmentation accuracy while maintaining a reasonable segmentation speed.
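The multi-scale behavior of the ASPP module comes from atrous (dilated) convolution: the same kernel samples the input at different strides, enlarging the receptive field without adding parameters. A minimal 1-D sketch of this building block follows; it is an illustration of the general technique, not the authors' implementation, and the kernel values are hypothetical.

```python
import numpy as np

def dilated_conv1d(x, kernel, rate):
    """1-D dilated (atrous) convolution with 'valid' padding.

    The effective receptive field is (len(kernel) - 1) * rate + 1,
    so larger rates cover wider context with the same kernel size.
    """
    k = len(kernel)
    span = (k - 1) * rate + 1          # receptive-field width
    out_len = len(x) - span + 1
    return np.array([
        sum(kernel[j] * x[i + j * rate] for j in range(k))
        for i in range(out_len)
    ])

x = np.arange(10, dtype=float)          # toy 1-D "feature map"
k = np.ones(3)                          # hypothetical 3-tap kernel
y1 = dilated_conv1d(x, k, rate=1)       # sees 3 consecutive samples
y2 = dilated_conv1d(x, k, rate=2)       # same kernel spans 5 samples
```

An ASPP head applies several such convolutions in parallel (e.g. rates 1, 6, 12, 18 in 2-D) and concatenates the outputs, so fine edges and wide road regions are captured at once.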