Image Semantic Segmentation Based on Dilated Convolution and Multi-Layer Feature Fusion
J. Liu, Zhongliang Wu, Yang Hong, Guoyun Zhong, Meifeng Liu
2021 IEEE International Conference on Artificial Intelligence and Industrial Design (AIID), May 28, 2021
DOI: 10.1109/AIID51893.2021.9456560
At present, most image semantic segmentation methods are based on Fully Convolutional Networks (FCN). However, FCN loses image feature information during segmentation and handles the fine details of the output image poorly. We therefore adopt a ResNet as the basic encoder network, use dilated convolution to extract context information, and design a multi-scale feature fusion method in the decoder that makes full use of features from every level to enrich the representational ability of feature points, so that image pixels can be classified accurately. Extensive experiments demonstrate that our method shows superior performance over other methods on the PASCAL VOC2012 [10] validation dataset.
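To make the described encoder-decoder concrete, the following is a minimal PyTorch sketch, not the authors' exact network: a ResNet-50 backbone as the encoder, a small dilated-convolution block for context, and a decoder that fuses features from several encoder stages before per-pixel classification. The class name DilatedFusionSegNet, the choice of ResNet-50, the dilation rates, and all channel widths are illustrative assumptions rather than details taken from the paper.

# Minimal sketch of a dilated-convolution encoder with multi-layer feature fusion.
# Assumptions (not from the paper): ResNet-50 backbone, dilation rates 2 and 4,
# 256-channel fusion width, 21 output classes (PASCAL VOC2012).
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision

class DilatedFusionSegNet(nn.Module):
    def __init__(self, num_classes=21):
        super().__init__()
        backbone = torchvision.models.resnet50(weights=None)
        # Encoder stages: stem + layer1 (stride 4), layer2 (stride 8), layer3 (stride 16).
        self.stem = nn.Sequential(backbone.conv1, backbone.bn1, backbone.relu, backbone.maxpool)
        self.layer1, self.layer2, self.layer3 = backbone.layer1, backbone.layer2, backbone.layer3
        # Dilated convolutions enlarge the receptive field to gather context
        # without further downsampling the feature map.
        self.context = nn.Sequential(
            nn.Conv2d(1024, 256, kernel_size=3, padding=2, dilation=2, bias=False),
            nn.BatchNorm2d(256), nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, kernel_size=3, padding=4, dilation=4, bias=False),
            nn.BatchNorm2d(256), nn.ReLU(inplace=True),
        )
        # 1x1 projections bring each encoder level to a common width for fusion.
        self.proj1 = nn.Conv2d(256, 256, kernel_size=1)
        self.proj2 = nn.Conv2d(512, 256, kernel_size=1)
        self.classifier = nn.Conv2d(256, num_classes, kernel_size=1)

    def forward(self, x):
        size = x.shape[-2:]
        f1 = self.layer1(self.stem(x))        # stride 4, 256 channels
        f2 = self.layer2(f1)                  # stride 8, 512 channels
        f3 = self.context(self.layer3(f2))    # stride 16, dilated context features
        # Multi-layer fusion: upsample deeper features and add shallower ones.
        f3 = F.interpolate(f3, size=f2.shape[-2:], mode="bilinear", align_corners=False)
        fused = f3 + self.proj2(f2)
        fused = F.interpolate(fused, size=f1.shape[-2:], mode="bilinear", align_corners=False)
        fused = fused + self.proj1(f1)
        logits = self.classifier(fused)
        # Upsample per-pixel class logits back to the input resolution.
        return F.interpolate(logits, size=size, mode="bilinear", align_corners=False)

As a usage check, passing a tensor of shape (1, 3, 224, 224) through this sketch yields per-pixel logits of shape (1, 21, 224, 224); the fusion-by-addition step is one common way to realize multi-layer feature fusion, though the paper's decoder may combine levels differently.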