BiSeNet with Depthwise Attention Spatial Path for Semantic Segmentation

2022 International Workshop on Intelligent Systems (IWIS) Pub Date: 2022-08-17 DOI: 10.1109/IWIS56333.2022.9920717

S. Kim, Kanghyun Jo


Citations: 1

Abstract

This paper proposes a new structure that reduces the computational cost of BiSeNet for real-time semantic segmentation while obtaining similar results. Of BiSeNet's two branches, the Spatial Path and the Context Path, the study focuses on the large kernels of the Spatial Path. The Spatial Path preserves rich spatial information by producing a feature map 1/8 the size of the original image through three convolution operations, applied in the order 7×7, 3×3, and 3×3. When a standard convolution is used with such a large kernel, the computational cost increases because of the large number of parameters. To solve this problem, this paper uses depthwise separable convolution. Depthwise separable convolution, however, loses spatial information. To compensate for this loss, an attention mechanism [1] is applied by element-wise summation of the input and output feature maps of the depthwise separable convolution. To resolve the dimensional difference between input and output, a Pooling Pointwise Module (PPM) is used. PPM changes the spatial dimension of the input features with max pooling and the channel dimension with pointwise convolution (1×1 convolution) [2]. Combining these methods, this paper proposes the Depthwise Attention Spatial Path for BiSeNet. With the proposed methods, the mIoU in SS, SSC, MSF, and MSCF was 72.7%, 74.1%, 74.3%, and 76.1%, respectively. With the Depthwise Attention Spatial Path, the proposed network can segment parts that the original network cannot.
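The parameter saving and the shape matching described in the abstract can be illustrated with a short calculation. The sketch below is not the authors' implementation. The channel widths (3→64→64→64), the stride of 2 per stage, the padding that halves each spatial side exactly, and the 512×1024 input are all assumptions chosen for illustration; the abstract gives only the kernel sizes (7×7, 3×3, 3×3) and the 1/8 output scale. Bias and normalization parameters are ignored.

```python
# Illustrative arithmetic only; channel widths, strides and input size are assumptions.

def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def dw_separable_params(c_in, c_out, k):
    """Depthwise k x k conv (one filter per input channel) plus pointwise 1 x 1 conv."""
    return c_in * k * k + c_in * c_out


def ppm_params(c_in, c_out):
    """Pooling Pointwise Module: max pooling has no weights, only the 1 x 1 conv does."""
    return c_in * c_out


def halve(shape, c_out, stride=2):
    """Output shape (C, H, W) of a stride-2 stage, assuming padding that halves H and W."""
    _, h, w = shape
    return (c_out, h // stride, w // stride)


# (c_in, c_out, kernel) per stage; kernel sizes follow the abstract, channels are assumed.
LAYERS = [(3, 64, 7), (64, 64, 3), (64, 64, 3)]


def spatial_path_summary(input_shape=(3, 512, 1024)):
    """Return (standard params, separable params, PPM shortcut params, output shape)."""
    shape = input_shape
    std_total = sep_total = ppm_total = 0
    for c_in, c_out, k in LAYERS:
        std_total += conv_params(c_in, c_out, k)
        sep_total += dw_separable_params(c_in, c_out, k)
        ppm_total += ppm_params(c_in, c_out)
        main = halve(shape, c_out)       # depthwise separable branch
        shortcut = halve(shape, c_out)   # PPM branch: max pool resizes, 1 x 1 conv sets channels
        assert main == shortcut          # element-wise sum needs identical shapes
        shape = main
    return std_total, sep_total, ppm_total, shape


if __name__ == "__main__":
    std, sep, ppm, out_shape = spatial_path_summary()
    print("standard conv params:      ", std)
    print("depthwise separable params:", sep)
    print("PPM shortcut params:       ", ppm)
    print("output shape (C, H, W):    ", out_shape)
```

Under these assumptions the 7×7 stage gains the most from the separable form, because the standard convolution's weight count scales with the product of kernel area and output channels. The PPM shortcut adds its own 1×1 weights, so it is counted separately here.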

