Fusion Attention Network for Autonomous Cars Semantic Segmentation

Chuyao Wang, N. Aouf
{"title":"自动驾驶汽车语义分割的融合注意网络","authors":"Chuyao Wang, N. Aouf","doi":"10.1109/iv51971.2022.9827377","DOIUrl":null,"url":null,"abstract":"Semantic segmentation is vital for autonomous car scene understanding. It provides more precise subject information than raw RGB images and this, in turn, boosts the performance of autonomous driving. Recently, self-attention methods show great improvement in image semantic segmentation. Attention maps help scene parsing with abundant relationships of every pixel in an image. However, it is computationally demanding. Besides, existing works focus either on channel attention, ignoring the pixel position factors, or on spatial attention, disregarding the impacts of the channels on each other. To address these problems, we present Fusion Attention Network based on self-attention mechanism to harvest rich contextual dependencies. This model consists of two chains: pyramid fusion spatial attention and fusion channel attention. We apply pyramid sampling in the spatial attention module to reduce the computation for spatial attention maps. Channel attention has a similar structure to the spatial attention. We also introduce a fusion technique to calculate contextual dependencies using features from both attention chains. We concatenate the results from spatial and channel attention modules as the enhanced attention map, leading to better semantic segmentation results. We conduct extensive experiments on popular datasets with different settings in addition to an ablation study to prove the efficiency of our approach. Our model achieves better results, on Cityscapes [7], compared to state-of-the-art methods, and also show good generalization capability on PASCAL VOC 2012 [9].","PeriodicalId":184622,"journal":{"name":"2022 IEEE Intelligent Vehicles Symposium (IV)","volume":"111 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Fusion Attention Network for Autonomous Cars Semantic Segmentation\",\"authors\":\"Chuyao Wang, N. Aouf\",\"doi\":\"10.1109/iv51971.2022.9827377\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Semantic segmentation is vital for autonomous car scene understanding. It provides more precise subject information than raw RGB images and this, in turn, boosts the performance of autonomous driving. Recently, self-attention methods show great improvement in image semantic segmentation. Attention maps help scene parsing with abundant relationships of every pixel in an image. However, it is computationally demanding. Besides, existing works focus either on channel attention, ignoring the pixel position factors, or on spatial attention, disregarding the impacts of the channels on each other. To address these problems, we present Fusion Attention Network based on self-attention mechanism to harvest rich contextual dependencies. This model consists of two chains: pyramid fusion spatial attention and fusion channel attention. We apply pyramid sampling in the spatial attention module to reduce the computation for spatial attention maps. Channel attention has a similar structure to the spatial attention. We also introduce a fusion technique to calculate contextual dependencies using features from both attention chains. We concatenate the results from spatial and channel attention modules as the enhanced attention map, leading to better semantic segmentation results. 
We conduct extensive experiments on popular datasets with different settings in addition to an ablation study to prove the efficiency of our approach. Our model achieves better results, on Cityscapes [7], compared to state-of-the-art methods, and also show good generalization capability on PASCAL VOC 2012 [9].\",\"PeriodicalId\":184622,\"journal\":{\"name\":\"2022 IEEE Intelligent Vehicles Symposium (IV)\",\"volume\":\"111 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-06-05\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 IEEE Intelligent Vehicles Symposium (IV)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/iv51971.2022.9827377\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE Intelligent Vehicles Symposium (IV)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/iv51971.2022.9827377","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 2

Abstract

Semantic segmentation is vital for autonomous car scene understanding. It provides more precise subject information than raw RGB images, which in turn boosts the performance of autonomous driving. Recently, self-attention methods have shown great improvement in image semantic segmentation. Attention maps help scene parsing by capturing the abundant relationships among all pixels in an image. However, they are computationally demanding. Moreover, existing works focus either on channel attention, ignoring pixel position factors, or on spatial attention, disregarding the impact of the channels on each other. To address these problems, we present a Fusion Attention Network based on the self-attention mechanism to harvest rich contextual dependencies. The model consists of two chains: pyramid fusion spatial attention and fusion channel attention. We apply pyramid sampling in the spatial attention module to reduce the computation of spatial attention maps. The channel attention has a structure similar to the spatial attention. We also introduce a fusion technique that calculates contextual dependencies using features from both attention chains. We concatenate the results from the spatial and channel attention modules into an enhanced attention map, leading to better semantic segmentation results. We conduct extensive experiments on popular datasets with different settings, in addition to an ablation study, to prove the efficiency of our approach. Our model achieves better results on Cityscapes [7] compared to state-of-the-art methods, and also shows good generalization capability on PASCAL VOC 2012 [9].
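To make the pyramid-sampling idea concrete, below is a minimal PyTorch sketch of a spatial self-attention module whose keys and values are pooled to a few small pyramid grids before the affinity matrix is formed. The class name, pool sizes, and 1×1 projections are illustrative assumptions on our part, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PyramidSpatialAttention(nn.Module):
    """Spatial self-attention with pyramid-pooled keys/values.

    Instead of attending over all H*W positions (an (HW x HW) affinity),
    keys and values are adaptively pooled to a few small grids, so the
    affinity matrix is only (HW x S), with S the total pooled positions.
    Hypothetical sketch -- not the authors' code.
    """

    def __init__(self, channels, pool_sizes=(1, 3, 6, 8)):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.key = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        self.pool_sizes = pool_sizes
        self.gamma = nn.Parameter(torch.zeros(1))  # learnable residual weight

    def _pyramid(self, feat):
        # Pool to each pyramid grid and flatten the spatial positions.
        pooled = [F.adaptive_avg_pool2d(feat, s).flatten(2) for s in self.pool_sizes]
        return torch.cat(pooled, dim=2)                # (B, C', S)

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)   # (B, HW, C/8)
        k = self._pyramid(self.key(x))                 # (B, C/8, S)
        v = self._pyramid(self.value(x))               # (B, C, S)
        attn = torch.softmax(q @ k, dim=-1)            # (B, HW, S)
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        return self.gamma * out + x                    # residual connection
```

With the assumed pool sizes (1, 3, 6, 8), S = 1 + 9 + 36 + 64 = 110 regardless of input resolution, so the affinity cost grows linearly in HW rather than quadratically.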
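Continuing the sketch above, a channel-attention chain can compute a C × C affinity over the flattened spatial maps, after which the two enhanced maps are fused by concatenation and projected back to the original channel count. The 1×1 fusion convolution is an assumed stand-in for the paper's fusion technique.

```python
class ChannelAttention(nn.Module):
    """Channel self-attention: a C x C affinity over flattened spatial maps.
    Mirrors the structure of the spatial chain, as the abstract describes.
    Hypothetical sketch -- not the authors' code."""

    def __init__(self):
        super().__init__()
        self.gamma = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        b, c, h, w = x.shape
        flat = x.flatten(2)                                        # (B, C, HW)
        attn = torch.softmax(flat @ flat.transpose(1, 2), dim=-1)  # (B, C, C)
        out = (attn @ flat).view(b, c, h, w)
        return self.gamma * out + x

class FusionAttentionHead(nn.Module):
    """Concatenate the enhanced maps from both chains and project back."""

    def __init__(self, channels):
        super().__init__()
        self.spatial = PyramidSpatialAttention(channels)
        self.channel = ChannelAttention()
        self.project = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, x):
        fused = torch.cat([self.spatial(x), self.channel(x)], dim=1)
        return self.project(fused)

# Example: a backbone feature map, e.g. 512 channels at 1/8 resolution.
feats = torch.randn(2, 512, 64, 128)
head = FusionAttentionHead(512)
print(head(feats).shape)  # torch.Size([2, 512, 64, 128])
```

Pooling only the keys and values (never the queries) keeps the output at full feature resolution, which is what lets such a module slot into a segmentation head without extra upsampling.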