A Multi-scale Contextual Attention Mechanism For Convolutional Neural Networks

Yun Xie, Chanting Cao, MingChao Liao, Yao Yu
{"title":"A Multi-scale Contextual Attention Mechanism For Convolutional Neural Networks","authors":"Yun Xie, Chanting Cao, MingChao Liao, Yao Yu","doi":"10.1109/yac57282.2022.10023920","DOIUrl":null,"url":null,"abstract":"In recent years, attention mechanism has been widely studied in the field of computer vision, which can effectively improve the performance of visual tasks. In the past, many classical attention models have studied the modeling of nonlinear relationships in the spatial or channel dimensions of feature maps, ignoring the use of contextual relationships to capture the information interaction of the three dimensions to obtain a global attention feature map. In this paper, we investigate an effective multi-scale contextual attention mechanism, which can obtain feature information of different receptive fields through the combination of multi-branch conventional convolution and dilated convolution, which can increase the image receptive field, and combine global features and detailed features to effectively use contextual information. In addition, since the input tensors interact with each other on the three dimensions of the feature map and was adjusted by an adaptive parameter, this also makes the three-dimensional attention weights we obtain more differentiated. Our MCA model is simple and can be flexibly embedded in a variety of classical backbone networks, and experimental evaluation of the proposed attention mechanism on common datasets for image classification and object detection also proves the effectiveness of our attention meachine.","PeriodicalId":272227,"journal":{"name":"2022 37th Youth Academic Annual Conference of Chinese Association of Automation (YAC)","volume":"61 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 37th Youth Academic Annual Conference of Chinese Association of Automation (YAC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/yac57282.2022.10023920","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

In recent years, the attention mechanism has been widely studied in computer vision, where it can effectively improve the performance of visual tasks. Many classical attention models focus on modeling nonlinear relationships along the spatial or channel dimensions of feature maps, ignoring the use of contextual relationships to capture the interaction among all three dimensions and obtain a global attention feature map. In this paper, we investigate an effective multi-scale contextual attention mechanism that obtains feature information from different receptive fields through a combination of multi-branch conventional convolution and dilated convolution, which enlarges the receptive field and combines global and detailed features to make effective use of contextual information. In addition, since the input tensors interact across the three dimensions of the feature map and are adjusted by an adaptive parameter, the resulting three-dimensional attention weights are more differentiated. Our MCA model is simple and can be flexibly embedded in a variety of classical backbone networks, and experimental evaluation of the proposed attention mechanism on common datasets for image classification and object detection also demonstrates its effectiveness.
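The abstract does not give the exact architecture, but the following is a minimal PyTorch sketch of a multi-scale contextual attention block in the spirit it describes: multi-branch standard and dilated convolutions gather context at different receptive fields, the fused context is turned into a full channel-by-spatial attention map, and a learnable scalar adaptively scales the reweighting. The branch count, dilation rates, sigmoid gating, and residual form are illustrative assumptions, not the authors' exact design.

```python
import torch
import torch.nn as nn


class MultiScaleContextualAttention(nn.Module):
    """Sketch of an MCA-style block; details are assumptions for illustration."""

    def __init__(self, channels: int, dilations=(1, 2, 4)):
        super().__init__()
        # One 3x3 branch per dilation rate; dilation=1 is an ordinary convolution.
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, kernel_size=3,
                      padding=d, dilation=d, bias=False)
            for d in dilations
        )
        # 1x1 convolution fuses the multi-scale branches back to the input width.
        self.fuse = nn.Conv2d(channels * len(dilations), channels,
                              kernel_size=1, bias=False)
        # Adaptive parameter controlling how strongly the attention reweights x.
        self.gamma = nn.Parameter(torch.zeros(1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Collect contextual features from several receptive fields.
        context = torch.cat([branch(x) for branch in self.branches], dim=1)
        # Three-dimensional (C x H x W) attention weights via a sigmoid gate.
        attention = torch.sigmoid(self.fuse(context))
        # Residual-style reweighting so the block drops into existing backbones.
        return x + self.gamma * (attention * x)


if __name__ == "__main__":
    block = MultiScaleContextualAttention(channels=64)
    feat = torch.randn(2, 64, 32, 32)
    print(block(feat).shape)  # torch.Size([2, 64, 32, 32])
```

Because the block preserves the input shape and starts with gamma initialized to zero, it can be inserted after a convolutional stage of a classical backbone without disturbing the pretrained behavior at initialization.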