{"title":"拉普拉斯注意:一种即插即用的算法,不会增加视觉任务的模型复杂度","authors":"Xiaolei Chen, Yubing Lu, Runyu Wen","doi":"10.1049/cit2.12402","DOIUrl":null,"url":null,"abstract":"<p>Most prevailing attention mechanism modules in contemporary research are convolution-based modules, and while these modules contribute to enhancing the accuracy of deep learning networks in visual tasks, they concurrently augment the overall model complexity. To address the problem, this paper proposes a plug-and-play algorithm that does not increase the complexity of the model, Laplacian attention (LA). The LA algorithm first calculates the similarity distance between feature points in the feature space and feature channel and constructs the residual Laplacian matrix between feature points through the similarity distance and Gaussian kernel. This construction serves to segregate non-similar feature points while aggregating those with similarities. Ultimately, the LA algorithm allocates the outputs of the feature channel and the feature space adaptively to derive the final LA outputs. Crucially, the LA algorithm is confined to the forward computation process and does not involve backpropagation or any parameter learning. The LA algorithm undergoes comprehensive experimentation on three distinct datasets—namely Cifar-10, miniImageNet, and Pascal VOC 2012. The experimental results demonstrate that, compared with the advanced attention mechanism modules in recent years, such as SENet, CBAM, ECANet, coordinate attention, and triplet attention, the LA algorithm exhibits superior performance across image classification, object detection and semantic segmentation tasks.</p>","PeriodicalId":46211,"journal":{"name":"CAAI Transactions on Intelligence Technology","volume":"10 2","pages":"545-556"},"PeriodicalIF":8.4000,"publicationDate":"2024-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12402","citationCount":"0","resultStr":"{\"title\":\"Laplacian attention: A plug-and-play algorithm without increasing model complexity for vision tasks\",\"authors\":\"Xiaolei Chen, Yubing Lu, Runyu Wen\",\"doi\":\"10.1049/cit2.12402\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Most prevailing attention mechanism modules in contemporary research are convolution-based modules, and while these modules contribute to enhancing the accuracy of deep learning networks in visual tasks, they concurrently augment the overall model complexity. To address the problem, this paper proposes a plug-and-play algorithm that does not increase the complexity of the model, Laplacian attention (LA). The LA algorithm first calculates the similarity distance between feature points in the feature space and feature channel and constructs the residual Laplacian matrix between feature points through the similarity distance and Gaussian kernel. This construction serves to segregate non-similar feature points while aggregating those with similarities. Ultimately, the LA algorithm allocates the outputs of the feature channel and the feature space adaptively to derive the final LA outputs. Crucially, the LA algorithm is confined to the forward computation process and does not involve backpropagation or any parameter learning. The LA algorithm undergoes comprehensive experimentation on three distinct datasets—namely Cifar-10, miniImageNet, and Pascal VOC 2012. 
The experimental results demonstrate that, compared with the advanced attention mechanism modules in recent years, such as SENet, CBAM, ECANet, coordinate attention, and triplet attention, the LA algorithm exhibits superior performance across image classification, object detection and semantic segmentation tasks.</p>\",\"PeriodicalId\":46211,\"journal\":{\"name\":\"CAAI Transactions on Intelligence Technology\",\"volume\":\"10 2\",\"pages\":\"545-556\"},\"PeriodicalIF\":8.4000,\"publicationDate\":\"2024-12-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12402\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"CAAI Transactions on Intelligence Technology\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://onlinelibrary.wiley.com/doi/10.1049/cit2.12402\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"CAAI Transactions on Intelligence Technology","FirstCategoryId":"94","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1049/cit2.12402","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Laplacian attention: A plug-and-play algorithm without increasing model complexity for vision tasks
Most attention modules in contemporary research are convolution-based; while they improve the accuracy of deep learning networks on visual tasks, they also increase overall model complexity. To address this problem, this paper proposes Laplacian attention (LA), a plug-and-play algorithm that does not increase model complexity. The LA algorithm first computes similarity distances between feature points along the spatial and channel dimensions, then uses these distances with a Gaussian kernel to construct a residual Laplacian matrix over the feature points. This construction separates dissimilar feature points while aggregating similar ones. Finally, the LA algorithm adaptively combines the channel and spatial outputs to derive the final LA output. Crucially, the LA algorithm is confined to the forward computation and involves no backpropagation or parameter learning. The LA algorithm is evaluated comprehensively on three datasets: Cifar-10, miniImageNet, and Pascal VOC 2012. The experimental results demonstrate that, compared with recent attention modules such as SENet, CBAM, ECANet, coordinate attention, and triplet attention, the LA algorithm achieves superior performance on image classification, object detection, and semantic segmentation tasks.
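The abstract describes the LA pipeline only in words, so below is a minimal, forward-only sketch of the idea in PyTorch: pairwise similarity distances, a Gaussian-kernel affinity, a normalized Laplacian applied in residual form, and a parameter-free fusion of the channel and spatial branches. The kernel width `sigma`, the exact residual form `I + L`, and the energy-based fusion rule are illustrative assumptions reconstructed from the abstract, not the authors' reference implementation.

```python
import torch


def _laplacian_reweight(feat, sigma):
    """Build a Gaussian-kernel graph over the rows of `feat` (B, N, D)
    and apply a residual normalized-Laplacian operator to them."""
    dist = torch.cdist(feat, feat) ** 2                # pairwise squared distances (B, N, N)
    adj = torch.exp(-dist / (2.0 * sigma ** 2))        # Gaussian-kernel affinity W
    d_inv_sqrt = adj.sum(-1).clamp_min(1e-8).rsqrt()   # D^{-1/2} as a vector (B, N)
    adj_norm = d_inv_sqrt.unsqueeze(-1) * adj * d_inv_sqrt.unsqueeze(-2)
    eye = torch.eye(feat.size(1), device=feat.device, dtype=feat.dtype)
    lap = eye - adj_norm                               # normalized graph Laplacian
    # Residual form (assumed): keeps each point while repelling dissimilar ones.
    return (eye + lap) @ feat


def laplacian_attention(x, sigma=1.0):
    """Forward-only Laplacian attention sketch for a (B, C, H, W) feature map.

    No learnable parameters: the graph operator is recomputed from the
    features themselves at every forward pass.
    """
    b, c, h, w = x.shape
    flat = x.reshape(b, c, h * w)
    # Channel branch: each of the C channels is a feature point in R^{HW}.
    chan = _laplacian_reweight(flat, sigma).reshape(b, c, h, w)
    # Spatial branch: each of the HW positions is a feature point in R^{C}.
    # Note: O((HW)^2) memory, so downsample large maps before this step.
    spat = _laplacian_reweight(flat.transpose(1, 2), sigma)
    spat = spat.transpose(1, 2).reshape(b, c, h, w)
    # Parameter-free adaptive fusion (assumed rule): weight each branch by
    # its mean energy, normalized with a per-sample softmax.
    energy = torch.stack([chan.pow(2).mean(dim=(1, 2, 3)),
                          spat.pow(2).mean(dim=(1, 2, 3))], dim=-1)
    wgt = torch.softmax(energy, dim=-1)                # (B, 2)
    return (wgt[:, 0].view(b, 1, 1, 1) * chan
            + wgt[:, 1].view(b, 1, 1, 1) * spat)


if __name__ == "__main__":
    x = torch.randn(2, 16, 8, 8)       # toy feature map
    y = laplacian_attention(x)
    print(y.shape)                     # torch.Size([2, 16, 8, 8])
```

Since the operator is built on the fly with no learned weights, such a module could in principle be dropped in after any convolution block (e.g. `y = laplacian_attention(conv_out)`) without changing the parameter count; gradients simply flow through the fixed graph operator.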
Journal introduction:
CAAI Transactions on Intelligence Technology is a leading venue for original research on the theoretical and experimental aspects of artificial intelligence technology. We are a fully open access journal co-published by the Institution of Engineering and Technology (IET) and the Chinese Association for Artificial Intelligence (CAAI), providing research that is openly accessible to read and share worldwide.