Self Supervision for Attention Networks

Badri N. Patro, Vinay P. Namboodiri
{"title":"Self Supervision for Attention Networks","authors":"Badri N. Patro, Vinay P. Namboodiri","doi":"10.1109/WACV48630.2021.00077","DOIUrl":null,"url":null,"abstract":"In recent years, the attention mechanism has become a fairly popular concept and has proven to be successful in many machine learning applications. However, deep learning models do not employ supervision for these attention mechanisms which can improve the model’s performance significantly. Therefore, in this paper, we tackle this limitation and propose a novel method to improve the attention mechanism by inducing \"self-supervision\". We devise a technique to generate desirable attention maps for any model that utilizes an attention module. This is achieved by examining the model’s output for different regions sampled from the input and obtaining the attention probability distributions that enhance the proficiency of the model. The attention distributions thus obtained are used for supervision. We rely on the fact, that attenuation of the unimportant parts, allows a model to attend to more salient regions, thus strengthening the prediction accuracy. The quantitative and qualitative results published in this paper show that this method successfully improves the attention mechanism as well as the model’s accuracy. In addition to the task of Visual Question Answering(VQA), we also show results on the task of Image classification and Text classification to prove that our method can be generalized to any vision and language model that uses an attention module.","PeriodicalId":236300,"journal":{"name":"2021 IEEE Winter Conference on Applications of Computer Vision (WACV)","volume":"17 4","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE Winter Conference on Applications of Computer Vision (WACV)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/WACV48630.2021.00077","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

Abstract

In recent years, the attention mechanism has become a popular concept and has proven successful in many machine learning applications. However, deep learning models typically do not employ supervision for these attention mechanisms, even though such supervision can improve a model's performance significantly. Therefore, in this paper, we tackle this limitation and propose a novel method to improve the attention mechanism by inducing "self-supervision". We devise a technique to generate desirable attention maps for any model that utilizes an attention module. This is achieved by examining the model's output for different regions sampled from the input and obtaining the attention probability distributions that enhance the model's proficiency. The attention distributions thus obtained are used for supervision. We rely on the fact that attenuating the unimportant parts allows a model to attend to more salient regions, thus strengthening the prediction accuracy. The quantitative and qualitative results presented in this paper show that this method successfully improves the attention mechanism as well as the model's accuracy. In addition to the task of Visual Question Answering (VQA), we also show results on the tasks of image classification and text classification to demonstrate that our method can be generalized to any vision and language model that uses an attention module.
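To make the procedure described above concrete, the following is a minimal PyTorch-style sketch, not the authors' released code, of one way such attention targets could be derived by occlusion-style region sampling: each region is kept in isolation, the model's confidence in the ground-truth answer is recorded, and the normalized scores form the supervision target. The `model(regions, question)` signature, the single-region masking scheme, and the KL-divergence supervision loss are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def attention_targets(model, regions, question, answer_idx):
    """Derive a target attention distribution by sampling regions.

    regions:    (N, R, D) per-region input features (e.g., image grid cells).
    answer_idx: (N,) ground-truth class/answer indices.
    Returns:    (N, R) probability distribution over regions.
    """
    N, R, _ = regions.shape
    scores = torch.zeros(N, R)
    with torch.no_grad():
        for r in range(R):
            mask = torch.zeros(N, R, 1)
            mask[:, r] = 1.0                          # keep only region r, attenuate the rest
            logits = model(regions * mask, question)  # assumed model signature
            probs = F.softmax(logits, dim=-1)
            # how confidently the model predicts the true answer from region r alone
            scores[:, r] = probs.gather(1, answer_idx.unsqueeze(1)).squeeze(1)
    return F.softmax(scores, dim=-1)                  # normalize scores into a distribution

def attention_supervision_loss(pred_attention, target_attention):
    """KL divergence pushing the model's attention toward the derived target."""
    return F.kl_div(pred_attention.clamp_min(1e-8).log(),
                    target_attention, reduction="batchmean")
```

In training, this supervision term would be added to the task loss with a weighting hyperparameter, so that the attention module learns to concentrate on the regions that actually drive correct predictions.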