Object-Difference Attention: A Simple Relational Attention for Visual Question Answering

Chenfei Wu, Jinlai Liu, Xiaojie Wang, Xuan Dong
Proceedings of the 26th ACM international conference on Multimedia, published 2018-10-15
DOI: 10.1145/3240508.3240513 (https://doi.org/10.1145/3240508.3240513)
Citations: 36

Abstract

Attention mechanisms have greatly advanced Visual Question Answering (VQA). The attention distribution, which weights the objects in an image (such as image regions or bounding boxes) according to their importance for answering a question, plays a crucial role in these mechanisms. Most existing work computes the attention distribution by fusing image features with text features, without comparing different image objects. Yet selectivity, a major property of attention, depends on comparisons between objects, and such comparisons provide more information for assigning attention. To exploit this, we propose object-difference attention (ODA), which computes the attention probabilities by applying a difference operator between the objects in an image under the guidance of the question at hand. Experimental results on three publicly available datasets show that our ODA-based VQA model achieves state-of-the-art results. Furthermore, we propose a general form of relational attention and, besides ODA, give several other relational attentions. Experimental results show that these relational attentions have strengths on different types of questions.
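The abstract does not state the ODA formula, so the following is only a minimal NumPy sketch of the idea it describes: scoring each object by its question-guided differences from the other objects. The element-wise gating by the question, the summation over compared objects, the projection vector `w`, and all shapes and names are illustrative assumptions, not the authors' definition.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    exp = np.exp(x - np.max(x))
    return exp / exp.sum()


def object_difference_attention(objects, question, w):
    """Attend over image objects using question-guided pairwise differences.

    objects:  (n, d) array, one feature vector per image object.
    question: (d,) array, question embedding in the same space (assumption).
    w:        (d,) array, hypothetical learned projection to a scalar score.
    """
    # diffs[i, j] = objects[i] - objects[j], shape (n, n, d).
    diffs = objects[:, None, :] - objects[None, :, :]
    # Question guidance (assumed form): gate each difference element-wise.
    guided = diffs * question
    # Pool the comparisons of object i against every object j.
    pooled = guided.sum(axis=1)      # (n, d)
    scores = pooled @ w              # (n,)
    attention = softmax(scores)      # attention probabilities over objects
    attended = attention @ objects   # (d,) attention-weighted image feature
    return attention, attended


# Toy inputs: 5 objects with 8-dimensional features, fixed seed.
rng = np.random.default_rng(0)
toy_objects = rng.normal(size=(5, 8))
toy_question = rng.normal(size=8)
toy_w = rng.normal(size=8)
toy_attention, toy_attended = object_difference_attention(
    toy_objects, toy_question, toy_w
)
print(toy_attention, toy_attended.shape)
```

In this sketch, identical objects produce all-zero differences and therefore uniform attention, which matches the abstract's point that selectivity comes from comparisons between objects. Replacing the subtraction with another pairwise operator would give a different relational attention of the same shape, although the specific variants studied in the paper are not listed in the abstract.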