Affective Visual Question Answering Network

Nelson Ruwa, Qi-rong Mao, Liangjun Wang, Ming Dong
{"title":"Affective Visual Question Answering Network","authors":"Nelson Ruwa, Qi-rong Mao, Liangjun Wang, Ming Dong","doi":"10.1109/MIPR.2018.00038","DOIUrl":null,"url":null,"abstract":"Visual Question Answering (VQA) has recently attracted considerable attention from researchers in the trending field of deep learning. The need to improve VQA models by focusing on local regions of images, has resulted in the development of various attention models. This paper proposes the Affective Visual Question Answering Network (AVQAN), an attention model that combines the locality of the image features, the question and the mood detected from the specific region of the image to produce an affective answer using a preprocessed image dataset. The experimental results depict that AVQAN enriches the analysis and understanding of images by adding affective information to the answer, while still managing to maintain the accuracy levels within the range of recent ordinary VQA baseline models. The proposed model significantly contributes towards the development of rapidly improving emotion-aware machines that are becoming increasingly vital in everyday life.","PeriodicalId":320000,"journal":{"name":"2018 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"58 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"11","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/MIPR.2018.00038","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 11

Abstract

Visual Question Answering (VQA) has recently attracted considerable attention from researchers in the rapidly growing field of deep learning. The need to improve VQA models by focusing on local regions of images has led to the development of various attention models. This paper proposes the Affective Visual Question Answering Network (AVQAN), an attention model that combines local image features, the question, and the mood detected in the attended region of the image to produce an affective answer, using a preprocessed image dataset. The experimental results show that AVQAN enriches the analysis and understanding of images by adding affective information to the answer, while maintaining accuracy within the range of recent ordinary VQA baseline models. The proposed model contributes to the development of emotion-aware machines, which are becoming increasingly important in everyday life.
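The abstract describes the model only at a high level: question-guided attention over local image regions, with a mood signal fused into the answer features. The paper's actual architecture, dimensions, and scoring function are not given here, so the following is a minimal sketch under assumed shapes; `affective_attention` and its projection vector `w` are hypothetical names, not the authors' implementation.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def affective_attention(regions, question, mood, w):
    """Toy question-guided soft attention over image regions,
    fusing a mood vector into the final answer representation.

    regions:  (R, D) local image features, one row per region
    question: (D,)   question embedding
    mood:     (M,)   mood/affect embedding for the attended region
    w:        (D,)   projection vector used to score each region
    """
    # Score each region by its elementwise match with the question.
    scores = (regions * question) @ w      # (R,)
    alpha = softmax(scores)                # attention weights, sum to 1
    attended = alpha @ regions             # (D,) attention-weighted image feature
    # Fuse visual, linguistic, and affective information for the answer head.
    fused = np.concatenate([attended, question, mood])
    return alpha, fused

rng = np.random.default_rng(0)
alpha, fused = affective_attention(
    regions=rng.standard_normal((4, 8)),
    question=rng.standard_normal(8),
    mood=rng.standard_normal(5),
    w=np.ones(8),
)
```

In a full model the fused vector would feed a classifier over affective answers; here it simply illustrates how the three information sources described in the abstract can be combined.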