Attention Analysis in Caption Generation

2019 8th International Congress on Advanced Applied Informatics (IIAI-AAI). Pub Date: 2019-07-01. DOI: 10.1109/IIAI-AAI.2019.00029

Maaki Shozu, H. Yanagimoto

Citations: 1

Abstract

Caption generation is one of the fundamental tasks combining computer vision and natural language processing. To accomplish this task, neural networks are employed to implement caption generation systems. In this paper, we propose a caption generation system that combines a CNN-based object detection system with a recurrent neural network language model. In particular, the vector sent from the object detection system to the language model is generated using an attention mechanism. Attention visualization can help us understand which part of the input image the system focuses on when generating a caption. In the experiments, we evaluate the performance of the proposed system and discuss the effects of the attention mechanism on image captioning. Notably, attention contributes to improved caption generation, but the attention is uncorrelated with system interpretation.
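The abstract does not specify how the attention vector is computed, so the following is only a minimal sketch of one common choice, additive soft attention over image region features. The function name `soft_attention`, the parameters `W_f`, `W_h`, and `v`, and all dimensions are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def soft_attention(features, hidden, W_f, W_h, v):
    """Additive soft attention (illustrative sketch, not the paper's exact model).

    features: (R, D) array, one D-dim feature vector per image region
    hidden:   (H,) array, current state of the recurrent language model
    W_f:      (D, A) projection for region features (hypothetical parameter)
    W_h:      (H, A) projection for the hidden state (hypothetical parameter)
    v:        (A,) scoring vector (hypothetical parameter)
    """
    # Score each region against the current language-model state.
    scores = np.tanh(features @ W_f + hidden @ W_h) @ v   # shape (R,)
    # Softmax over regions, shifted by the max for numerical stability.
    scores = scores - scores.max()
    weights = np.exp(scores) / np.exp(scores).sum()       # shape (R,)
    # The context vector is the attention-weighted sum of region features;
    # this is the vector that would be passed to the language model.
    context = weights @ features                          # shape (D,)
    return context, weights
```

Under these assumptions, `weights` holds one non-negative value per image region and sums to 1, so reshaping it onto the region grid gives the kind of attention map that can be visualized over the input image.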

