Attentive and Adversarial Learning for Video Summarization

2019 IEEE Winter Conference on Applications of Computer Vision (WACV). Pub Date: 2019-01-01. DOI: 10.1109/WACV.2019.00173

Tsu-Jui Fu, Shao-Heng Tai, Hwann-Tzong Chen

Citations: 53

Abstract

This paper aims to address the video summarization problem via attention-aware and adversarial training. We formulate the problem as a sequence-to-sequence task, where the input sequence is an original video and the output sequence is its summarization. We propose a GAN-based training framework, which combines the merits of unsupervised and supervised video summarization approaches. The generator is an attention-aware Ptr-Net that generates the cutting points of summarization fragments. The discriminator is a 3D CNN classifier to judge whether a fragment is from a ground-truth or a generated summarization. The experiments show that our method achieves state-of-the-art results on SumMe, TVSum, YouTube, and LoL datasets with 1.5% to 5.6% improvements. Our Ptr-Net generator can overcome the unbalanced training-test length in the seq2seq problem, and our discriminator is effective in leveraging unpaired summarizations to achieve better performance.
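The abstract does not give the generator's equations, so the following is only a minimal sketch of the general pointer-style attention idea behind a Ptr-Net: score every input position against the current decoder state, normalize with a softmax, and "point" at the highest-scoring position as a candidate cutting point. The function names, the additive scoring form, and the toy weights and features are all illustrative assumptions, not the authors' implementation.

```python
import math


def pointer_attention(encoder_states, decoder_state, W1, W2, v):
    """Return a probability distribution over input positions.

    Assumed additive scoring: u_i = v . tanh(W1 e_i + W2 d).
    This is a generic pointer-attention form, not taken from the paper.
    """
    def matvec(matrix, vector):
        return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

    projected_decoder = matvec(W2, decoder_state)
    scores = []
    for state in encoder_states:
        projected_encoder = matvec(W1, state)
        hidden = [math.tanh(a + b)
                  for a, b in zip(projected_encoder, projected_decoder)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))

    # Numerically stable softmax over the input positions.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_cut_point(probs):
    """Pick the input index with the highest pointer probability."""
    return max(range(len(probs)), key=probs.__getitem__)


# Toy example: four hypothetical 2-D frame features and identity weights.
identity = [[1.0, 0.0], [0.0, 1.0]]
frames = [[0.0, 0.0], [1.0, 1.0], [0.2, 0.1], [-1.0, -1.0]]
decoder_state = [0.5, 0.5]

probs = pointer_attention(frames, decoder_state, identity, identity, [1.0, 1.0])
cut_index = select_cut_point(probs)
```

Because the output is an index into the input sequence rather than a token from a fixed vocabulary, the same mechanism applies to videos of any length, which is consistent with the abstract's remark about unbalanced training-test lengths.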


Source publication

2019 IEEE Winter Conference on Applications of Computer Vision (WACV)

Self-citation rate

0.00%

Number of publications