Content adaptive video summarization using spatio-temporal features

2017 IEEE International Conference on Image Processing (ICIP). Pub Date: 2017-09-15. DOI: 10.1109/ICIP.2017.8297034

Hyunwoo Nam, C. Yoo

Citations: 4

Abstract

This paper proposes a video summarization method based on novel spatio-temporal features that combine motion magnitude, object class prediction, and saturation. Motion magnitude measures how much motion a video contains. Object class prediction provides information about the objects in a video. Saturation measures the colorfulness of a video. Convolutional neural networks (CNNs) are used for object class prediction. The normalized features are summed per shot, the shots are ranked by this sum in descending order, and the summary is formed from the highest-ranking shots. This ranking can also be conditioned on the object class, and the high-ranking shots for different object classes are also proposed as summaries of the input video. The method is evaluated on the SumMe dataset, and the results show that it outperforms the worst human summary and most other state-of-the-art video summarization methods.
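The ranking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how features are normalized or how object class prediction becomes a per-shot number, so the sketch assumes min-max normalization and a per-shot classifier confidence. All names here (`min_max_normalize`, `shot_scores`, `rank_shots`, `summarize`, and the feature keys) are hypothetical.

```python
from typing import Dict, List, Optional, Sequence


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Scale values to [0, 1] (assumed scheme); a constant feature maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def shot_scores(features: Dict[str, Sequence[float]]) -> List[float]:
    """Sum the normalized features of each shot."""
    normalized = [min_max_normalize(vals) for vals in features.values()]
    return [sum(per_shot) for per_shot in zip(*normalized)]


def rank_shots(
    features: Dict[str, Sequence[float]],
    labels: Optional[Sequence[str]] = None,
    target_class: Optional[str] = None,
) -> List[int]:
    """Return shot indices in descending score order.

    If target_class is given, only shots whose predicted label matches
    are ranked (one possible reading of class-conditioned ranking).
    """
    if target_class is not None and labels is None:
        raise ValueError("labels are required when target_class is set")
    scores = shot_scores(features)
    candidates = [
        i for i in range(len(scores))
        if target_class is None or labels[i] == target_class
    ]
    return sorted(candidates, key=lambda i: scores[i], reverse=True)


def summarize(features: Dict[str, Sequence[float]], num_shots: int) -> List[int]:
    """Pick the top-ranked shots and return them in temporal order."""
    return sorted(rank_shots(features)[:num_shots])


if __name__ == "__main__":
    # Made-up per-shot values for four shots, for illustration only.
    features = {
        "motion_magnitude": [0.2, 1.5, 0.9, 0.1],
        "object_confidence": [0.9, 0.4, 0.8, 0.3],
        "saturation": [0.5, 0.6, 0.7, 0.2],
    }
    labels = ["dog", "car", "dog", "car"]
    print(rank_shots(features))
    print(rank_shots(features, labels, target_class="car"))
    print(summarize(features, num_shots=2))
```

Normalizing each feature before summing keeps one feature with a large numeric range, such as motion magnitude, from dominating the score. The paper may weight or normalize the features differently.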
