Video Summarization Using Feature Vector Clustering

Y. Dhamecha, S. Gadekara, S. Deshmukh, Y. Haribhakta
DOI: 10.2139/ssrn.3734732
Journal: EngRN: Engineering Design Process (Topic)
Published: 2020-11-21 (Journal Article)
Citations: 2

Abstract

With the ever-growing use of online and offline video and the steady increase in video content, video summarization serves as a valuable aid to video browsing. It requires domain-specific semantic comprehension of a video and an understanding of user expectations. Video summarization systems generally extract video features, analyze visual variations, and select representative frames. Over the years, various supervised and unsupervised algorithms have been developed for this task, with models trained on different factors or reward signals. The challenges these methods face motivate the approach this paper discusses: in many cases, summary frames are repeated when a scene or concept appears more than once. This paper presents a novel approach that clusters video frames by their feature vectors, taking the semantic content of the frames into account. Each concept cluster yields a representative frame, and these frames form the summary set. Here, a concept cluster refers to an independent entity present in a video that can be easily distinguished from other concepts or entities; such an entity can be, for example, a mountain scene or a particular person. The approach also aims to improve system performance by removing redundancy. The system uses a CNN for feature extraction and a clustering algorithm based on the similarity between the feature vectors. The model is evaluated on precision and recall using the VSUMM dataset, and the results outperform some established methodologies while serving the summarization purpose.
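The pipeline the abstract describes — cluster frames by feature-vector similarity, take one representative frame per concept cluster, and score the summary with precision and recall — can be sketched as follows. This is a minimal illustration, not the paper's exact algorithm: the similarity threshold, the greedy clustering scheme, and the use of synthetic vectors in place of real CNN features are all assumptions made for the sake of a runnable example.

```python
import numpy as np

def cluster_frames(features, threshold=0.85):
    """Greedy concept clustering by cosine similarity: each frame joins the
    first cluster whose running mean it is similar enough to, otherwise it
    starts a new cluster. `features` is an (n_frames, dim) array of CNN
    feature vectors (synthetic here; a real system would extract them)."""
    # Normalize rows so a dot product equals cosine similarity.
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    clusters = []   # list of lists of frame indices
    reps = []       # running (unnormalized) mean vector per cluster
    for i, v in enumerate(feats):
        assigned = False
        for members, rep in zip(clusters, reps):
            if np.dot(v, rep / np.linalg.norm(rep)) >= threshold:
                members.append(i)
                rep += v          # update the cluster's running mean direction
                assigned = True
                break
        if not assigned:
            clusters.append([i])
            reps.append(v.copy())
    return clusters

def summary_frames(features, clusters):
    """Pick one representative frame per cluster: the member closest to
    the cluster centroid. The sorted picks form the summary set."""
    picks = []
    for members in clusters:
        centroid = features[members].mean(axis=0)
        dists = np.linalg.norm(features[members] - centroid, axis=1)
        picks.append(members[int(np.argmin(dists))])
    return sorted(picks)

def precision_recall(predicted, ground_truth):
    """Standard set-based precision/recall between the predicted summary
    frames and user-annotated ground-truth frames (as in VSUMM-style
    evaluation)."""
    tp = len(set(predicted) & set(ground_truth))
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(ground_truth) if ground_truth else 0.0
    return precision, recall
```

As a toy run, four 4-dimensional "frames" drawn from two distinct concepts cluster into two groups, and one frame per group forms the summary:

```python
feats = np.array([[1.00, 0.00, 0, 0],   # concept A
                  [0.98, 0.02, 0, 0],   # concept A (near-duplicate)
                  [0.00, 1.00, 0, 0],   # concept B
                  [0.01, 0.99, 0, 0]])  # concept B (near-duplicate)
clusters = cluster_frames(feats)        # [[0, 1], [2, 3]]
summary = summary_frames(feats, clusters)
```

Because near-duplicate frames fall into the same concept cluster and only one representative is kept, the redundancy the abstract mentions is removed by construction.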