Motion Based Video Skimming

I. Alam, Devesh Jalan, Priti Shaw, Partha Pratim Mohanta
{"title":"Motion Based Video Skimming","authors":"I. Alam, Devesh Jalan, Priti Shaw, Partha Pratim Mohanta","doi":"10.1109/CALCON49167.2020.9106488","DOIUrl":null,"url":null,"abstract":"Automatic video summarization is a sustainable method that provides efficient browsing and searching mechanism for long videos. Video skimming is one of the popular ways to represent a summary of a full-length video. This work describes an unsupervised technique that automatically extracts the important clips from an input video and generates a summarized version of that video. The proposed scheme of video skimming is composed of three parts: extraction of motion based features, selection of important clips, detection, and removal of the shot boundary, if any, within a clip. Each frame is represented by a 32-dimensional feature vector that is generated using the slope and magnitude of the motion vectors. A set of representative frames of the entire video is obtained using the k-means clustering followed by the Maximal Spanning Tree (MxST). These representative frames are the center point of the clips to be generated. A window is considered around these representative frames and the clip is formed. A shot boundary may exist within the clip. To detect such a shot boundary, a method is proposed considering the variations present in the pixel intensities of the frames of a clip. The variation among the frames is captured using the standard deviation of the distribution of the pixel intensities. The clips are reformed in case the boundary is detected. Finally, the skim is generated by concatenating extracted video clips in a sequential manner. The obtained video summaries are concise and the proper representation of the input videos. The experiment is performed on two benchmark datasets namely SumMe and TVSum. 
Experimental results show that the proposed method outperforms the state-of-the-art methods.","PeriodicalId":318478,"journal":{"name":"2020 IEEE Calcutta Conference (CALCON)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE Calcutta Conference (CALCON)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CALCON49167.2020.9106488","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

Abstract

Automatic video summarization provides an efficient browsing and searching mechanism for long videos. Video skimming is one of the popular ways to represent a summary of a full-length video. This work describes an unsupervised technique that automatically extracts the important clips from an input video and generates a summarized version of that video. The proposed video-skimming scheme is composed of three parts: extraction of motion-based features, selection of important clips, and detection and removal of any shot boundary within a clip. Each frame is represented by a 32-dimensional feature vector generated from the slope and magnitude of its motion vectors. A set of representative frames of the entire video is obtained using k-means clustering followed by a Maximal Spanning Tree (MxST). These representative frames serve as the center points of the clips to be generated: a window is placed around each representative frame to form a clip. A shot boundary may exist within a clip. To detect such a boundary, a method is proposed that considers the variation in the pixel intensities of a clip's frames; this variation is captured by the standard deviation of the pixel-intensity distribution. A clip is re-formed when a boundary is detected. Finally, the skim is generated by concatenating the extracted clips in sequential order. The obtained summaries are concise and properly representative of the input videos. Experiments on two benchmark datasets, SumMe and TVSum, show that the proposed method outperforms state-of-the-art methods.
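Two of the steps above can be sketched in code: building a 32-dimensional descriptor from the slope (orientation) and magnitude of a frame's motion vectors, and flagging a shot boundary from a jump in the per-frame standard deviation of pixel intensities. The abstract does not specify the bin layout, the magnitude split, or the boundary threshold, so the 16-orientation × 2-magnitude-band layout, the `mag_split` value, and the z-score test below are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

def motion_histogram(vectors, n_angle_bins=16, n_mag_bins=2, mag_split=2.0):
    """32-dim descriptor for one frame from its (dx, dy) motion vectors.

    Assumed layout: 16 orientation bins x 2 magnitude bands, L1-normalized.
    """
    angles = np.arctan2(vectors[:, 1], vectors[:, 0])            # slope, in [-pi, pi]
    mags = np.linalg.norm(vectors, axis=1)                       # magnitude
    a_idx = np.minimum(((angles + np.pi) / (2 * np.pi) * n_angle_bins).astype(int),
                       n_angle_bins - 1)
    m_idx = (mags >= mag_split).astype(int)                      # low/high magnitude band
    hist = np.zeros(n_angle_bins * n_mag_bins)
    np.add.at(hist, a_idx * n_mag_bins + m_idx, 1.0)             # unbuffered accumulation
    return hist / max(hist.sum(), 1e-9)

def shot_boundaries(frames, z_thresh=2.5):
    """Candidate boundaries in a clip of grayscale frames, shape (T, H, W).

    Per the abstract, each frame is summarized by the standard deviation of its
    pixel-intensity distribution; a frame index is flagged when that statistic
    changes sharply relative to the clip's typical frame-to-frame change
    (z-score test is an assumption for illustration).
    """
    stds = frames.reshape(len(frames), -1).std(axis=1)           # one value per frame
    diffs = np.abs(np.diff(stds))                                # frame-to-frame change
    mu, sigma = diffs.mean(), diffs.std() + 1e-9
    return [i + 1 for i, d in enumerate(diffs) if (d - mu) / sigma > z_thresh]
```

In the full pipeline, the per-frame histograms would feed k-means clustering to pick representative frames, and `shot_boundaries` would be run on each windowed clip so that clips straddling a cut can be re-formed before concatenation.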