HyV-Summ: Social media video summarization on custom dataset using hybrid techniques

IF 6.5 · CAS Zone 2 (Computer Science) · JCR Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Neurocomputing Pub Date : 2024-11-08 DOI:10.1016/j.neucom.2024.128852

Jayanta Paul, Anuska Roy, Abhijit Mitra, Jaya Sil

Neurocomputing, Volume 614, Article 128852. Full text: https://www.sciencedirect.com/science/article/pii/S0925231224016230

Citations: 0

Abstract

The proliferation of social networking platforms such as YouTube, Facebook, Instagram, and X has led to exponential growth in multimedia content, with billions of videos uploaded every hour. Efficient management of such a vast amount of data necessitates advanced summarization techniques to eliminate irrelevant and redundant information. A summarized video, containing the most distinct frames or key frames, provides a concise representation of the original content. Existing deep learning and non-deep learning techniques for video summarization have certain limitations. Deep learning methods are complex and resource-intensive, while non-deep learning algorithms often fail to extract informative features from vast social media videos. This paper addresses the issue by proposing a novel hybrid technique, named Hybrid Video Summarization (HyV-Summ), which integrates deep and non-deep learning techniques to leverage their respective strengths while focusing only on social media content. We developed a custom dataset, SocialSum, to train our proposed model HyV-Summ, since existing benchmark datasets like TVSum and SumMe contain diverse types of content not specific to social media videos. We provide a comparative analysis of existing techniques and datasets with our proposed techniques and dataset. The results demonstrate that HyV-Summ outperforms existing techniques, such as Long Short-Term Memory (LSTM)-based and Generative Adversarial Network (GAN)-based summarization, achieving higher F1-scores when applied to both the SocialSum dataset and available datasets.
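The abstract reports results as F1-scores. As a rough illustration only, the following minimal sketch shows one common way an F1-score can be computed between a machine-selected set of key frames and a human reference set. The function name `keyframe_f1` and the frame indices are hypothetical, and the abstract does not state the exact evaluation protocol the authors used, so this should not be read as their implementation.

```python
def keyframe_f1(predicted, reference):
    """F1-score between two collections of selected frame indices.

    Assumption: a predicted key frame counts as correct only if the same
    frame index appears in the reference summary (exact match).
    """
    pred, ref = set(predicted), set(reference)
    overlap = len(pred & ref)
    if overlap == 0:
        # Covers empty inputs and disjoint selections.
        return 0.0
    precision = overlap / len(pred)   # fraction of predicted frames that are correct
    recall = overlap / len(ref)       # fraction of reference frames that were found
    return 2 * precision * recall / (precision + recall)


# Hypothetical frame indices, for illustration only.
machine_summary = [3, 10, 42, 57]
human_summary = [3, 12, 42, 57, 80]
score = keyframe_f1(machine_summary, human_summary)
print(f"F1 = {score:.3f}")
```

Here three of the four predicted frames match the reference (precision 0.75) and three of the five reference frames are recovered (recall 0.6). Benchmarks such as TVSum and SumMe are often evaluated on overlapping segments rather than exact frame matches, so a real evaluation would likely differ in how overlap is measured.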




Source journal

Neurocomputing (Engineering and Technology, Computer Science: Artificial Intelligence)

CiteScore

13.10

Self-citation rate

10.00%

Articles published

1382

Review time

70 days

Journal description: Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice and applications are the essential topics being covered.