Video key concept extraction using Convolution Neural Network

2024 IEEE 3rd International Conference on AI in Cybersecurity (ICAIC), Pub Date: 2024-02-07, DOI: 10.1109/ICAIC60265.2024.10433799

T. H. Sardar, Ruhul Amin Hazarika, Bishwajeet Pandey, Guru Prasad M S, Sk Mahmudul Hassan, Radhakrishna Dodmane, Hardik A. Gohel

{"title":"Video key concept extraction using Convolution Neural Network","authors":"T. H. Sardar, Ruhul Amin Hazarika, Bishwajeet Pandey, Guru Prasad M S, Sk Mahmudul Hassan, Radhakrishna Dodmane, Hardik A. Gohel","doi":"10.1109/ICAIC60265.2024.10433799","DOIUrl":null,"url":null,"abstract":"Objectives: This work aims to develop an automated video summarising methodology and timestamping that uses natural language processing (NLP) tools to extract significant video information.Methods: The methodology comprises extracting the audio from the video, splitting it into chunks by the size of the pauses, and transcribing the audio using Google's speech recognition. The transcribed text is tokenised to create a summary, sentence and word frequencies are calculated, and the most relevant sentences are selected. The summary quality is assessed using ROUGE criteria, and the most important keywords are extracted from the transcript using RAKE.Findings: Our proposed method successfully extracts key points from video lectures and creates text summaries. Timestamping these key points provides valuable context and facilitates navigation within the lecture. Our method combines video-to-text conversion and text summarisation with timestamping key concepts, offering a novel approach to video lecture analysis. Existing video analysis methods focus on keyword extraction or summarisation, while our method offers a more comprehensive approach. Our timestamped key points provide a unique feature compared to other methods. Our method enhances existing video reports by (i) providing concise summaries of key concepts and (ii) enabling quick access to specific information through timestamps. (iii) Facilitating information retrieval through a searchable index. Further research directions: (i) Improve the accuracy of the multi-stage processing pipeline. (ii) Develop techniques to handle diverse accents and pronunciations. 
(iii) Explore applications of the proposed method to other video genres and types.Application/Improvements: This approach is practical in giving accurate video summaries, saving viewers time and effort when comprehending the main concepts presented in a video.","PeriodicalId":517265,"journal":{"name":"2024 IEEE 3rd International Conference on AI in Cybersecurity (ICAIC)","volume":"71 9","pages":"1-6"},"PeriodicalIF":0.0000,"publicationDate":"2024-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2024 IEEE 3rd International Conference on AI in Cybersecurity (ICAIC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICAIC60265.2024.10433799","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}


Abstract

Objectives: This work aims to develop an automated video summarisation and timestamping methodology that uses natural language processing (NLP) tools to extract significant information from videos. Methods: The methodology comprises extracting the audio from the video, splitting it into chunks based on the length of the pauses, and transcribing the audio using Google's speech recognition. To create a summary, the transcribed text is tokenised, sentence and word frequencies are calculated, and the most relevant sentences are selected. The summary quality is assessed using ROUGE criteria, and the most important keywords are extracted from the transcript using RAKE. Findings: Our proposed method successfully extracts key points from video lectures and creates text summaries. Timestamping these key points provides valuable context and facilitates navigation within the lecture. Our method combines video-to-text conversion and text summarisation with timestamping of key concepts, offering a novel approach to video lecture analysis. Existing video analysis methods focus on either keyword extraction or summarisation, while our method offers a more comprehensive approach, and its timestamped key points are a feature that other methods do not provide. Our method enhances existing video reports by (i) providing concise summaries of key concepts, (ii) enabling quick access to specific information through timestamps, and (iii) facilitating information retrieval through a searchable index. Further research directions: (i) improve the accuracy of the multi-stage processing pipeline; (ii) develop techniques to handle diverse accents and pronunciations; (iii) explore applications of the proposed method to other video genres and types. Application/Improvements: This approach is practical for producing accurate video summaries, saving viewers time and effort when comprehending the main concepts presented in a video.
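The abstract does not give implementation details, so the following is only a minimal sketch of the frequency-based sentence selection and timestamping steps it describes, written against the Python standard library. It assumes the transcription stage has already produced one `(start_seconds, text)` pair per audio chunk. The function names (`summarise`, `rouge1_recall`, `format_timestamp`), the tiny stop-word list, and the scoring formula are illustrative assumptions, not the authors' code. The ROUGE function covers only unigram recall, which is one of several ROUGE variants.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; the paper does not specify one).
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "is", "in", "it",
              "that", "this", "are", "for", "on", "with", "as", "by", "we"}


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def content_words(text):
    """Tokens of the text with stop words removed."""
    return [w for w in tokenize(text) if w not in STOP_WORDS]


def summarise(chunks, max_sentences=2):
    """Pick the highest-scoring chunks and return them in playback order.

    chunks: list of (start_seconds, sentence_text) pairs, assumed to come
    from the transcription stage. A sentence's score is the sum of the
    normalised frequencies of its content words, so longer sentences
    tend to score higher.
    """
    word_freq = Counter(w for _, text in chunks for w in content_words(text))
    if not word_freq:
        return []
    top = max(word_freq.values())

    def score(text):
        return sum(word_freq[w] / top for w in content_words(text))

    ranked = sorted(chunks, key=lambda c: score(c[1]), reverse=True)
    return sorted(ranked[:max_sentences], key=lambda c: c[0])


def rouge1_recall(candidate, reference):
    """Fraction of reference unigrams that also occur in the candidate."""
    ref = Counter(tokenize(reference))
    cand = Counter(tokenize(candidate))
    overlap = sum(min(count, cand[word]) for word, count in ref.items())
    total = sum(ref.values())
    return overlap / total if total else 0.0


def format_timestamp(seconds):
    """Render a chunk start time as MM:SS."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


if __name__ == "__main__":
    demo = [
        (0.0, "Welcome to the lecture."),
        (4.5, "A neural network learns weights from data."),
        (12.0, "Training a neural network needs labelled data."),
        (20.0, "Thanks for watching."),
    ]
    for start, text in summarise(demo):
        print(format_timestamp(start), text)
```

Keeping the start time attached to each sentence through the ranking step is what lets the selected key points double as navigation markers. A real pipeline would likely replace the stop-word list and tokenizer with those of an NLP library and use an established ROUGE implementation.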


