Video key concept extraction using Convolution Neural Network

T. H. Sardar, Ruhul Amin Hazarika, Bishwajeet Pandey, Guru Prasad M S, Sk Mahmudul Hassan, Radhakrishna Dodmane, Hardik A. Gohel
2024 IEEE 3rd International Conference on AI in Cybersecurity (ICAIC), pp. 1-6
DOI: 10.1109/ICAIC60265.2024.10433799
Published: 2024-02-07
Citations: 0

Abstract

Objectives: This work aims to develop an automated video summarisation and timestamping methodology that uses natural language processing (NLP) tools to extract significant video information.

Methods: The methodology comprises extracting the audio from the video, splitting it into chunks at pauses, and transcribing the audio using Google's speech recognition. The transcribed text is tokenised, sentence and word frequencies are calculated, and the most relevant sentences are selected to form a summary. Summary quality is assessed using the ROUGE criteria, and the most important keywords are extracted from the transcript using RAKE.

Findings: The proposed method successfully extracts key points from video lectures and creates text summaries. Timestamping these key points provides valuable context and facilitates navigation within the lecture. The method combines video-to-text conversion and text summarisation with timestamping of key concepts, offering a novel approach to video lecture analysis. Existing video analysis methods focus on keyword extraction or summarisation alone, whereas this method offers a more comprehensive approach, and the timestamped key points are a feature other methods lack. The method enhances existing video reports by (i) providing concise summaries of key concepts, (ii) enabling quick access to specific information through timestamps, and (iii) facilitating information retrieval through a searchable index.

Further research directions: (i) improve the accuracy of the multi-stage processing pipeline; (ii) develop techniques to handle diverse accents and pronunciations; (iii) explore applications of the proposed method to other video genres and types.

Application/Improvements: This approach is practical for producing accurate video summaries, saving viewers time and effort in comprehending the main concepts presented in a video.
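The text-side steps of the pipeline described above (frequency-based sentence scoring, RAKE-style keyword extraction, and ROUGE-1 evaluation) can be sketched in pure Python. This is an illustrative sketch, not the authors' implementation: the stop-word list is deliberately tiny, the sentence splitter is naive, and all function names are hypothetical. The audio extraction, pause-based chunking, and Google speech-recognition steps are omitted; in practice they would rely on external libraries.

```python
import re
from collections import Counter, defaultdict

# Tiny illustrative stop-word list; a real pipeline would use a fuller set.
STOPWORDS = {"a", "an", "and", "are", "as", "be", "by", "for", "from", "in",
             "is", "it", "of", "on", "that", "the", "this", "to", "used",
             "we", "with"}

def split_sentences(text):
    """Naive splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def words(sentence):
    return re.findall(r"[a-z']+", sentence.lower())

def summarise(text, n_sentences=2):
    """Frequency-based extractive summary: score each sentence by the summed
    normalised frequency of its non-stop words, keep the top n in order."""
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in words(s) if w not in STOPWORDS)
    if not freq:
        return ""
    top = freq.most_common(1)[0][1]
    scores = [sum(freq[w] / top for w in words(s) if w not in STOPWORDS)
              for s in sentences]
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[:n_sentences])          # restore original order
    return " ".join(sentences[i] for i in keep)

def rake_keywords(text, n_keywords=3):
    """RAKE-style scoring: split on stop words to get candidate phrases, score
    each word by degree/frequency, and each phrase by its words' scores."""
    phrases = []
    for sentence in split_sentences(text):
        phrase = []
        for w in words(sentence):
            if w in STOPWORDS:
                if phrase:
                    phrases.append(phrase)
                phrase = []
            else:
                phrase.append(w)
        if phrase:
            phrases.append(phrase)
    freq, degree = Counter(), defaultdict(int)
    for phrase in phrases:
        for w in phrase:
            freq[w] += 1
            degree[w] += len(phrase)             # co-occurrence degree
    score = lambda p: sum(degree[w] / freq[w] for w in p)
    ranked = sorted({" ".join(p): score(p) for p in phrases}.items(),
                    key=lambda kv: kv[1], reverse=True)
    return [phrase for phrase, _ in ranked[:n_keywords]]

def rouge1_f1(candidate, reference):
    """Unigram-overlap ROUGE-1 F1 between a candidate and a reference summary."""
    c, r = Counter(words(candidate)), Counter(words(reference))
    overlap = sum((c & r).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / sum(c.values()), overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)
```

Given a transcript, `summarise` returns the highest-scoring sentences in their original order (so timestamps of the chunks they came from can still be attached), `rake_keywords` returns multi-word key phrases, and `rouge1_f1` compares the generated summary against a human reference.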