视频检索系统中图像关键帧的提取算法

A. Afonin, Iryna Oksiuta
{"title":"视频检索系统中图像关键帧的提取算法","authors":"A. Afonin, Iryna Oksiuta","doi":"10.18523/2617-3808.2022.5.62-67","DOIUrl":null,"url":null,"abstract":"As a part of this work, there was a study of image processing algorithms used in video search systems.With the development of search engines and an increase in the types of queries possible for searching, the need for indexing an increasing amount of diverse information is growing. New data in the form of images and videos require new processing techniques to extract key content descriptions. In video search engines, according to this description, users can find the video files most relevant to the search query. The search query, in turn, can be of various types: text, search by image, search by video file to find a similar one, etc. Therefore, it is necessary to accurately describe the objects in the video in order to assign appropriate labels to the video file in the search engine database.In this article, we focused on the algorithm for extracting key frames of faces from a video sequence, since one of the important objects in the video are people themselves. This algorithm allows you to perform the initial processing of the file and save the identified frames with faces in order to later process this data with the help of the face recognition algorithm and assign the appropriate labels. An alternative application for this algorithm is the current processing of video files to form datasets of faces for the development and training of new computer vision models. The main criteria for such an algorithm were: the accuracy of face detection, the ability to distinguish keyframes of all people from each other, comprehensive evaluation of candidate frames and sorting by the relevance of the entire set for each face.After an analysis of existing solutions for specific stages of the algorithm, the article proposes a sequence of steps for the algorithm for extracting key frames of faces from a video file. An important step is to assess the quality of all candidates and sort them by quality. For this, the work defines various metrics for assessing the quality of the frame, which affect the overall assessment and, accordingly, the sorting order. The article also describes the basic version of the interface for using the proposed algorithm.","PeriodicalId":433538,"journal":{"name":"NaUKMA Research Papers. Computer Science","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Algorithm for Extraction of Keyframes of Images in Video Retrieval Systems\",\"authors\":\"A. Afonin, Iryna Oksiuta\",\"doi\":\"10.18523/2617-3808.2022.5.62-67\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"As a part of this work, there was a study of image processing algorithms used in video search systems.With the development of search engines and an increase in the types of queries possible for searching, the need for indexing an increasing amount of diverse information is growing. New data in the form of images and videos require new processing techniques to extract key content descriptions. In video search engines, according to this description, users can find the video files most relevant to the search query. The search query, in turn, can be of various types: text, search by image, search by video file to find a similar one, etc. Therefore, it is necessary to accurately describe the objects in the video in order to assign appropriate labels to the video file in the search engine database.In this article, we focused on the algorithm for extracting key frames of faces from a video sequence, since one of the important objects in the video are people themselves. This algorithm allows you to perform the initial processing of the file and save the identified frames with faces in order to later process this data with the help of the face recognition algorithm and assign the appropriate labels. An alternative application for this algorithm is the current processing of video files to form datasets of faces for the development and training of new computer vision models. The main criteria for such an algorithm were: the accuracy of face detection, the ability to distinguish keyframes of all people from each other, comprehensive evaluation of candidate frames and sorting by the relevance of the entire set for each face.After an analysis of existing solutions for specific stages of the algorithm, the article proposes a sequence of steps for the algorithm for extracting key frames of faces from a video file. An important step is to assess the quality of all candidates and sort them by quality. For this, the work defines various metrics for assessing the quality of the frame, which affect the overall assessment and, accordingly, the sorting order. The article also describes the basic version of the interface for using the proposed algorithm.\",\"PeriodicalId\":433538,\"journal\":{\"name\":\"NaUKMA Research Papers. Computer Science\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-02-24\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"NaUKMA Research Papers. Computer Science\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.18523/2617-3808.2022.5.62-67\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"NaUKMA Research Papers. Computer Science","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.18523/2617-3808.2022.5.62-67","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

作为这项工作的一部分,对视频搜索系统中使用的图像处理算法进行了研究。随着搜索引擎的发展和搜索查询类型的增加,对索引越来越多的不同信息的需求也在增长。图像和视频形式的新数据需要新的处理技术来提取关键内容描述。在视频搜索引擎中,根据这个描述,用户可以找到与搜索查询最相关的视频文件。搜索查询,反过来,可以是各种类型的:文本,搜索图像,搜索视频文件,以找到一个类似的,等等。因此,有必要准确地描述视频中的对象,以便在搜索引擎数据库中为视频文件分配适当的标签。在本文中,我们重点研究了从视频序列中提取人脸关键帧的算法,因为视频中重要的对象之一就是人本身。该算法允许您执行文件的初始处理并保存带有人脸的已识别帧,以便稍后在人脸识别算法的帮助下处理该数据并分配适当的标签。该算法的另一种应用是当前处理视频文件以形成人脸数据集,用于开发和训练新的计算机视觉模型。这种算法的主要标准是:人脸检测的准确性、区分所有人关键帧的能力、对候选帧的综合评价以及对每个人脸的整个集的相关性进行排序。在分析了算法各阶段的现有解决方案后,本文提出了从视频文件中提取人脸关键帧的算法的一系列步骤。重要的一步是评估所有候选人的素质,并按素质进行分类。为此,工作定义了评估框架质量的各种度量,这些度量会影响总体评估,并相应地影响排序顺序。本文还描述了使用所提出算法的接口的基本版本。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Algorithm for Extraction of Keyframes of Images in Video Retrieval Systems
As a part of this work, there was a study of image processing algorithms used in video search systems.With the development of search engines and an increase in the types of queries possible for searching, the need for indexing an increasing amount of diverse information is growing. New data in the form of images and videos require new processing techniques to extract key content descriptions. In video search engines, according to this description, users can find the video files most relevant to the search query. The search query, in turn, can be of various types: text, search by image, search by video file to find a similar one, etc. Therefore, it is necessary to accurately describe the objects in the video in order to assign appropriate labels to the video file in the search engine database.In this article, we focused on the algorithm for extracting key frames of faces from a video sequence, since one of the important objects in the video are people themselves. This algorithm allows you to perform the initial processing of the file and save the identified frames with faces in order to later process this data with the help of the face recognition algorithm and assign the appropriate labels. An alternative application for this algorithm is the current processing of video files to form datasets of faces for the development and training of new computer vision models. The main criteria for such an algorithm were: the accuracy of face detection, the ability to distinguish keyframes of all people from each other, comprehensive evaluation of candidate frames and sorting by the relevance of the entire set for each face.After an analysis of existing solutions for specific stages of the algorithm, the article proposes a sequence of steps for the algorithm for extracting key frames of faces from a video file. An important step is to assess the quality of all candidates and sort them by quality. For this, the work defines various metrics for assessing the quality of the frame, which affect the overall assessment and, accordingly, the sorting order. The article also describes the basic version of the interface for using the proposed algorithm.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信