Multimodal Video Indexing (MVI): A New Method Based on Machine Learning and Semi-Automatic Annotation on Large Video Collections

Int. J. Image Graph. Pub Date: 2021-06-19. DOI: 10.1142/s021946782250022x

Mohamed Hamroun, K. Tamine, B. Crespin


Citations: 3

Abstract

Indexing video by concept is one of the most appropriate solutions for retrieving content from large video collections. It relies on an association between a concept and its corresponding visual, sound, or textual features. Establishing this association is not a trivial task: it requires knowledge about the concept and its context. In this paper, we investigate a new concept detection approach to improve the performance of content-based multimedia document retrieval systems. To achieve this goal, we tackle the problem from different angles and make four contributions at various stages of the indexing process. We propose a new method for multimodal indexing based on (i) a new weakly supervised semi-automatic method based on a genetic algorithm, (ii) the detection of concepts from the text in the videos, and (iii) the enrichment of the basic concepts using our method DCM. Subsequently, the semantic and enriched concepts allow better multimodal indexing and the construction of an ontology. Finally, the different contributions are tested and evaluated on a large dataset (TRECVID 2015).

