Hybrid Improvements in Multimodal Analysis for Deep Video Understanding

ACM Multimedia Asia, Pub Date: 2021-12-01, DOI: 10.1145/3469877.3493599

Beibei Zhang, Fan Yu, Yaqun Fang, Tongwei Ren, Gangshan Wu


Citations: 3

Abstract

The Deep Video Understanding Challenge (DVU) is a task focused on comprehending long-duration videos that involve many entities. Its main goal is to build a knowledge graph of the relationships and interactions between entities in order to answer relevant questions. In this paper, we improve our previously proposed joint learning method in several respects, including few-shot learning, optical flow features, entity recognition, and video description matching. We verify the effectiveness of these measures through experiments.

