Deep Video Understanding: Representation Learning, Action Recognition, and Language Generation

Proceedings of the 1st Workshop and Challenge on Comprehensive Video Understanding in the Wild. Pub Date: 2018-10-15. DOI: 10.1145/3265987.3265994

Tao Mei


Abstract

Analyzing videos has been one of the fundamental problems of computer vision and multimedia analysis for decades. The task is very challenging because video is an information-intensive medium with large variations and complexities. Thanks to the recent development of deep learning techniques, researchers in both the computer vision and multimedia communities are now able to boost the performance of video analysis significantly and initiate new research directions for analyzing video content. This talk will cover recent advances under the umbrella of video understanding, starting from the basic networks that are widely adopted in state-of-the-art deep learning pipelines, moving to the fundamental challenges of video representation learning and video classification/recognition, and ending with the emerging area of video and language.


