Discriminative figure-centric models for joint action localization and recognition

2011 International Conference on Computer Vision. Pub Date: 2011-11-06. DOI: 10.1109/ICCV.2011.6126472

Tian Lan, Yang Wang, Greg Mori

Citations: 246

Abstract

In this paper we develop an algorithm for action recognition and localization in videos. The algorithm uses a figure-centric visual word representation. Unlike previous approaches, it does not require reliable human detection and tracking as input. Instead, the person's location is treated as a latent variable that is inferred simultaneously with action recognition. A spatial model for each action is learned discriminatively under the figure-centric representation. Temporal smoothness over video sequences is also enforced. We present results on the UCF-Sports dataset, verifying the effectiveness of our model in situations where detection and tracking of individuals are challenging.
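The idea of inferring a latent person location jointly with the action label, under a temporal smoothness term, can be illustrated with a small sketch. This is not the paper's model or code: the per-frame candidate scores (`unary`), the candidate box centers (`centers`), the squared-distance smoothness penalty, and the function names `best_track` and `joint_inference` are all assumptions made for illustration. For each action label, a Viterbi-style dynamic program picks the best sequence of candidate locations, and the label with the highest total score wins.

```python
import numpy as np


def best_track(unary, centers, smooth_weight=1.0):
    """Best-scoring location sequence for ONE action label (illustrative only).

    unary:   (T, K) array; hypothetical score of candidate k in frame t.
    centers: (T, K, 2) array; (x, y) center of each candidate box.
    Returns (total_score, list of chosen candidate indices per frame).
    """
    T, K = unary.shape
    score = unary[0].astype(float)
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        # pairwise[i, j]: assumed smoothness penalty for moving from
        # candidate i in frame t-1 to candidate j in frame t.
        diff = centers[t - 1][:, None, :] - centers[t][None, :, :]
        pairwise = -smooth_weight * np.sum(diff ** 2, axis=2)
        total = score[:, None] + pairwise
        back[t] = np.argmax(total, axis=0)
        score = total[back[t], np.arange(K)] + unary[t]
    # Trace back the best path from the last frame to the first.
    path = [int(np.argmax(score))]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    path.reverse()
    return float(np.max(score)), path


def joint_inference(unary_per_action, centers, smooth_weight=1.0):
    """Choose the action label and location track with the highest joint score."""
    results = {
        action: best_track(unary, centers, smooth_weight)
        for action, unary in unary_per_action.items()
    }
    best_action = max(results, key=lambda a: results[a][0])
    total_score, path = results[best_action]
    return best_action, path, total_score
```

In this sketch the location is never supplied as input; it falls out of the same maximization that selects the label, which is the sense in which recognition and localization are solved together. Setting `smooth_weight` to zero reduces the track to an independent per-frame argmax.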

