Spatial-temporal motion information integration for action detection and recognition in non-static background

2013 IEEE 14th International Conference on Information Reuse & Integration (IRI) Pub Date : 2013-10-24 DOI:10.1109/IRI.2013.6642527

Dianting Liu, M. Shyu, Guiru Zhao

{"title":"Spatial-temporal motion information integration for action detection and recognition in non-static background","authors":"Dianting Liu, M. Shyu, Guiru Zhao","doi":"10.1109/IRI.2013.6642527","DOIUrl":null,"url":null,"abstract":"Various motion detection methods have been proposed in the past decade, but there are seldom attempts to investigate the advantages and disadvantages of different detection mechanisms so that they can complement each other to achieve a better performance. Toward such a demand, this paper proposes a human action detection and recognition framework to bridge the semantic gap between low-level pixel intensity change and the high-level understanding of the meaning of an action. To achieve a robust estimation of the region of action with the complexities of an uncontrolled background, we propose the combination of the optical flow field and Harris3D corner detector to obtain a new spatial-temporal estimation in the video sequences. The action detection method, considering the integrated motion information, works well with the dynamic background and camera motion, and demonstrates the advantage of the proposed method of integrating multiple spatial-temporal cues. Then the local features (SIFT and STIP) extracted from the estimated region of action are used to learn the Universal Background Model (UBM) for the action recognition task. The experimental results on KTH and UCF YouTube Action (UCF11) data sets show that the proposed action detection and recognition framework can not only better estimate the region of action but also achieve better recognition accuracy comparing with the peer work.","PeriodicalId":418492,"journal":{"name":"2013 IEEE 14th International Conference on Information Reuse & Integration (IRI)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"25","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 IEEE 14th International Conference on Information Reuse & Integration (IRI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IRI.2013.6642527","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 25

Abstract

Various motion detection methods have been proposed in the past decade, but there are seldom attempts to investigate the advantages and disadvantages of different detection mechanisms so that they can complement each other to achieve a better performance. Toward such a demand, this paper proposes a human action detection and recognition framework to bridge the semantic gap between low-level pixel intensity change and the high-level understanding of the meaning of an action. To achieve a robust estimation of the region of action with the complexities of an uncontrolled background, we propose the combination of the optical flow field and Harris3D corner detector to obtain a new spatial-temporal estimation in the video sequences. The action detection method, considering the integrated motion information, works well with the dynamic background and camera motion, and demonstrates the advantage of the proposed method of integrating multiple spatial-temporal cues. Then the local features (SIFT and STIP) extracted from the estimated region of action are used to learn the Universal Background Model (UBM) for the action recognition task. The experimental results on KTH and UCF YouTube Action (UCF11) data sets show that the proposed action detection and recognition framework can not only better estimate the region of action but also achieve better recognition accuracy comparing with the peer work.

查看原文本刊更多论文

基于时空运动信息集成的非静态背景下动作检测与识别

在过去的十年里，人们提出了各种各样的运动检测方法，但很少有人尝试研究不同检测机制的优缺点，使它们能够相互补充以达到更好的性能。针对这一需求，本文提出了一种人体动作检测与识别框架，以弥合低级像素强度变化与高级动作意义理解之间的语义鸿沟。为了在不受控制背景的复杂性下实现对动作区域的鲁棒估计，我们提出将光流场与Harris3D角点检测器相结合，在视频序列中获得一种新的时空估计。该方法综合考虑了运动信息，能够很好地处理动态背景和摄像机运动，证明了该方法综合多个时空线索的优势。然后利用动作估计区域提取的局部特征(SIFT和STIP)学习动作识别任务的通用背景模型(Universal Background Model, UBM)。在KTH和UCF YouTube动作(UCF11)数据集上的实验结果表明，与同类工作相比，所提出的动作检测和识别框架不仅可以更好地估计动作区域，而且具有更好的识别精度。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

2013 IEEE 14th International Conference on Information Reuse & Integration (IRI)

自引率

0.00%

发文量