On the Effect of Human Body Parts in Large Scale Human Behaviour Recognition

2013 International Conference on Digital Image Computing: Techniques and Applications (DICTA) | Published: 2013-11-01 | DOI: 10.1109/DICTA.2013.6691507

O. V. R. Murthy, Ibrahim Radwan, Abhinav Dhall, Roland Göcke


Citations: 3

Abstract

On the Effect of Human Body Parts in Large Scale Human Behaviour Recognition

Automatic analysis of human behaviour in large collections of videos is attracting growing interest, all the more so since the advent of file sharing sites such as YouTube. Human behaviour analysis methods can be grouped into three classes according to the type of feature representation they use: local, Region of Interest (ROI) based, and densely sampled. Local feature representations, such as Spatio-Temporal Interest Points (STIP), are popular for modelling the temporal aspects of human action recognition. ROI-based feature representations try to capture and represent human body part regions. Densely sampled representations capture information at uniformly spaced intervals along the spatial and temporal directions of the given video. In this paper, we investigate the effect of human body part (ROI) information in large scale action recognition. We also investigate the effect of fusing it with Harris 3D points (a local representation) and with densely sampled representations. All experiments use a Bag-of-Words framework. We present our results on benchmark databases with a large number of classes, such as the UCF50 and HMDB51 datasets.
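
The abstract names three ingredients without giving their details: sampling at uniformly spaced space-time positions, a Bag-of-Words encoding, and fusion of several feature channels. The Python sketch below illustrates one common way each of these can be done; it is not the authors' implementation. The function names (`dense_grid`, `bow_histogram`, `fuse`), the hard nearest-codeword assignment, the L1 normalisation and fusion by concatenation are all assumptions made for illustration, and the abstract does not say which variants the paper uses.

```python
import numpy as np


def dense_grid(width, height, n_frames, step_xy, step_t):
    """Return (x, y, t) sample positions at uniform spacing in space and time.

    Hypothetical helper: the step sizes are free parameters, not values
    taken from the paper.
    """
    xs = np.arange(0, width, step_xy)
    ys = np.arange(0, height, step_xy)
    ts = np.arange(0, n_frames, step_t)
    gx, gy, gt = np.meshgrid(xs, ys, ts, indexing="ij")
    return np.stack([gx.ravel(), gy.ravel(), gt.ravel()], axis=1)


def bow_histogram(descriptors, codebook):
    """Encode descriptors as an L1-normalised histogram over codewords.

    Each descriptor is hard-assigned to its nearest codeword
    (Euclidean distance). Assumed variant; soft assignment is also common.
    """
    # Squared distances, shape (n_descriptors, n_codewords).
    diff = descriptors[:, None, :] - codebook[None, :, :]
    dist2 = (diff ** 2).sum(axis=2)
    nearest = dist2.argmin(axis=1)
    hist = np.bincount(nearest, minlength=codebook.shape[0]).astype(float)
    total = hist.sum()
    return hist / total if total > 0 else hist


def fuse(histograms):
    """Fuse per-channel histograms by concatenation (assumed early fusion)."""
    return np.concatenate(histograms)


# Toy usage with made-up 2-D descriptors and a 2-word codebook.
codebook = np.array([[0.0, 0.0], [10.0, 10.0]])
roi_desc = np.array([[0.1, -0.2], [9.5, 10.2], [0.3, 0.1]])    # body part (ROI) channel
local_desc = np.array([[9.9, 9.8], [10.1, 10.4]])              # local (interest point) channel

roi_hist = bow_histogram(roi_desc, codebook)
local_hist = bow_histogram(local_desc, codebook)
video_vector = fuse([roi_hist, local_hist])
grid = dense_grid(width=4, height=4, n_frames=4, step_xy=2, step_t=2)
```

In a full pipeline the codebook would typically be learned by clustering training descriptors, and the fused vector would be passed to a classifier; both steps are omitted here because the abstract does not describe them.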
