O. V. R. Murthy, Ibrahim Radwan, Abhinav Dhall, Roland Göcke
2013 International Conference on Digital Image Computing: Techniques and Applications (DICTA), published 2013-11-01. DOI: 10.1109/DICTA.2013.6691507 (https://doi.org/10.1109/DICTA.2013.6691507)
On the Effect of Human Body Parts in Large Scale Human Behaviour Recognition
Automatic analysis of human behaviour in large collections of videos is gaining interest, even more so with the advent of file sharing sites such as YouTube. Human behaviour analysis methods can be categorised into three classes based on the type of features they use: local, region of interest (ROI) based, and densely sampled representations. Local feature representations, such as Spatio-Temporal Interest Points (STIP), are quite popular for modelling temporal aspects in human action recognition. ROI-based feature representations try to capture and represent human body part regions. Densely sampled representations capture information at uniformly spaced intervals along the spatial and temporal dimensions of the given video. In this paper, we investigate the effect of human body part (ROI) information in large scale action recognition. Further, we investigate the effect of fusing it with Harris 3D points (local representation) information and with densely sampled representations. All experiments use a Bag-of-Words framework. We present our results on benchmark datasets with a large number of classes, such as UCF50 and HMDB51.
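The Bag-of-Words step and the fusion of the three representations can be illustrated with a minimal sketch. The abstract does not state how the codebooks are built or how the channels are fused, so everything specific here is an assumption made for illustration: plain k-means for the codebook, L1-normalised histograms, early fusion by concatenation, the function names, the descriptor sizes, and the random stand-in data.

```python
import numpy as np

def assign(descriptors, centres):
    """Index of the nearest centre (squared Euclidean) for each descriptor."""
    d2 = ((descriptors[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def build_codebook(descriptors, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns a (k, d) array of visual words."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(descriptors), size=k, replace=False)
    centres = descriptors[start].copy()
    for _ in range(n_iter):
        labels = assign(descriptors, centres)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):  # keep the old centre if a cluster empties
                centres[j] = members.mean(axis=0)
    return centres

def bow_histogram(descriptors, centres):
    """L1-normalised histogram of visual-word counts for one video."""
    counts = np.bincount(assign(descriptors, centres), minlength=len(centres))
    hist = counts.astype(float)
    total = hist.sum()
    return hist / total if total > 0 else hist

def fuse(histograms):
    """Early fusion (an assumption): concatenate per-channel histograms."""
    return np.concatenate(histograms)

# Random stand-in descriptors; the channel names mirror the three
# representations in the abstract, the dimensions are hypothetical.
rng = np.random.default_rng(0)
channels = {"body_parts": 8, "harris3d": 16, "dense": 12}
k = 5  # codebook size per channel, chosen only to keep the demo small

video_vector = fuse([
    bow_histogram(
        rng.normal(size=(40, dim)),                      # one video's descriptors
        build_codebook(rng.normal(size=(200, dim)), k),  # training descriptors
    )
    for dim in channels.values()
])
print(video_vector.shape)  # (15,)
```

Each channel gets its own codebook, so the fused vector has one block of k bins per representation. A classifier trained on such vectors could then be compared against one trained on a single channel, which is the kind of comparison the abstract describes.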