{"title":"使用基于方向梯度的深度运动图直方图识别动作","authors":"Xiaodong Yang, Chenyang Zhang, Yingli Tian","doi":"10.1145/2393347.2396382","DOIUrl":null,"url":null,"abstract":"In this paper, we propose an effective method to recognize human actions from sequences of depth maps, which provide additional body shape and motion information for action recognition. In our approach, we project depth maps onto three orthogonal planes and accumulate global activities through entire video sequences to generate the Depth Motion Maps (DMM). Histograms of Oriented Gradients (HOG) are then computed from DMM as the representation of an action video. The recognition results on Microsoft Research (MSR) Action3D dataset show that our approach significantly outperforms the state-of-the-art methods, although our representation is much more compact. In addition, we investigate how many frames are required in our framework to recognize actions on the MSR Action3D dataset. We observe that a short sub-sequence of 30-35 frames is sufficient to achieve comparable results to that operating on entire video sequences.","PeriodicalId":212654,"journal":{"name":"Proceedings of the 20th ACM international conference on Multimedia","volume":"27 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"565","resultStr":"{\"title\":\"Recognizing actions using depth motion maps-based histograms of oriented gradients\",\"authors\":\"Xiaodong Yang, Chenyang Zhang, Yingli Tian\",\"doi\":\"10.1145/2393347.2396382\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, we propose an effective method to recognize human actions from sequences of depth maps, which provide additional body shape and motion information for action recognition. In our approach, we project depth maps onto three orthogonal planes and accumulate global activities through entire video sequences to generate the Depth Motion Maps (DMM). Histograms of Oriented Gradients (HOG) are then computed from DMM as the representation of an action video. The recognition results on Microsoft Research (MSR) Action3D dataset show that our approach significantly outperforms the state-of-the-art methods, although our representation is much more compact. In addition, we investigate how many frames are required in our framework to recognize actions on the MSR Action3D dataset. We observe that a short sub-sequence of 30-35 frames is sufficient to achieve comparable results to that operating on entire video sequences.\",\"PeriodicalId\":212654,\"journal\":{\"name\":\"Proceedings of the 20th ACM international conference on Multimedia\",\"volume\":\"27 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2012-10-29\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"565\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 20th ACM international conference on Multimedia\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2393347.2396382\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 20th ACM international conference on Multimedia","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2393347.2396382","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Recognizing actions using depth motion maps-based histograms of oriented gradients
In this paper, we propose an effective method to recognize human actions from sequences of depth maps, which provide additional body shape and motion information for action recognition. In our approach, we project depth maps onto three orthogonal planes and accumulate global activities through entire video sequences to generate the Depth Motion Maps (DMM). Histograms of Oriented Gradients (HOG) are then computed from DMM as the representation of an action video. The recognition results on Microsoft Research (MSR) Action3D dataset show that our approach significantly outperforms the state-of-the-art methods, although our representation is much more compact. In addition, we investigate how many frames are required in our framework to recognize actions on the MSR Action3D dataset. We observe that a short sub-sequence of 30-35 frames is sufficient to achieve comparable results to that operating on entire video sequences.