Spatio-temporal cuboid pyramid for action recognition using depth motion sequences
Xiaopeng Ji, Jun Cheng, Wei Feng
2016 Eighth International Conference on Advanced Computational Intelligence (ICACI), published 2016-02-01
DOI: 10.1109/ICACI.2016.7449827 (https://doi.org/10.1109/ICACI.2016.7449827)
Citations: 11
Abstract
In this paper, we present an effective method for recognizing human actions from sequences of depth maps captured by a consumer depth sensor. We first project each frame of a depth sequence onto three orthogonal planes and generate the depth motion sequence (DMS) between each pair of consecutive frames in the three projected views. We then propose a spatio-temporal cuboid pyramid (STCP) that subdivides the DMS volumes into a set of spatial cuboids on scaled temporal levels. A cuboid fusion scheme concatenates the histogram of oriented gradients (HOG) features extracted from the spatial cuboids. The proposed approach is evaluated on three public benchmark datasets: MSRAction3D, MSRGesture3D, and MSRActionPairs. The experimental results demonstrate that the proposed method achieves state-of-the-art performance.
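The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract gives no parameters, so every function name and default below is hypothetical (the depth quantization `n_bins`, the sensor range `max_depth`, the number of pyramid `levels`, the spatial `grid`, and the number of orientation bins). The sketch also assumes that motion is measured as the absolute difference of consecutive projections, that temporal level l splits the sequence into 2^l segments, and it uses a plain magnitude-weighted orientation histogram as a simplified stand-in for full HOG.

```python
import numpy as np

def project_views(depth, n_bins=32, max_depth=4000.0):
    """Project one depth frame (H x W) onto front, side and top planes.
    n_bins and max_depth are assumed values, not taken from the paper."""
    h, w = depth.shape
    z = np.clip((depth / max_depth * n_bins).astype(int), 0, n_bins - 1)
    ys, xs = np.nonzero(depth > 0)          # ignore pixels with no depth reading
    front = depth.astype(float)             # x-y plane: the depth map itself
    side = np.zeros((h, n_bins))            # y-z plane: occupancy per row
    top = np.zeros((n_bins, w))             # z-x plane: occupancy per column
    side[ys, z[ys, xs]] = 1.0
    top[z[ys, xs], xs] = 1.0
    return front, side, top

def depth_motion_sequence(frames):
    """Return three volumes of shape (T-1, H', W'), one per view, holding the
    absolute difference between consecutive projected frames (an assumption)."""
    views = [project_views(f) for f in frames]
    return [np.abs(np.diff(np.stack([p[v] for p in views]), axis=0))
            for v in range(3)]

def orientation_histogram(img, n_orient=9):
    """Simplified HOG stand-in: one L2-normalized, magnitude-weighted
    histogram of unsigned gradient orientations for the whole patch."""
    gy, gx = np.gradient(img)
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)
    bins = np.minimum((ang / np.pi * n_orient).astype(int), n_orient - 1)
    hist = np.bincount(bins.ravel(), weights=mag.ravel(), minlength=n_orient)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist

def stcp_features(volume, levels=3, grid=2, n_orient=9):
    """Split the volume into 2**level temporal segments per level, accumulate
    each segment over time, cut it into grid x grid spatial cells, and
    concatenate the per-cell histograms."""
    t = volume.shape[0]
    feats = []
    for level in range(levels):
        for seg in np.array_split(np.arange(t), 2 ** level):
            if seg.size:
                motion = volume[seg].sum(axis=0)
            else:                            # sequence shorter than the split
                motion = np.zeros(volume.shape[1:])
            for rows in np.array_split(motion, grid, axis=0):
                for cell in np.array_split(rows, grid, axis=1):
                    feats.append(orientation_histogram(cell, n_orient))
    return np.concatenate(feats)

def describe(frames, **kwargs):
    """Concatenate the pyramid features of the three projected views."""
    return np.concatenate([stcp_features(v, **kwargs)
                           for v in depth_motion_sequence(frames)])
```

With these defaults, `describe` returns a vector of 3 views × 7 temporal segments × 4 cells × 9 bins = 756 values per sequence, which could then be passed to any classifier; the abstract does not say which one the authors use.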