CNN-Based Action Recognition Using Adaptive Multiscale Depth Motion Maps and Stable Joint Distance Maps
Junyou He, Hailun Xia, Chunyan Feng, Yunfei Chu
2018 IEEE Global Conference on Signal and Information Processing (GlobalSIP), 2018-11-01
DOI: https://doi.org/10.1109/GlobalSIP.2018.8646404
Citations: 1
Abstract
Human action recognition has a wide range of applications, including biometrics and surveillance. Existing methods mostly focus on a single modality, which is insufficient to characterize variations among different motions. To address this problem, we present a CNN-based human action recognition framework that fuses the depth and skeleton modalities. The proposed Adaptive Multiscale Depth Motion Maps (AM-DMMs) are calculated from depth maps to capture shape and motion cues. Moreover, adaptive temporal windows make AM-DMMs robust to variations in motion speed. A compact and effective method is also proposed to encode the spatio-temporal information of each skeleton sequence into three maps, referred to as Stable Joint Distance Maps (SJDMs), which describe different spatial relationships between the joints. A multi-channel CNN is adopted to exploit the discriminative features of the texture color images encoded from AM-DMMs and SJDMs for effective recognition. The proposed method has been evaluated on the UTD-MHAD dataset and achieves state-of-the-art results.
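The abstract does not specify how AM-DMMs or SJDMs are computed, so the following is only a minimal sketch of the two generic ideas they appear to build on: a depth motion map that accumulates absolute differences between depth frames, and a joint distance map that records pairwise joint distances over time. The function names, the `step` parameter (a stand-in for a temporal scale), and the input layouts are assumptions made for illustration. The adaptive windows, the "stable" joint selection, and the color encoding described above are not reproduced here.

```python
import math
from itertools import combinations


def depth_motion_map(frames, step=1):
    """Accumulate absolute differences between depth frames `step` apart.

    `frames` is a list of equally sized 2-D lists of depth values.
    `step` is a hypothetical temporal-scale parameter (an assumption,
    not a value or mechanism taken from the paper).
    """
    height, width = len(frames[0]), len(frames[0][0])
    dmm = [[0.0] * width for _ in range(height)]
    # Pair each frame with the frame `step` positions later.
    for earlier, later in zip(frames, frames[step:]):
        for r in range(height):
            for c in range(width):
                dmm[r][c] += abs(later[r][c] - earlier[r][c])
    return dmm


def joint_distance_map(skeleton_frames):
    """Build a (joint pairs x frames) map of Euclidean joint distances.

    `skeleton_frames` is a list of frames, each a list of (x, y, z) joints.
    Row i holds the distance between the i-th joint pair at every frame.
    """
    num_joints = len(skeleton_frames[0])
    pairs = list(combinations(range(num_joints), 2))
    return [
        [math.dist(frame[a], frame[b]) for frame in skeleton_frames]
        for a, b in pairs
    ]
```

In a pipeline of the kind the abstract describes, maps like these would be normalized and rendered as images before being passed to a CNN; that step is left out because the abstract gives no details on the encoding.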