Authors: Junyou He, Hailun Xia, Chunyan Feng, Yunfei Chu
Venue: 2018 IEEE Global Conference on Signal and Information Processing (GlobalSIP)
Published: 2018-11-01
DOI: https://doi.org/10.1109/GlobalSIP.2018.8646404
CNN-BASED ACTION RECOGNITION USING ADAPTIVE MULTISCALE DEPTH MOTION MAPS AND STABLE JOINT DISTANCE MAPS
Human action recognition has a wide range of applications, including biometrics and surveillance. Existing methods mostly focus on a single modality, which is insufficient to characterize the variations among different motions. To address this problem, we present a CNN-based human action recognition framework that fuses the depth and skeleton modalities. The proposed Adaptive Multiscale Depth Motion Maps (AM-DMMs) are calculated from depth maps to capture shape and motion cues. Moreover, adaptive temporal windows make AM-DMMs robust to variations in motion speed. A compact and effective method is also proposed to encode the spatio-temporal information of each skeleton sequence into three maps, referred to as Stable Joint Distance Maps (SJDMs), which describe different spatial relationships between the joints. A multi-channel CNN is adopted to exploit the discriminative features of the texture color images encoded from AM-DMMs and SJDMs for effective recognition. The proposed method has been evaluated on the UTD-MHAD dataset and achieves state-of-the-art results.
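
To make the two kinds of input maps more concrete, the following is a minimal sketch of a plain depth motion map and a plain joint distance map in NumPy. It is not the authors' method: the abstract does not specify how the adaptive temporal windows, the multiple scales, the "stable" variant of the joint distances, or the texture color encoding are defined, so none of these are reproduced here. The function names (`depth_motion_map`, `joint_distance_map`, `to_uint8`), the array shapes, and the grayscale normalization are all assumptions made for illustration.

```python
import numpy as np


def depth_motion_map(depth_frames: np.ndarray) -> np.ndarray:
    """Generic depth motion map (assumed form, not the paper's AM-DMM).

    depth_frames has shape (T, H, W). The result has shape (H, W) and holds
    the accumulated absolute difference between consecutive frames.
    """
    frames = depth_frames.astype(np.float64)
    return np.abs(np.diff(frames, axis=0)).sum(axis=0)


def joint_distance_map(joints: np.ndarray) -> np.ndarray:
    """Generic joint distance map (assumed form, not the paper's SJDM).

    joints has shape (T, J, 3). The result has shape (J * (J - 1) // 2, T):
    one row per unordered joint pair, one column per frame.
    """
    diffs = joints[:, :, None, :] - joints[:, None, :, :]  # (T, J, J, 3)
    dists = np.linalg.norm(diffs, axis=-1)                 # (T, J, J)
    i, j = np.triu_indices(joints.shape[1], k=1)           # pairs with i < j
    return dists[:, i, j].T


def to_uint8(image: np.ndarray) -> np.ndarray:
    """Min-max normalize a map to the 0..255 range (hypothetical helper)."""
    lo, hi = image.min(), image.max()
    if hi == lo:
        return np.zeros(image.shape, dtype=np.uint8)
    return np.round(255.0 * (image - lo) / (hi - lo)).astype(np.uint8)
```

Under these assumptions, a depth clip of shape `(T, H, W)` yields one `(H, W)` image, and a skeleton clip with `J` joints yields one image with `J * (J - 1) // 2` rows and `T` columns. Either image could then be normalized with `to_uint8` and resized before being passed to a CNN input channel.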