{"title":"使用数据可视化和卷积神经网络的三维动作识别","authors":"Mengyuan Liu, Chen Chen, Hong Liu","doi":"10.1109/ICME.2017.8019438","DOIUrl":null,"url":null,"abstract":"It remains a challenge to efficiently represent spatial-temporal data for 3D action recognition. To solve this problem, this paper presents a new skeleton-based action representation using data visualization and convolutional neural networks, which contains four main stages. First, skeletons from an action sequence are mapped as a set of five dimensional points, containing three dimensions of location, one dimension of time label and one dimension of joint label. Second, these points are encoded as a series of color images, by visualizing points as RGB pixels. Third, convolutional neural networks are adopted to extract deep features from color images. Finally, action class score is calculated by fusing selected deep features. Extensive experiments on three benchmark datasets show that our method achieves state-of-the-art results.","PeriodicalId":330977,"journal":{"name":"2017 IEEE International Conference on Multimedia and Expo (ICME)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"16","resultStr":"{\"title\":\"3D action recognition using data visualization and convolutional neural networks\",\"authors\":\"Mengyuan Liu, Chen Chen, Hong Liu\",\"doi\":\"10.1109/ICME.2017.8019438\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"It remains a challenge to efficiently represent spatial-temporal data for 3D action recognition. To solve this problem, this paper presents a new skeleton-based action representation using data visualization and convolutional neural networks, which contains four main stages. First, skeletons from an action sequence are mapped as a set of five dimensional points, containing three dimensions of location, one dimension of time label and one dimension of joint label. Second, these points are encoded as a series of color images, by visualizing points as RGB pixels. Third, convolutional neural networks are adopted to extract deep features from color images. Finally, action class score is calculated by fusing selected deep features. Extensive experiments on three benchmark datasets show that our method achieves state-of-the-art results.\",\"PeriodicalId\":330977,\"journal\":{\"name\":\"2017 IEEE International Conference on Multimedia and Expo (ICME)\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-07-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"16\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 IEEE International Conference on Multimedia and Expo (ICME)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICME.2017.8019438\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE International Conference on Multimedia and Expo (ICME)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICME.2017.8019438","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
3D action recognition using data visualization and convolutional neural networks
It remains a challenge to efficiently represent spatial-temporal data for 3D action recognition. To solve this problem, this paper presents a new skeleton-based action representation using data visualization and convolutional neural networks, which contains four main stages. First, skeletons from an action sequence are mapped as a set of five dimensional points, containing three dimensions of location, one dimension of time label and one dimension of joint label. Second, these points are encoded as a series of color images, by visualizing points as RGB pixels. Third, convolutional neural networks are adopted to extract deep features from color images. Finally, action class score is calculated by fusing selected deep features. Extensive experiments on three benchmark datasets show that our method achieves state-of-the-art results.