{"title":"基于提取投影平面的三流深度网络人体动作识别","authors":"S. Sahoo, S. Ari","doi":"10.1109/ICCECE48148.2020.9223095","DOIUrl":null,"url":null,"abstract":"Human actions are challenging to recognize as it varies its shape from different angle of perception. To tackle this challenge, a multi view camera set up can be arranged, however, it is not cost effective. To handle this issue, a multi stream deep learning network is proposed in this work which is trained on different 3D projected planes. The extracted projected planes which represents different angle of perception, are used as an alternative to multi view action recognition. The projected planes are such that they represents top, side and front view for the action videos. The projected planes are then fed to a three stream deep convolutional neural network. The network uses transfer learning technique to avoid training from scratch. Finally, the scores from three streams are fused to provide the final score to recognize the query video. To evaluate the proposed work, the challenging KTH dataset is used which is widely used and publicly available. The results show that the proposed work performs better compared to state-of-the-art techniques.","PeriodicalId":129001,"journal":{"name":"2020 International Conference on Computer, Electrical & Communication Engineering (ICCECE)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"A Three Stream Deep Network on Extracted Projected Planes for Human Action Recognition\",\"authors\":\"S. Sahoo, S. Ari\",\"doi\":\"10.1109/ICCECE48148.2020.9223095\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Human actions are challenging to recognize as it varies its shape from different angle of perception. To tackle this challenge, a multi view camera set up can be arranged, however, it is not cost effective. To handle this issue, a multi stream deep learning network is proposed in this work which is trained on different 3D projected planes. The extracted projected planes which represents different angle of perception, are used as an alternative to multi view action recognition. The projected planes are such that they represents top, side and front view for the action videos. The projected planes are then fed to a three stream deep convolutional neural network. The network uses transfer learning technique to avoid training from scratch. Finally, the scores from three streams are fused to provide the final score to recognize the query video. To evaluate the proposed work, the challenging KTH dataset is used which is widely used and publicly available. 
The results show that the proposed work performs better compared to state-of-the-art techniques.\",\"PeriodicalId\":129001,\"journal\":{\"name\":\"2020 International Conference on Computer, Electrical & Communication Engineering (ICCECE)\",\"volume\":\"32 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 International Conference on Computer, Electrical & Communication Engineering (ICCECE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICCECE48148.2020.9223095\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 International Conference on Computer, Electrical & Communication Engineering (ICCECE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCECE48148.2020.9223095","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Human actions are challenging to recognize because their appearance changes with the angle of perception. A multi-view camera setup can address this challenge, but it is not cost-effective. To handle this issue, this work proposes a multi-stream deep learning network trained on different 3D projected planes. The extracted projected planes, which represent different angles of perception, serve as an alternative to multi-view action recognition: they correspond to the top, side, and front views of the action videos. These planes are fed to a three-stream deep convolutional neural network, which uses transfer learning to avoid training from scratch. Finally, the scores from the three streams are fused to produce the final score for recognizing the query video. The proposed work is evaluated on the challenging, widely used, and publicly available KTH dataset. The results show that it performs better than state-of-the-art techniques.
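The abstract does not include code, so the following PyTorch sketch is only a rough illustration of the pipeline it describes: three projection planes extracted from a video volume, one pretrained 2D CNN stream per plane, and late fusion of the per-stream scores. The ResNet-18 backbones, the max-projection used to build the planes, and the plain averaging of softmax scores are all assumptions; the paper's actual backbone, projection construction, and fusion rule are not specified in the abstract.

```python
# Minimal sketch (not the authors' code): three 2D-CNN streams on
# front/top/side projections of a video volume, with late score fusion.
import torch
import torch.nn as nn
import torchvision.models as models


def extract_projection_planes(video: torch.Tensor) -> dict:
    """Collapse a grayscale video volume (T, H, W) into three 2D planes.

    Max-projecting along time, height and width gives rough front, top and
    side views of the motion; the paper's exact construction may differ.
    """
    front = video.max(dim=0).values  # (H, W): collapse time   -> front view
    top = video.max(dim=1).values    # (T, W): collapse height -> top view
    side = video.max(dim=2).values   # (T, H): collapse width  -> side view
    return {"front": front, "top": top, "side": side}


class ThreeStreamNet(nn.Module):
    """One ImageNet-pretrained 2D CNN per plane, fused by score averaging."""

    def __init__(self, num_classes: int = 6):  # KTH has 6 action classes
        super().__init__()
        self.streams = nn.ModuleDict()
        for name in ("front", "top", "side"):
            backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
            backbone.fc = nn.Linear(backbone.fc.in_features, num_classes)
            self.streams[name] = backbone

    def forward(self, planes: dict) -> torch.Tensor:
        scores = []
        for name, net in self.streams.items():
            x = planes[name].unsqueeze(0).unsqueeze(0)         # (1, 1, H, W)
            x = x.repeat(1, 3, 1, 1)                           # grey -> 3 channels
            x = nn.functional.interpolate(x, size=(224, 224))  # backbone input size
            scores.append(net(x).softmax(dim=1))
        return torch.stack(scores).mean(dim=0)                 # late score fusion


if __name__ == "__main__":
    video = torch.rand(32, 120, 160)  # toy clip: 32 frames of 120x160
    planes = extract_projection_planes(video)
    model = ThreeStreamNet().eval()
    with torch.no_grad():
        probs = model(planes)
    print(probs.argmax(dim=1))  # predicted action class
```

Because fusion happens only at the score level, each stream can be fine-tuned from its pretrained weights independently, which matches the abstract's point about using transfer learning to avoid training from scratch.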