{"title":"基于提取投影平面的三流深度网络人体动作识别","authors":"S. Sahoo, S. Ari","doi":"10.1109/ICCECE48148.2020.9223095","DOIUrl":null,"url":null,"abstract":"Human actions are challenging to recognize as it varies its shape from different angle of perception. To tackle this challenge, a multi view camera set up can be arranged, however, it is not cost effective. To handle this issue, a multi stream deep learning network is proposed in this work which is trained on different 3D projected planes. The extracted projected planes which represents different angle of perception, are used as an alternative to multi view action recognition. The projected planes are such that they represents top, side and front view for the action videos. The projected planes are then fed to a three stream deep convolutional neural network. The network uses transfer learning technique to avoid training from scratch. Finally, the scores from three streams are fused to provide the final score to recognize the query video. To evaluate the proposed work, the challenging KTH dataset is used which is widely used and publicly available. The results show that the proposed work performs better compared to state-of-the-art techniques.","PeriodicalId":129001,"journal":{"name":"2020 International Conference on Computer, Electrical & Communication Engineering (ICCECE)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"A Three Stream Deep Network on Extracted Projected Planes for Human Action Recognition\",\"authors\":\"S. Sahoo, S. Ari\",\"doi\":\"10.1109/ICCECE48148.2020.9223095\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Human actions are challenging to recognize as it varies its shape from different angle of perception. To tackle this challenge, a multi view camera set up can be arranged, however, it is not cost effective. To handle this issue, a multi stream deep learning network is proposed in this work which is trained on different 3D projected planes. The extracted projected planes which represents different angle of perception, are used as an alternative to multi view action recognition. The projected planes are such that they represents top, side and front view for the action videos. The projected planes are then fed to a three stream deep convolutional neural network. The network uses transfer learning technique to avoid training from scratch. Finally, the scores from three streams are fused to provide the final score to recognize the query video. To evaluate the proposed work, the challenging KTH dataset is used which is widely used and publicly available. 
The results show that the proposed work performs better compared to state-of-the-art techniques.\",\"PeriodicalId\":129001,\"journal\":{\"name\":\"2020 International Conference on Computer, Electrical & Communication Engineering (ICCECE)\",\"volume\":\"32 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 International Conference on Computer, Electrical & Communication Engineering (ICCECE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICCECE48148.2020.9223095\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 International Conference on Computer, Electrical & Communication Engineering (ICCECE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCECE48148.2020.9223095","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Human actions are challenging to recognize because their appearance changes with the angle of perception. A multi-view camera setup can address this challenge, but it is not cost-effective. To handle this issue, this work proposes a multi-stream deep learning network trained on different 3D projected planes. The extracted projected planes, which represent different angles of perception, serve as an alternative to multi-view action recognition: they correspond to the top, side, and front views of the action videos. These planes are fed to a three-stream deep convolutional neural network, which uses transfer learning to avoid training from scratch. Finally, the scores from the three streams are fused to produce the final score for recognizing the query video. The proposed work is evaluated on the challenging, widely used, and publicly available KTH dataset. The results show that it performs better than state-of-the-art techniques.
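The abstract does not include code, so the following PyTorch sketch is only a rough illustration of the pipeline it describes: three projection planes extracted from a video volume, one pretrained 2D CNN stream per plane, and late fusion of the per-stream scores. The ResNet-18 backbones, the max-projection used to build the planes, and the plain averaging of softmax scores are all assumptions; the paper's actual backbone, projection construction, and fusion rule are not specified in the abstract.

```python
# Minimal sketch (not the authors' code): three 2D-CNN streams on
# front/top/side projections of a video volume, with late score fusion.
import torch
import torch.nn as nn
import torchvision.models as models


def extract_projection_planes(video: torch.Tensor) -> dict:
    """Collapse a grayscale video volume (T, H, W) into three 2D planes.

    Max-projecting along time, height and width gives rough front, top and
    side views of the motion; the paper's exact construction may differ.
    """
    front = video.max(dim=0).values  # (H, W): collapse time   -> front view
    top = video.max(dim=1).values    # (T, W): collapse height -> top view
    side = video.max(dim=2).values   # (T, H): collapse width  -> side view
    return {"front": front, "top": top, "side": side}


class ThreeStreamNet(nn.Module):
    """One ImageNet-pretrained 2D CNN per plane, fused by score averaging."""

    def __init__(self, num_classes: int = 6):  # KTH has 6 action classes
        super().__init__()
        self.streams = nn.ModuleDict()
        for name in ("front", "top", "side"):
            backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
            backbone.fc = nn.Linear(backbone.fc.in_features, num_classes)
            self.streams[name] = backbone

    def forward(self, planes: dict) -> torch.Tensor:
        scores = []
        for name, net in self.streams.items():
            x = planes[name].unsqueeze(0).unsqueeze(0)         # (1, 1, H, W)
            x = x.repeat(1, 3, 1, 1)                           # grey -> 3 channels
            x = nn.functional.interpolate(x, size=(224, 224))  # backbone input size
            scores.append(net(x).softmax(dim=1))
        return torch.stack(scores).mean(dim=0)                 # late score fusion


if __name__ == "__main__":
    video = torch.rand(32, 120, 160)  # toy clip: 32 frames of 120x160
    planes = extract_projection_planes(video)
    model = ThreeStreamNet().eval()
    with torch.no_grad():
        probs = model(planes)
    print(probs.argmax(dim=1))  # predicted action class
```

Because fusion happens only at the score level, each stream can be fine-tuned from its pretrained weights independently, which matches the abstract's point about using transfer learning to avoid training from scratch.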