Zhixuan Wu, Nan Ma, Y. Cheung, Jiahong Li, Qin He, Yongqiang Yao, Guoping Zhang

2020 16th International Conference on Computational Intelligence and Security (CIS), November 2020. DOI: 10.1109/CIS52066.2020.00032
Improved Spatio-Temporal Convolutional Neural Networks for Traffic Police Gestures Recognition
In the era of artificial intelligence, human action recognition is a hot spot in the field of vision research, as it makes interaction between humans and machines possible, and many intelligent applications benefit from it. Traditional traffic police gesture recognition methods often ignore spatial and temporal information, so their timeliness in human-computer interaction is limited. We propose a Spatio-Temporal Convolutional Neural Network (ST-CNN) method that can detect and identify traffic police gestures by exploiting the correlation between spatial and temporal information. Specifically, we use a convolutional neural network for feature extraction, taking into account both the spatial and temporal characteristics of human actions. After the spatial and temporal features are extracted, an improved LSTM network effectively fuses, classifies, and recognizes the various features, thereby achieving human action recognition. The method makes full use of the spatial and temporal information in the video and selects effective features to reduce the computational load of the network. Extensive experiments on a Chinese traffic police gesture dataset show that our method is superior.
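The pipeline the abstract describes (per-frame convolutional feature extraction, followed by an LSTM that fuses the temporal sequence and feeds a classifier) can be sketched as below. This is a minimal illustrative sketch, not the paper's actual ST-CNN: the kernel size, hidden width, frame resolution, and all weights are hypothetical placeholders, and the paper's "improved LSTM" is stood in for by a standard single-cell LSTM update.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d_valid(frame, kernel):
    """Naive 2-D valid convolution + ReLU: one spatial feature map per frame."""
    H, W = frame.shape
    kh, kw = kernel.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(frame[i:i + kh, j:j + kw] * kernel)
    return np.maximum(out, 0.0)

def lstm_step(x, h, c, W, U, b):
    """One standard LSTM cell update fusing the current frame's features."""
    d = h.shape[0]
    z = W @ x + U @ h + b                  # all four gate pre-activations
    i = 1.0 / (1.0 + np.exp(-z[0:d]))      # input gate
    f = 1.0 / (1.0 + np.exp(-z[d:2 * d]))  # forget gate
    o = 1.0 / (1.0 + np.exp(-z[2 * d:3 * d]))  # output gate
    g = np.tanh(z[3 * d:4 * d])            # candidate cell state
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

# Toy "video": 8 frames of 16x16 pose maps; 8 hypothetical gesture classes.
T, H, W_img, n_classes, hidden = 8, 16, 16, 8, 32
video = rng.standard_normal((T, H, W_img))

kernel = rng.standard_normal((3, 3)) * 0.1       # spatial filter (random demo weights)
feat_dim = (H - 2) * (W_img - 2)                 # flattened feature-map size
Wg = rng.standard_normal((4 * hidden, feat_dim)) * 0.05
Ug = rng.standard_normal((4 * hidden, hidden)) * 0.05
bg = np.zeros(4 * hidden)
W_out = rng.standard_normal((n_classes, hidden)) * 0.1

h = np.zeros(hidden)
c = np.zeros(hidden)
for t in range(T):                               # temporal fusion over frames
    feats = conv2d_valid(video[t], kernel).ravel()   # spatial features
    h, c = lstm_step(feats, h, c, Wg, Ug, bg)

logits = W_out @ h                               # classify the fused representation
probs = np.exp(logits - logits.max())
probs /= probs.sum()
```

With trained (rather than random) weights, `probs` would give a distribution over the gesture classes for the clip; the loop structure shows why selecting compact per-frame features keeps the recurrent fusion step cheap.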