Continuous Gesture Recognition with Hand-Oriented Spatiotemporal Feature

Zhipeng Liu, Xiujuan Chai, Zhuang Liu, Xilin Chen
{"title":"基于手导向时空特征的连续手势识别","authors":"Zhipeng Liu, Xiujuan Chai, Zhuang Liu, Xilin Chen","doi":"10.1109/ICCVW.2017.361","DOIUrl":null,"url":null,"abstract":"In this paper, an efficient spotting-recognition framework is proposed to tackle the large scale continuous gesture recognition problem with the RGB-D data input. Concretely, continuous gestures are firstly segmented into isolated gestures based on the accurate hand positions obtained by two streams Faster R-CNN hand detector In the subsequent recognition stage, firstly, towards the gesture representation, a specific hand-oriented spatiotemporal (ST) feature is extracted for each isolated gesture video by 3D convolutional network (C3D). In this feature, only the hand regions and face location are considered, which can effectively block the negative influence of the distractors, such as the background, cloth and the body and so on. Next, the extracted features from calibrated RGB and depth channels are fused to boost the representative power and the final classification is achieved by using the simple linear SVM. Extensive experiments are conducted on the validation and testing sets of the Continuous Gesture Datasets (ConGD) to validate the effectiveness of the proposed recognition framework. Our method achieves the promising performance with the mean Jaccard Index of 0.6103 and outperforms other results in the ChaLearn LAP Large-scale Continuous Gesture Recognition Challenge.","PeriodicalId":149766,"journal":{"name":"2017 IEEE International Conference on Computer Vision Workshops (ICCVW)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"49","resultStr":"{\"title\":\"Continuous Gesture Recognition with Hand-Oriented Spatiotemporal Feature\",\"authors\":\"Zhipeng Liu, Xiujuan Chai, Zhuang Liu, Xilin Chen\",\"doi\":\"10.1109/ICCVW.2017.361\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, an efficient spotting-recognition framework is proposed to tackle the large scale continuous gesture recognition problem with the RGB-D data input. Concretely, continuous gestures are firstly segmented into isolated gestures based on the accurate hand positions obtained by two streams Faster R-CNN hand detector In the subsequent recognition stage, firstly, towards the gesture representation, a specific hand-oriented spatiotemporal (ST) feature is extracted for each isolated gesture video by 3D convolutional network (C3D). In this feature, only the hand regions and face location are considered, which can effectively block the negative influence of the distractors, such as the background, cloth and the body and so on. Next, the extracted features from calibrated RGB and depth channels are fused to boost the representative power and the final classification is achieved by using the simple linear SVM. Extensive experiments are conducted on the validation and testing sets of the Continuous Gesture Datasets (ConGD) to validate the effectiveness of the proposed recognition framework. 
Our method achieves the promising performance with the mean Jaccard Index of 0.6103 and outperforms other results in the ChaLearn LAP Large-scale Continuous Gesture Recognition Challenge.\",\"PeriodicalId\":149766,\"journal\":{\"name\":\"2017 IEEE International Conference on Computer Vision Workshops (ICCVW)\",\"volume\":\"17 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"49\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 IEEE International Conference on Computer Vision Workshops (ICCVW)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICCVW.2017.361\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE International Conference on Computer Vision Workshops (ICCVW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCVW.2017.361","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 49

Abstract

In this paper, an efficient spotting-recognition framework is proposed to tackle the large-scale continuous gesture recognition problem with RGB-D data input. Concretely, continuous gestures are first segmented into isolated gestures based on the accurate hand positions obtained by a two-stream Faster R-CNN hand detector. In the subsequent recognition stage, a specific hand-oriented spatiotemporal (ST) feature is extracted for each isolated gesture video by a 3D convolutional network (C3D). This feature considers only the hand regions and the face location, which effectively blocks the negative influence of distractors such as the background, clothing and the body. Next, the features extracted from the calibrated RGB and depth channels are fused to boost the representative power, and the final classification is performed with a simple linear SVM. Extensive experiments are conducted on the validation and testing sets of the Continuous Gesture Dataset (ConGD) to validate the effectiveness of the proposed recognition framework. Our method achieves promising performance with a mean Jaccard Index of 0.6103 and outperforms the other entries in the ChaLearn LAP Large-scale Continuous Gesture Recognition Challenge.
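As described above, the spotting stage splits the continuous stream into isolated gestures based on the detected hand positions. The sketch below only illustrates the general idea (cutting at frames where the hand drops back to a rest position); the function name, threshold and rest heuristic are assumptions for illustration, not the authors' exact segmentation rule.

```python
from typing import List, Tuple

def spot_gestures(hand_y: List[float], rest_y: float = 0.85,
                  min_len: int = 8) -> List[Tuple[int, int]]:
    """Split a continuous sequence into isolated gesture segments.

    hand_y: normalized vertical position of the detected hand per frame
            (0 = top of image, 1 = bottom); frames without a detection
            can be filled with 1.0.
    rest_y: frames whose hand position lies below this line are treated
            as rest frames (hands down), i.e. gesture boundaries.
    Returns a list of (start_frame, end_frame) intervals.
    """
    segments, start = [], None
    for t, y in enumerate(hand_y):
        active = y < rest_y              # hand raised -> inside a gesture
        if active and start is None:
            start = t                    # gesture begins
        elif not active and start is not None:
            if t - start >= min_len:     # drop spurious short segments
                segments.append((start, t - 1))
            start = None
    if start is not None and len(hand_y) - start >= min_len:
        segments.append((start, len(hand_y) - 1))
    return segments
```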
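For the recognition stage, the abstract states that C3D features from the calibrated RGB and depth channels are fused and classified with a simple linear SVM. A minimal late-fusion sketch with scikit-learn, assuming one C3D feature vector per isolated clip has already been extracted; the feature dimension, normalization and random placeholders are assumptions.

```python
import numpy as np
from sklearn.preprocessing import normalize
from sklearn.svm import LinearSVC

# Assumed inputs: one C3D feature vector per isolated gesture clip,
# extracted separately from the RGB stream and the depth stream.
rgb_feats = np.random.rand(200, 4096)    # placeholder features
depth_feats = np.random.rand(200, 4096)  # placeholder features
labels = np.random.randint(0, 249, 200)  # ConGD defines 249 gesture classes

# Late fusion by concatenation, then per-sample L2 normalization.
fused = normalize(np.hstack([rgb_feats, depth_feats]))

# Simple linear SVM classifier on the fused representation.
clf = LinearSVC(C=1.0)
clf.fit(fused, labels)
predictions = clf.predict(fused)
```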
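The reported score is the mean Jaccard Index used by the ChaLearn ConGD protocol. Below is a sketch of the per-sequence computation as commonly defined (frame-level intersection over union per gesture label, averaged over labels); edge cases and tie handling follow the official evaluation script rather than this sketch. The final challenge score is this per-video value averaged over all test sequences.

```python
import numpy as np

def sequence_jaccard(gt, pred, n_frames: int) -> float:
    """Mean Jaccard index for one video.

    gt, pred: lists of (start_frame, end_frame, label) with inclusive,
              1-based frame indices.
    For every label occurring in the ground truth or the prediction,
    compare the binary frame masks of that label and average the IoU.
    """
    labels = {l for _, _, l in gt} | {l for _, _, l in pred}
    scores = []
    for lab in labels:
        a = np.zeros(n_frames, dtype=bool)   # ground-truth mask for lab
        b = np.zeros(n_frames, dtype=bool)   # predicted mask for lab
        for s, e, l in gt:
            if l == lab:
                a[s - 1:e] = True
        for s, e, l in pred:
            if l == lab:
                b[s - 1:e] = True
        union = np.logical_or(a, b).sum()
        scores.append(np.logical_and(a, b).sum() / union if union else 0.0)
    return float(np.mean(scores)) if scores else 0.0
```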