Modality-convolutions: Multi-modal gesture recognition based on convolutional neural network

Da Huo, Yufeng Chen, Fengxia Li, Zhengchao Lei
DOI: 10.1109/ICCSE.2017.8085515
Published in: 2017 12th International Conference on Computer Science and Education (ICCSE), August 2017
Citations: 4

Abstract

We propose a novel feature-extraction method for multi-modal images called modality-convolution. It extracts both intra- and inter-modality information. Moreover, it performs data fusion at the pixel level, so that the complementary information contained in multi-modal data is fully exploited. Building on the modality-convolution, we describe a modality-CNN for multi-modal gesture recognition. The modality-CNN is adopted in the gesture-recognition framework to extract features from RGB-D images, while a DBN represents the skeleton data. The probabilities produced by the two networks are then fused and fed into an HMM to classify dynamic gestures. We use the Jaccard index to measure gesture-recognition accuracy. A comparative experiment on the ChaLearn LAP 2014 gesture dataset shows that the modality-convolution extracts inter- and intra-modality information effectively, which helps improve accuracy.
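The abstract does not spell out how the Jaccard index is computed, but the ChaLearn LAP 2014 evaluation scores each gesture as the overlap between predicted and ground-truth frame intervals, J = |A ∩ B| / |A ∪ B|. A minimal sketch of that frame-level measure (function name and the half-open interval convention are assumptions, not from the paper):

```python
import numpy as np

def jaccard_index(pred_intervals, gt_intervals, n_frames):
    """Frame-level Jaccard index between predicted and ground-truth
    gesture segments, each given as half-open (start, end) frame intervals.

    Returns |A ∩ B| / |A ∪ B| over the binary frame masks; defined as 1.0
    when both masks are empty (perfect agreement on "no gesture")."""
    pred = np.zeros(n_frames, dtype=bool)
    gt = np.zeros(n_frames, dtype=bool)
    for start, end in pred_intervals:
        pred[start:end] = True
    for start, end in gt_intervals:
        gt[start:end] = True
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0
    return float(np.logical_and(pred, gt).sum() / union)
```

For example, predicting frames [0, 5) against a ground truth of [0, 10) gives an overlap of 5 frames out of a union of 10, i.e. J = 0.5.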