CrossFeat: Multi-scale Cross Feature Aggregation Network for Hand Gesture Recognition

2020 IEEE 15th International Conference on Industrial and Information Systems (ICIIS) Pub Date : 2020-11-26 DOI:10.1109/ICIIS51140.2020.9342652

Gopa Bhaumik, Monu Verma, M. C. Govil, S. Vipparthi


Citations: 2

Abstract



CrossFeat: Multi-scale Cross Feature Aggregation Network for Hand Gesture Recognition

Hand gestures are an effective means of communication in human-computer interaction. However, designing an efficient hand gesture recognition (HGR) system remains challenging because of real-world complexities such as cluttered backgrounds, illumination changes, and occlusion. The paper proposes a lightweight CNN-based network named CrossFeat, a multi-scale cross feature aggregation network for explicit hand gesture recognition. CrossFeat employs multi-scale convolutional layers and preserves the spatial features of the hand gesture region. The use of multi-scale filters (1 × 1, 3 × 3, 5 × 5, and 7 × 7) allows the network to learn fine-grained and coarse edges from different regions of the hand gestures. These complementary features enhance the learning ability of the network. Moreover, the cross-layer connectivity enables gradient information to reach the top layers and prevents it from diminishing in the upstream layers. The proposed network is evaluated on three benchmark datasets: ASL Finger Spelling, NUS-I, and NUS-II. The experimental results and analysis show that aggregating multi-scale and cross features improves HGR performance compared with existing networks.
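To make the multi-scale idea concrete, here is a minimal pure-Python sketch, not the authors' implementation. It applies one filter at each of the four sizes named in the abstract to the same input and stacks the responses as channels, then joins the outputs of two blocks. The uniform averaging kernels stand in for learned weights, the function names (`conv2d_same`, `box_kernel`, `multi_scale_block`, `cross_aggregate`) are hypothetical, and modelling cross-layer connectivity as channel concatenation is an assumption, since the abstract does not say how the layers are connected.

```python
# Illustrative sketch only; names and design choices are assumptions,
# not taken from the CrossFeat paper.
KERNEL_SIZES = (1, 3, 5, 7)  # filter sizes named in the abstract


def conv2d_same(image, kernel):
    """Single-channel 2D cross-correlation with zero padding (same output size)."""
    k = len(kernel)
    pad = k // 2
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    yy, xx = y + i - pad, x + j - pad
                    # Positions outside the image count as zero.
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += image[yy][xx] * kernel[i][j]
            out[y][x] = acc
    return out


def box_kernel(k):
    """Hypothetical stand-in for a learned k x k filter: uniform averaging."""
    return [[1.0 / (k * k)] * k for _ in range(k)]


def multi_scale_block(image, kernel_sizes=KERNEL_SIZES):
    """Apply one filter per scale to the same input; one output map per scale."""
    return [conv2d_same(image, box_kernel(k)) for k in kernel_sizes]


def cross_aggregate(earlier, later):
    """Assumed form of cross-layer aggregation: concatenate along channels."""
    return earlier + later


if __name__ == "__main__":
    # Toy 8 x 8 checkerboard in place of a hand gesture image.
    image = [[float((x + y) % 2) for x in range(8)] for y in range(8)]
    block1 = multi_scale_block(image)
    block2 = multi_scale_block(block1[1])  # feed the 3 x 3 response forward
    features = cross_aggregate(block1, block2)
    # channels, height, width
    print(len(features), len(features[0]), len(features[0][0]))
```

Because every branch pads its input to keep the spatial size unchanged, the four responses can be stacked directly. The small kernels respond to fine edges and the large ones to coarser structure, which is the complementarity the abstract describes.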

