Gopa Bhaumik, Monu Verma, M. C. Govil, S. Vipparthi
Published in: 2020 IEEE 15th International Conference on Industrial and Information Systems (ICIIS). Publication date: 2020-11-26. DOI: 10.1109/ICIIS51140.2020.9342652 (https://doi.org/10.1109/ICIIS51140.2020.9342652)
CrossFeat: Multi-scale Cross Feature Aggregation Network for Hand Gesture Recognition
Hand gestures are considered an effective means of communication in the field of human-computer interaction. However, designing an efficient hand gesture recognition (HGR) system remains challenging because of real-world complexities such as cluttered backgrounds, illumination changes, and occlusion. The paper proposes a lightweight CNN-based network named CrossFeat (Multi-scale Cross Feature Aggregation network) for explicit hand gesture recognition. CrossFeat employs multi-scale convolutional layers and preserves the spatial features of the hand gesture region. The use of multi-scale filters (1 × 1, 3 × 3, 5 × 5, and 7 × 7) allows the network to learn both fine-grained and coarse edges from different regions of the hand gesture. These complementary features enhance the learning ability of the network. Moreover, the cross-layer connectivity enables gradient information to reach the top layers and prevents it from diminishing in the upstream layers. The proposed network is evaluated on three benchmark datasets: ASL Finger Spelling, NUS-I, and NUS-II. The experimental results and analysis show that aggregating multi-scale and cross features improves the performance of the HGR system compared with existing networks.
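The multi-scale filtering and cross-layer aggregation described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not specify layer counts, filter weights, or how cross features are combined, so the uniform averaging kernels (`box_kernel`), the concatenation-based skip connection (`cross_aggregate`), and the 8 × 8 toy input are all assumptions made for illustration.

```python
# Illustrative sketch only; NOT the CrossFeat architecture from the paper.
# Assumptions: fixed averaging kernels stand in for learned filters, and
# cross-layer connectivity is modelled as simple channel concatenation.

def conv2d_same(image, kernel):
    """Slide a square, odd-sized kernel over a 2-D list with zero padding.

    As in most CNN libraries, this is cross-correlation (no kernel flip);
    the output has the same height and width as the input.
    """
    h, w = len(image), len(image[0])
    k = len(kernel)
    pad = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - pad, j + dj - pad
                    if 0 <= y < h and 0 <= x < w:  # outside pixels count as zero
                        acc += image[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out


def box_kernel(k):
    """Hypothetical stand-in for a learned k x k filter: a uniform average."""
    return [[1.0 / (k * k)] * k for _ in range(k)]


def multi_scale_block(image, scales=(1, 3, 5, 7)):
    """Apply one filter per scale; each response becomes one output channel."""
    return [conv2d_same(image, box_kernel(k)) for k in scales]


def cross_aggregate(image, feature_maps):
    """Concatenate the block's input with its outputs (skip-style connection)."""
    return [image] + feature_maps


# Toy 8 x 8 single-channel input.
image = [[float((r * 8 + c) % 5) for c in range(8)] for r in range(8)]
features = multi_scale_block(image)
aggregated = cross_aggregate(image, features)
print(len(features), len(aggregated))  # 4 scale channels, 5 after aggregation
```

Small kernels respond to local detail while larger ones summarize wider neighbourhoods, which is the intuition behind combining the four scales. Carrying the input forward alongside the filtered channels gives later layers a direct path to earlier features; in a trained network that same path also shortens the route gradients travel during backpropagation.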