MSE-GCN: A Multiscale Spatiotemporal Feature Aggregation Enhanced Efficient Graph Convolutional Network for Dynamic Sign Language Recognition
Authors: Neelma Naz; Hasan Sajid; Sara Ali; Osman Hasan; Muhammad Khurram Ehsan
Journal: IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 9, no. 4, pp. 2979-2994 (JCR Q1, Computer Science, Artificial Intelligence; impact factor 5.3)
Publication date: 2024-12-13
Publication type: Journal Article
DOI: 10.1109/TETCI.2024.3509500
URL: https://ieeexplore.ieee.org/document/10799160/
Citation count: 0
Abstract
Graph convolutional networks have emerged as an active area of research for skeleton-based sign language recognition (SLR). An essential problem in this approach is to efficiently extract the most discriminative features, capable of modeling short-range and long-range spatial and temporal information over all skeleton joints, while keeping inference costs low. To address this issue, we propose a novel multi-scale efficient graph convolutional network (MSE-GCN) for skeleton-based SLR. The proposed network uses separable convolution layers in a multi-scale setting, embedded in a multi-branch (MB) network with an early fusion scheme, resulting in an accurate, computationally efficient, and fast system. In addition, we propose a novel hybrid attention module, named Spatial Temporal Joint Part Attention (ST-JPA), to identify the most important body parts as well as the most informative joints in specific frames of the whole sign sequence. The proposed network is evaluated on five challenging sign language datasets, WLASL-100, WLASL-300, WLASL-1000, MINDS-Libras, and LIBRAS-UFOP, achieving state-of-the-art (SOTA) accuracies of 85.27%, 81.59%, 71.75%, 97.442 ± 1.01%, and 88.59 ± 3.60%, respectively, while incurring lower computational costs.
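To give a rough sense of why separable convolutions in a multi-scale setting can lower cost, the sketch below counts weights for parallel temporal convolution branches, once with standard 1D convolutions and once with depthwise separable ones. It is an illustration only, not the authors' implementation: the channel widths, the kernel sizes, and all function names are hypothetical, and the abstract does not state the actual layer configuration.

```python
def standard_conv_params(c_in, c_out, kernel_size):
    """Weights in a standard 1D (temporal) convolution, bias ignored."""
    return kernel_size * c_in * c_out


def separable_conv_params(c_in, c_out, kernel_size):
    """Weights in a depthwise separable 1D convolution, bias ignored.

    One depthwise filter per input channel, followed by a 1x1
    pointwise layer that mixes channels.
    """
    depthwise = kernel_size * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


def multiscale_params(c_in, c_out, kernel_sizes, param_fn):
    """Total weights across parallel branches, one branch per kernel size."""
    return sum(param_fn(c_in, c_out, k) for k in kernel_sizes)


# Hypothetical sizes chosen for illustration (not taken from the paper).
C_IN, C_OUT = 64, 64
KERNEL_SIZES = (3, 5, 9)

standard_total = multiscale_params(C_IN, C_OUT, KERNEL_SIZES, standard_conv_params)
separable_total = multiscale_params(C_IN, C_OUT, KERNEL_SIZES, separable_conv_params)

print("standard branches: ", standard_total)
print("separable branches:", separable_total)
print("reduction factor:  ", round(standard_total / separable_total, 2))
```

Under these assumed sizes the separable variant needs several times fewer weights, and the gap widens as kernel sizes grow, because the kernel size multiplies only the depthwise term.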
About the journal:
The IEEE Transactions on Emerging Topics in Computational Intelligence (TETCI) publishes original articles on emerging aspects of computational intelligence, including theory, applications, and surveys.
TETCI is an electronic-only publication. It publishes six issues per year.
Authors are encouraged to submit manuscripts on any emerging topic in computational intelligence, especially nature-inspired computing topics not covered by other IEEE Computational Intelligence Society journals. Illustrative examples include glial cell networks, computational neuroscience, brain-computer interfaces, ambient intelligence, non-fuzzy computing with words, artificial life, cultural learning, artificial endocrine networks, social reasoning, artificial hormone networks, and computational intelligence for the IoT and Smart-X technologies.