Adaptive pooling with dual-stage fusion for skeleton-based action recognition
Authors: Cong Wu, Xiao-Jun Wu, Tianyang Xu, Josef Kittler
Journal: Neural Networks, volume 190, Article 107615 (JCR Q1, Computer Science, Artificial Intelligence; impact factor 6.0)
Published: 2025-05-28
DOI: 10.1016/j.neunet.2025.107615
URL: https://www.sciencedirect.com/science/article/pii/S0893608025004952
Pooling is essential in computer vision; however, for skeleton-based action recognition, (1) the unique structure of the skeleton limits the applicability of existing pooling strategies, and (2) the high compactness and low redundancy of the skeleton make information loss after pooling more likely to degrade accuracy. Considering these factors, we propose an Improved Graph Pooling Network, referred to as IGPN. First, our method incorporates a region-aware pooling strategy based on structural partitioning. Specifically, we use the correlation matrix of the original features to adaptively adjust the information weights across different regions of the newly generated feature, allowing for more flexible and effective processing. To prevent the irreversible loss of discriminative information caused by pooling, we introduce a dual-stage fusion strategy that includes a cross fusion module and an information supplement module, which complement feature-level and data-level information, respectively. As a plug-and-play structure, the proposed operation can be seamlessly integrated with existing graph convolutional networks. Based on these innovations, we develop IGPN-Light, optimised for efficiency, and IGPN-Heavy, optimised for accuracy. Extensive evaluations on several challenging benchmarks demonstrate the effectiveness of our solution. For instance, in cross-subject evaluation on the NTU-RGB+D 60 dataset, IGPN-Light achieves significant accuracy improvements over the baseline while reducing FLOPs (floating-point operations) by 60∼70%. Meanwhile, IGPN-Heavy further boosts performance by prioritising accuracy over efficiency.
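The abstract does not give the pooling formula, so the following is only a rough sketch of the general idea of region-based pooling with correlation-derived weights, not the IGPN implementation. The function name `region_pool`, the softmax weighting, and the contiguous five-joint partition are all assumptions made for illustration; the only detail taken from the dataset is that NTU RGB+D skeletons have 25 joints.

```python
import numpy as np

def region_pool(x, regions):
    """Pool joint features into one feature vector per region.

    x: (V, C) array of per-joint features.
    regions: list of joint-index lists (a hypothetical partition).
    Returns an (R, C) array, one row per region.
    """
    # Joint-to-joint correlation of the original features, shape (V, V).
    corr = np.corrcoef(x)
    pooled = []
    for idx in regions:
        sub = corr[np.ix_(idx, idx)]      # correlations inside this region
        score = sub.mean(axis=1)          # average agreement of each joint with its region
        w = np.exp(score - score.max())   # numerically stable softmax (assumed weighting)
        w /= w.sum()                      # weights are positive and sum to 1
        pooled.append(w @ x[idx])         # weighted average of the region's joints
    return np.stack(pooled)

# Toy input: 25 joints with 8 feature channels each.
rng = np.random.default_rng(0)
x = rng.normal(size=(25, 8))
# Placeholder partition into 5 contiguous groups; a real one would follow body parts.
regions = [list(range(i, i + 5)) for i in range(0, 25, 5)]
out = region_pool(x, regions)
print(out.shape)
```

Because each region's output is a convex combination of its joints, pooling here reduces 25 nodes to 5 while keeping every pooled value within the range of the joints it summarises. How IGPN actually derives and applies its weights may differ.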
About the journal:
Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.