Adaptive pooling with dual-stage fusion for skeleton-based action recognition

Impact factor: 6.0 | CAS Zone 1 (Computer Science) | JCR Q1 (Computer Science, Artificial Intelligence)
Cong Wu, Xiao-Jun Wu, Tianyang Xu, Josef Kittler
Neural Networks, volume 190, Article 107615. Published 2025-05-28. Not open access.
DOI: 10.1016/j.neunet.2025.107615
URL: https://www.sciencedirect.com/science/article/pii/S0893608025004952
Citations: 0

Abstract

Adaptive pooling with dual-stage fusion for skeleton-based action recognition
Pooling is essential in computer vision; however, for skeleton-based action recognition, (1) the unique structure of the skeleton limits the applicability of existing pooling strategies, and (2) the high compactness and low redundancy of the skeleton make information loss after pooling more likely to degrade accuracy. Considering these factors, we propose an Improved Graph Pooling Network, referred to as IGPN. First, our method incorporates a region-aware pooling strategy based on structural partitioning. Specifically, we use the correlation matrix of the original features to adaptively adjust the information weights across different regions of the newly generated feature, allowing for more flexible and effective processing. To prevent the irreversible loss of discriminative information caused by pooling, we introduce a dual-stage fusion strategy comprising a cross fusion module and an information supplement module, which complement feature-level and data-level information, respectively. As a plug-and-play structure, the proposed operation can be seamlessly integrated with existing graph convolutional networks. Based on these innovations, we develop IGPN-Light, optimised for efficiency, and IGPN-Heavy, optimised for accuracy. Extensive evaluations on several challenging benchmarks demonstrate the effectiveness of our solution. For instance, in cross-subject evaluation on the NTU-RGB+D 60 dataset, IGPN-Light achieves significant accuracy improvements over the baseline while reducing FLOPs (floating-point operations) by 60% to 70%. Meanwhile, IGPN-Heavy further boosts performance by prioritising accuracy over efficiency.
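The abstract does not give the pooling formula, so the following is only a rough sketch of what correlation-weighted, partition-based pooling over skeleton joints could look like. Everything in it is an assumption made for illustration: the five-region partition of a 25-joint skeleton (`REGIONS`), the function name `region_pool`, and the choice to weight each joint by the softmax of its mean correlation with the other joints in its region. It is not the IGPN implementation and does not cover the cross fusion or information supplement modules.

```python
import numpy as np

# Hypothetical partition of a 25-joint skeleton into 5 body regions.
# The index groups are illustrative only, not the partition used in the paper.
REGIONS = [
    [0, 1, 2, 3, 20],        # torso and head
    [4, 5, 6, 7, 21, 22],    # left arm
    [8, 9, 10, 11, 23, 24],  # right arm
    [12, 13, 14, 15],        # left leg
    [16, 17, 18, 19],        # right leg
]


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def region_pool(features, regions=REGIONS):
    """Pool joint features of shape (V, C) into region features of shape (R, C).

    Assumed weighting rule: within each region, a joint's weight is the
    softmax of its mean correlation with the joints of that region, where
    correlations come from the correlation matrix of the input features.
    """
    corr = np.corrcoef(features)               # (V, V) joint-to-joint correlation
    pooled = np.zeros((len(regions), features.shape[1]))
    weights = []
    for r, idx in enumerate(regions):
        sub = corr[np.ix_(idx, idx)]           # correlations inside this region
        w = softmax(sub.mean(axis=1))          # per-joint weights, summing to 1
        pooled[r] = w @ features[idx]          # weighted average of joint features
        weights.append(w)
    return pooled, weights


# Usage: 25 joints with 64-dimensional features are reduced to 5 region nodes.
rng = np.random.default_rng(0)
x = rng.standard_normal((25, 64))
pooled, weights = region_pool(x)
print(pooled.shape)  # (5, 64)
```

In this sketch the graph shrinks from 25 nodes to 5, so any layer applied after pooling touches fewer nodes. That is the general mechanism by which pooling can lower the operation count, although the 60% to 70% figure reported above is specific to the authors' architecture.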
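Source journal
Neural Networks
Category: Engineering and Technology, Computer Science: Artificial Intelligence
CiteScore: 13.90
Self-citation rate: 7.70%
Articles published: 425
Review time: 67 days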
About the journal: Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.