基于少镜头骨架的动作识别的双注意焦点网络

IF 7.6 1区 计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Jie Liu , Chongben Tao , Zhongwei Shen , Cong Wu , Tianyang Xu , Xizhao Luo , Feng Cao , Zhen Gao , Zufeng Zhang , Sai Xu
{"title":"基于少镜头骨架的动作识别的双注意焦点网络","authors":"Jie Liu ,&nbsp;Chongben Tao ,&nbsp;Zhongwei Shen ,&nbsp;Cong Wu ,&nbsp;Tianyang Xu ,&nbsp;Xizhao Luo ,&nbsp;Feng Cao ,&nbsp;Zhen Gao ,&nbsp;Zufeng Zhang ,&nbsp;Sai Xu","doi":"10.1016/j.knosys.2025.114549","DOIUrl":null,"url":null,"abstract":"<div><div>Few-shot action recognition is a challenging yet practically significant problem that involves developing a model capable of learning discriminative features from a small number of labeled samples to recognize new action categories. Current methods typically infer spatial relationships either within or across skeletons to learn action representations, but this often results in features with insufficient discriminability and ineffective attention to critical body parts. To address these limitations, we propose DAF-Net, a novel framework that employs focal attention to jointly model intra-skeleton and inter-skeleton relationships, enhancing discriminative feature learning in few-shot skeleton-based action recognition. Unlike traditional methods that focus solely on intra-skeleton dependencies or inter-skeleton structures, DAF-Net dynamically integrates both components via focal attention, enhancing key body part representation and refining features, particularly in data-scarce conditions. Furthermore, DAF-Net incorporates an enhanced prototype generation strategy, optimizing class prototype formation via cosine similarity weighting to further improve feature discriminability in multi-shot scenarios. In temporal matching, cosine similarity evaluates local feature similarity within skeleton sequences, capturing directional variations of specific joints over time. Extensive experiments on three benchmark datasets (NTU-T, NTU-S, and Kinetics-skeleton) confirm significant performance gains, validating the effectiveness of DAF-Net.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"330 ","pages":"Article 114549"},"PeriodicalIF":7.6000,"publicationDate":"2025-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Dual attention focus network for few-shot skeleton-based action recognition\",\"authors\":\"Jie Liu ,&nbsp;Chongben Tao ,&nbsp;Zhongwei Shen ,&nbsp;Cong Wu ,&nbsp;Tianyang Xu ,&nbsp;Xizhao Luo ,&nbsp;Feng Cao ,&nbsp;Zhen Gao ,&nbsp;Zufeng Zhang ,&nbsp;Sai Xu\",\"doi\":\"10.1016/j.knosys.2025.114549\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Few-shot action recognition is a challenging yet practically significant problem that involves developing a model capable of learning discriminative features from a small number of labeled samples to recognize new action categories. Current methods typically infer spatial relationships either within or across skeletons to learn action representations, but this often results in features with insufficient discriminability and ineffective attention to critical body parts. To address these limitations, we propose DAF-Net, a novel framework that employs focal attention to jointly model intra-skeleton and inter-skeleton relationships, enhancing discriminative feature learning in few-shot skeleton-based action recognition. Unlike traditional methods that focus solely on intra-skeleton dependencies or inter-skeleton structures, DAF-Net dynamically integrates both components via focal attention, enhancing key body part representation and refining features, particularly in data-scarce conditions. Furthermore, DAF-Net incorporates an enhanced prototype generation strategy, optimizing class prototype formation via cosine similarity weighting to further improve feature discriminability in multi-shot scenarios. In temporal matching, cosine similarity evaluates local feature similarity within skeleton sequences, capturing directional variations of specific joints over time. Extensive experiments on three benchmark datasets (NTU-T, NTU-S, and Kinetics-skeleton) confirm significant performance gains, validating the effectiveness of DAF-Net.</div></div>\",\"PeriodicalId\":49939,\"journal\":{\"name\":\"Knowledge-Based Systems\",\"volume\":\"330 \",\"pages\":\"Article 114549\"},\"PeriodicalIF\":7.6000,\"publicationDate\":\"2025-09-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Knowledge-Based Systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0950705125015886\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Knowledge-Based Systems","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0950705125015886","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

摘要

少量动作识别是一个具有挑战性但又具有实际意义的问题,它涉及到开发一个能够从少量标记样本中学习判别特征以识别新动作类别的模型。目前的方法通常通过推断骨骼内部或骨骼之间的空间关系来学习动作表征,但这通常会导致特征的可辨别性不足,并且对关键身体部位的关注无效。为了解决这些限制,我们提出了DAF-Net,这是一个新的框架,它使用焦点关注来共同建模骨架内和骨架间的关系,增强了基于少数镜头骨架的动作识别中的判别特征学习。与仅关注骨架内依赖关系或骨架间结构的传统方法不同,DAF-Net通过焦点关注动态集成这两个组件,增强关键身体部位的表示并精炼特征,特别是在数据稀缺的条件下。此外,DAF-Net还采用了一种增强的原型生成策略,通过余弦相似度加权优化类原型的形成,进一步提高了多镜头场景下的特征可辨别性。在时间匹配中,余弦相似性评估骨骼序列中的局部特征相似性,捕获特定关节随时间的方向变化。在三个基准数据集(NTU-T、NTU-S和Kinetics-skeleton)上进行的大量实验证实了显著的性能提升,验证了DAF-Net的有效性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Dual attention focus network for few-shot skeleton-based action recognition
Few-shot action recognition is a challenging yet practically significant problem that involves developing a model capable of learning discriminative features from a small number of labeled samples to recognize new action categories. Current methods typically infer spatial relationships either within or across skeletons to learn action representations, but this often results in features with insufficient discriminability and ineffective attention to critical body parts. To address these limitations, we propose DAF-Net, a novel framework that employs focal attention to jointly model intra-skeleton and inter-skeleton relationships, enhancing discriminative feature learning in few-shot skeleton-based action recognition. Unlike traditional methods that focus solely on intra-skeleton dependencies or inter-skeleton structures, DAF-Net dynamically integrates both components via focal attention, enhancing key body part representation and refining features, particularly in data-scarce conditions. Furthermore, DAF-Net incorporates an enhanced prototype generation strategy, optimizing class prototype formation via cosine similarity weighting to further improve feature discriminability in multi-shot scenarios. In temporal matching, cosine similarity evaluates local feature similarity within skeleton sequences, capturing directional variations of specific joints over time. Extensive experiments on three benchmark datasets (NTU-T, NTU-S, and Kinetics-skeleton) confirm significant performance gains, validating the effectiveness of DAF-Net.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Knowledge-Based Systems
Knowledge-Based Systems 工程技术-计算机:人工智能
CiteScore
14.80
自引率
12.50%
发文量
1245
审稿时长
7.8 months
期刊介绍: Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on knowledge-based and other artificial intelligence techniques-based systems. The journal aims to support human prediction and decision-making through data science and computation techniques, provide a balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信