Jie Liu, Chongben Tao, Zhongwei Shen, Cong Wu, Tianyang Xu, Xizhao Luo, Feng Cao, Zhen Gao, Zufeng Zhang, Sai Xu
Knowledge-Based Systems, vol. 330, Article 114549. Published 2025-09-27. DOI: 10.1016/j.knosys.2025.114549. Impact factor: 7.6 (JCR Q1, Computer Science, Artificial Intelligence). https://www.sciencedirect.com/science/article/pii/S0950705125015886
Dual attention focus network for few-shot skeleton-based action recognition
Few-shot action recognition is a challenging yet practically significant problem that involves developing a model capable of learning discriminative features from a small number of labeled samples to recognize new action categories. Current methods typically infer spatial relationships either within or across skeletons to learn action representations, but this often results in features with insufficient discriminability and ineffective attention to critical body parts. To address these limitations, we propose DAF-Net, a novel framework that employs focal attention to jointly model intra-skeleton and inter-skeleton relationships, enhancing discriminative feature learning in few-shot skeleton-based action recognition. Unlike traditional methods that focus solely on intra-skeleton dependencies or inter-skeleton structures, DAF-Net dynamically integrates both components via focal attention, enhancing key body part representation and refining features, particularly in data-scarce conditions. Furthermore, DAF-Net incorporates an enhanced prototype generation strategy, optimizing class prototype formation via cosine similarity weighting to further improve feature discriminability in multi-shot scenarios. In temporal matching, cosine similarity evaluates local feature similarity within skeleton sequences, capturing directional variations of specific joints over time. Extensive experiments on three benchmark datasets (NTU-T, NTU-S, and Kinetics-skeleton) confirm significant performance gains, validating the effectiveness of DAF-Net.
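The abstract names two uses of cosine similarity (weighting support samples when forming a class prototype, and scoring local similarity during temporal matching) but gives no formulas. The sketch below is only an illustration of those two ideas under stated assumptions, not DAF-Net's actual method: it assumes each support sample is already a fixed-length embedding, that weights are a softmax over each sample's cosine similarity to the plain mean, and that temporal matching compares frame-aligned sequences of equal length. The names `weighted_prototype`, `temporal_match_score`, and `temperature` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def weighted_prototype(support, temperature=1.0):
    """Build a class prototype from K support embeddings.

    Assumed scheme (not taken from the paper): each embedding is weighted
    by the softmax of its cosine similarity to the unweighted mean, so
    samples that point away from the class consensus contribute less.
    Returns (prototype, weights).
    """
    k = len(support)
    dim = len(support[0])
    mean = [sum(s[d] for s in support) / k for d in range(dim)]
    sims = [cosine(s, mean) for s in support]
    exps = [math.exp(x / temperature) for x in sims]
    total = sum(exps)
    weights = [e / total for e in exps]
    proto = [sum(w * s[d] for w, s in zip(weights, support)) for d in range(dim)]
    return proto, weights


def temporal_match_score(query_seq, proto_seq):
    """Mean frame-wise cosine similarity between two sequences.

    Assumes both sequences have the same number of frames and that frame t
    of the query corresponds to frame t of the prototype (no alignment step).
    """
    if len(query_seq) != len(proto_seq):
        raise ValueError("sequences must have the same number of frames")
    per_frame = [cosine(q, p) for q, p in zip(query_seq, proto_seq)]
    return sum(per_frame) / len(per_frame)


if __name__ == "__main__":
    # Three-shot toy example: the third embedding is an outlier.
    shots = [[1.0, 0.0], [1.0, 0.1], [-1.0, 0.0]]
    proto, weights = weighted_prototype(shots)
    print("weights:", weights)
    print("prototype:", proto)
```

Because cosine similarity ignores vector magnitude, the weighting and matching above respond only to the direction of each embedding, which fits the abstract's remark about capturing directional variations. In the one-shot case the single weight is 1 and the prototype equals the support embedding, so the weighting only matters in multi-shot settings.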
About the journal:
Knowledge-Based Systems is an international, interdisciplinary journal in artificial intelligence that publishes original, innovative, and creative research results in the field. It focuses on systems built on knowledge-based and other artificial intelligence techniques. The journal aims to support human prediction and decision-making through data science and computation techniques, to provide balanced coverage of theory and practical study, and to encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. It emphasizes applications in business, government, education, engineering, and healthcare.