GaitBranch: A multi-branch refinement model combined with a frame-channel attention mechanism

Impact factor 3.5 · Zone 3 (Computer Science) · JCR Q2 (Computer Science, Artificial Intelligence)

Computer Vision and Image Understanding Pub Date : 2025-08-25 DOI:10.1016/j.cviu.2025.104463

Huakang Li, Yidan Qiu, Huimin Zhao, Jin Zhan, Rongjun Chen, Jinchang Ren, Ying Gao, Wing W.Y. Ng

Volume 260, Article 104463. Article page: https://www.sciencedirect.com/science/article/pii/S1077314225001869

Citations: 0

Abstract

GaitBranch: A multi-branch refinement model combined with frame-channel attention mechanism for gait recognition

Accurately representing human motion in video-based gait recognition is challenging due to the difficulty in obtaining an ideal gait silhouette sequence that captures comprehensive information. To address this challenge, we propose GaitBranch, a novel method that emphasizes local key information of human motion in different layers of the neural network. It divides the neural network into multiple branches using the multi-branch refinement (MBR) module and extracts local key frames from various body parts through the frame-channel attention mechanism (FCAM) to form a comprehensive representation of human motion patterns. GaitBranch achieves high gait recognition accuracy on the CASIA-B (98.6%, 96.1%, and 85.5% for normal walking, carrying a bag, and wearing a coat conditions), OU-MVLP (92.3%), and GREW (79.8%) datasets, demonstrating its robustness across different environments. Ablation experiments confirm the efficacy of our method and demonstrate that the performance gains result from the optimized model structure rather than simply increasing parameters.
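The abstract names two ideas, per-body-part branches and attention over frames and channels, without giving their internals. The following is a minimal illustrative sketch of that general pattern only, not the authors' MBR or FCAM design. The function names, the body-part dictionary, and the choice to reuse raw activations as attention scores are all assumptions made for illustration; a trained model would learn its scoring layers.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def frame_channel_pool(part_features):
    """Pool a (frames x channels) feature matrix over time.

    For each channel, frames are weighted by a softmax over that channel's
    activations, so frames with strong responses dominate the pooled value.
    ASSUMPTION: activations double as attention scores here; this is a
    parameter-free stand-in, not the scoring used in the paper.
    """
    num_channels = len(part_features[0])
    pooled = []
    for c in range(num_channels):
        column = [frame[c] for frame in part_features]
        weights = softmax(column)
        pooled.append(sum(w * x for w, x in zip(weights, column)))
    return pooled


def multi_branch_descriptor(sequence):
    """Build one descriptor from several body-part branches.

    `sequence` maps a (hypothetical) body-part name to its
    frames x channels features. Each part is pooled by its own branch
    and the results are concatenated in sorted key order.
    """
    descriptor = []
    for part in sorted(sequence):
        descriptor.extend(frame_channel_pool(sequence[part]))
    return descriptor


# Toy input: 3 frames, 2 channels per body part (values are made up).
example = {
    "legs": [[0.0, 1.0], [2.0, 1.0], [0.0, 1.0]],
    "torso": [[1.0, 0.0], [1.0, 0.0], [1.0, 3.0]],
}
descriptor = multi_branch_descriptor(example)
```

In this toy input the first "legs" channel responds only in the middle frame, so its pooled value is pulled above the plain temporal mean, while a channel that is constant across frames is returned unchanged. That is the intuition behind selecting local key frames per body part.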

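Source journal

Computer Vision and Image Understanding (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 7.80
Self-citation rate: 4.40%
Articles published: 112
Review time: 79 days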

About the journal: The central focus of this journal is the computer analysis of pictorial information. Computer Vision and Image Understanding publishes papers covering all aspects of image analysis from the low-level, iconic processes of early vision to the high-level, symbolic processes of recognition and interpretation. A wide range of topics in the image understanding area is covered, including papers offering insights that differ from predominant views. Research areas include: • Theory • Early vision • Data structures and representations • Shape • Range • Motion • Matching and recognition • Architecture and languages • Vision systems