Huakang Li, Yidan Qiu, Huimin Zhao, Jin Zhan, Rongjun Chen, Jinchang Ren, Ying Gao, Wing W.Y. Ng
Computer Vision and Image Understanding, Volume 260, Article 104463. Published 2025-08-25. DOI: 10.1016/j.cviu.2025.104463. Impact factor: 3.5; JCR Q2 (Computer Science, Artificial Intelligence). Source: https://www.sciencedirect.com/science/article/pii/S1077314225001869
GaitBranch: A multi-branch refinement model combined with frame-channel attention mechanism for gait recognition
Accurately representing human motion in video-based gait recognition is challenging due to the difficulty in obtaining an ideal gait silhouette sequence that captures comprehensive information. To address this challenge, we propose GaitBranch, a novel method that emphasizes local key information of human motion in different layers of the neural network. It divides the neural network into multiple branches using the multi-branch refinement (MBR) module and extracts local key frames from various body parts through the frame-channel attention mechanism (FCAM) to form a comprehensive representation of human motion patterns. GaitBranch achieves high gait recognition accuracy on the CASIA-B (98.6%, 96.1%, and 85.5% for normal walking, carrying a bag, and wearing a coat conditions), OU-MVLP (92.3%), and GREW (79.8%) datasets, demonstrating its robustness across different environments. Ablation experiments confirm the efficacy of our method and demonstrate that the performance gains result from the optimized model structure rather than simply increasing parameters.
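The abstract does not specify how FCAM is computed, so the following is only a minimal sketch of the general idea of attending over frames separately for each channel and body part. It assumes a feature tensor of shape (frames, channels, horizontal body strips) and uses the raw activations as attention scores; the function names `softmax` and `frame_channel_pool`, the tensor layout, and the scoring rule are all hypothetical and are not taken from the paper.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along one axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def frame_channel_pool(features):
    """Collapse the frame axis with per-channel, per-part attention weights.

    features: array of shape (T, C, P) = (frames, channels, body strips).
    Returns (pooled, weights) with shapes (C, P) and (T, C, P).
    Assumption: scores are the raw activations; a real model would
    learn the scoring function instead.
    """
    # Each (channel, part) pair gets its own distribution over frames,
    # so different body parts can emphasize different key frames.
    weights = softmax(features, axis=0)
    pooled = (weights * features).sum(axis=0)
    return pooled, weights


# Toy input: 30 frames, 8 channels, 4 horizontal strips (made-up sizes).
rng = np.random.default_rng(0)
features = rng.normal(size=(30, 8, 4))
pooled, weights = frame_channel_pool(features)
print(pooled.shape)
```

Because the weights for each channel and strip sum to one over the frame axis, the pooled value is a convex combination of that channel's per-frame activations, which is one simple way to let "key frames" dominate the sequence-level descriptor.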
About the journal:
The central focus of this journal is the computer analysis of pictorial information. Computer Vision and Image Understanding publishes papers covering all aspects of image analysis from the low-level, iconic processes of early vision to the high-level, symbolic processes of recognition and interpretation. A wide range of topics in the image understanding area is covered, including papers offering insights that differ from predominant views.
Research Areas Include:
• Theory
• Early vision
• Data structures and representations
• Shape
• Range
• Motion
• Matching and recognition
• Architecture and languages
• Vision systems