Multi-granularity feature intersection learning for visible-infrared person re-identification
Sixian Chan, Jie Wang, Jiaao Cui, Jie Hu, Zhuorong Li, Jiafa Mao
Complex & Intelligent Systems (Journal Article), published 2025-05-14. Impact factor: 5.0; JCR Q1 (Computer Science, Artificial Intelligence).
DOI: 10.1007/s40747-025-01853-5 (https://doi.org/10.1007/s40747-025-01853-5)
Citation count: 0
Abstract
This paper proposes a multi-granularity feature intersection network (MGFINet) for visible-infrared person re-identification (VI-ReID), which aims to retrieve images of the same pedestrian across cameras operating in different spectra. The key challenge is to extract pedestrian descriptions with both inter-class discriminability and intra-class similarity. Previous methods overlook the potential loss of detail during representation extraction and the data bias in the metric function, which limits further improvements in retrieval performance. In addition, the discrepancy between how the loss is computed for representation learning and for metric learning affects the model’s training. To address these issues, MGFINet comprises three components: a hierarchical part pooling method (HPP), a hierarchical part restriction method (HPC), and a feature intersection (FI) loss. HPP adopts a hierarchical framework to extract multi-granularity pedestrian representations and performs inter-layer fusion to exploit both the high-resolution information of shallow layers and the semantic representability of deep layers. HPP also employs part pooling with different step sizes to capture pedestrian details in each layer. HPC spreads the identity loss across all layers, shortening the path for gradient backpropagation and further optimizing fine-grained features in shallow layers. The FI loss combines representation and metric learning by incorporating hyperparameters of classifiers into metric learning, mitigating data bias and reducing the gap between the two learning processes. Extensive experiments on two public datasets, SYSU-MM01 and RegDB, demonstrate the effectiveness of the proposed method.
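The abstract's idea of part pooling with different step sizes can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `part_pool` and `multi_granularity`, the granularities (1, 2, 4), the use of average pooling, and the toy feature map are all assumptions made for illustration. The feature map is simplified to one channel vector per row, and each granularity splits the rows into equal horizontal stripes whose height plays the role of the step size.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def part_pool(rows: List[Vector], num_parts: int) -> List[Vector]:
    """Average-pool a feature map (one channel vector per row) into
    `num_parts` equal horizontal stripes. Hypothetical helper."""
    height = len(rows)
    if num_parts < 1 or height % num_parts != 0:
        raise ValueError("height must be divisible by num_parts")
    step = height // num_parts  # stripe height, i.e. the pooling step size
    channels = len(rows[0])
    parts = []
    for start in range(0, height, step):
        stripe = rows[start:start + step]
        # Mean of each channel over the rows in this stripe.
        parts.append([sum(r[c] for r in stripe) / step for c in range(channels)])
    return parts


def multi_granularity(
    rows: List[Vector], granularities: Sequence[int] = (1, 2, 4)
) -> Dict[int, List[Vector]]:
    """Pool the same feature map at several granularities (assumed values)."""
    return {p: part_pool(rows, p) for p in granularities}


# Toy feature map: 4 rows, 2 channels per row.
feature_map = [[float(i), float(2 * i)] for i in range(4)]
pooled = multi_granularity(feature_map)
for num_parts, parts in pooled.items():
    print(num_parts, parts)
```

A coarse granularity (one part) yields a single global descriptor, while finer granularities keep more local detail. In a real network the same pooling would typically be applied to convolutional feature maps at several layers.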
About the journal:
Complex & Intelligent Systems aims to provide a forum for presenting and discussing novel approaches, tools and techniques meant for attaining a cross-fertilization between the broad fields of complex systems, computational simulation, and intelligent analytics and visualization. The transdisciplinary research that the journal focuses on will expand the boundaries of our understanding by investigating the principles and processes that underlie many of the most profound problems facing society today.