Fine-Grained Semantic Representation Learning for Visible–Infrared Person Re-Identification

IF 4.3 · CAS Zone 2 (Multidisciplinary) · JCR Q1 · Engineering, Electrical & Electronic

IEEE Sensors Journal, vol. 25, no. 16, pp. 31065-31077. Pub Date: 2025-07-07. DOI: 10.1109/JSEN.2025.3584080. IEEE Xplore: https://ieeexplore.ieee.org/document/11072047/

Qiang Wang;Meiling Zhang;Xin Li;Hubo Guo;Huijie Fan

Citations: 0

Abstract

Fine-Grained Semantic Representation Learning for Visible–Infrared Person Re-Identification

Visible–infrared person re-identification (VI-ReID) is challenging because of the inherent cross-modality discrepancy. The key to reducing this discrepancy is to filter out modality-specific interference and project the pedestrian features from the two modalities into a shared feature space. However, previous works mainly exploit high-level information and pay less attention to exploring middle-level features, which limits the accuracy and generalization ability of cross-modality recognition. To address this shortcoming, we propose a novel fine-grained semantic representation learning (FSRL) network that explores the identity information in middle-level features for the VI-ReID task. Specifically, we first propose a plug-and-play modality normalization and compensation (MNC) module, which reduces the modality discrepancy while compensating for the identity information lost when modality information is eliminated. Second, we propose an intermediate feature aggregation (IFA) module that obtains rich, fine-grained identity information in the middle layer and guides the model to extract more identity-related features for recognition. Finally, we introduce a semantic-aligned feature learning (SAFL) module that extracts latent semantic part features from the feature map shared by the two modalities to achieve cross-modality semantic alignment. Extensive experiments on the SYSU-MM01, RegDB, and LLCM datasets demonstrate the effectiveness and superiority of the proposed method.
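To make the "shared feature space" idea concrete, here is a minimal sketch of how cross-modality matching is commonly carried out once features from both modalities live in one space: an infrared query is compared against visible gallery features by cosine similarity, and the gallery is ranked. This is not the FSRL network or any of its modules. The function names (`l2_normalize`, `cosine_similarity`, `rank_gallery`), the identity labels, and the three-dimensional toy features are all hypothetical and chosen only for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a feature vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors of equal length."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def rank_gallery(query_feat, gallery):
    """Return gallery identity labels ordered from most to least similar to the query."""
    scored = [(cosine_similarity(query_feat, feat), pid) for pid, feat in gallery]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [pid for _, pid in scored]


# Hypothetical toy features, assumed to be already projected into a shared space.
infrared_query = [0.9, 0.1, 0.0]
visible_gallery = [
    ("person_A", [1.0, 0.0, 0.1]),
    ("person_B", [0.0, 1.0, 0.0]),
    ("person_C", [0.0, 0.3, 1.0]),
]

ranking = rank_gallery(infrared_query, visible_gallery)
print(ranking)
```

In this toy setup the query lies closest to `person_A`, so that identity is ranked first. In a real system the feature vectors would come from the trained network, and the better the modality discrepancy is suppressed, the more reliably the correct identity appears at the top of the ranking.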

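Source journal

IEEE Sensors Journal (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 7.70

Self-citation rate: 14.00%

Articles published: 2058

Review time: 5.2 months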

About the journal: The fields of interest of the IEEE Sensors Journal are the theory, design, fabrication, manufacturing, and applications of devices for sensing and transducing physical, chemical, and biological phenomena, with emphasis on the electronics and physics aspects of sensors and integrated sensors-actuators. IEEE Sensors Journal deals with the following:

- Sensor Phenomenology, Modelling, and Evaluation
- Sensor Materials, Processing, and Fabrication
- Chemical and Gas Sensors
- Microfluidics and Biosensors
- Optical Sensors
- Physical Sensors: Temperature, Mechanical, Magnetic, and others
- Acoustic and Ultrasonic Sensors
- Sensor Packaging
- Sensor Networks
- Sensor Applications
- Sensor Systems: Signals, Processing, and Interfaces
- Actuators and Sensor Power Systems
- Sensor Signal Processing for high precision and stability (amplification, filtering, linearization, modulation/demodulation) and under harsh conditions (EMC, radiation, humidity, temperature); energy consumption/harvesting
- Sensor Data Processing (soft computing with sensor data, e.g., pattern recognition, machine learning, evolutionary computation; sensor data fusion; processing of wave (e.g., electromagnetic and acoustic) and non-wave (e.g., chemical, gravity, particle, thermal, radiative and non-radiative) sensor data; detection, estimation, and classification based on sensor data)
- Sensors in Industrial Practice