Bridging modal gaps: A Cross-Modal Feature Complementation and Feature Projection Network for visible-infrared person re-identification

IF 5.5 · CAS Tier 2 (Computer Science) · JCR Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Lei Shi, Yumao Ma, Yongcai Tao, Haowen Liu, Lin Wei, Yucheng Shi, Yufei Gao
{"title":"桥接模态差距:一种跨模态特征互补和特征投影网络用于可见-红外人再识别","authors":"Lei Shi ,&nbsp;Yumao Ma ,&nbsp;Yongcai Tao ,&nbsp;Haowen Liu ,&nbsp;Lin Wei ,&nbsp;Yucheng Shi ,&nbsp;Yufei Gao","doi":"10.1016/j.neucom.2025.130367","DOIUrl":null,"url":null,"abstract":"<div><div>Visible-infrared person re-identification (VI-ReID) presents a significant challenge due to the substantial modal differences between infrared (IR) and visible (VIS) images, primarily resulting from their distinct color distributions and textural characteristics. One effective strategy for reducing this modal gap is to utilize feature projection to create a shared embedded space for the modal features. However, a key research question remains: how to effectively align cross-modal features during projection while minimizing the loss of information. To address this challenge, this paper proposed a Cross-Modal Feature Complementation and Feature Projection Network (FCFPN). Specifically, a modal complementation strategy was introduced to bridge the discrepancies between cross-modal features and facilitate their alignment. Additionally, a cross-modal feature projection mechanism was employed to embed modality-correlated features into the shared feature space, thereby mitigating feature loss caused by modality differences. Furthermore, multi-channel and multi-level features were extracted from the shared space to enhance the overall feature representation. Extensive experimental results demonstrated that the proposed FCFPN model effectively mitigated the modal discrepancy, achieving 84.7% Rank-1 accuracy and 86.9% mAP in the indoor test mode of the SYSU-MM01 dataset, and 93.0% Rank-1 accuracy and 87.3% mAP in the VIS-to-IR test mode of the RegDB dataset, thereby outperforming several state-of-the-art methods.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"642 ","pages":"Article 130367"},"PeriodicalIF":5.5000,"publicationDate":"2025-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Bridging modal gaps: A Cross-Modal Feature Complementation and Feature Projection Network for visible-infrared person re-identification\",\"authors\":\"Lei Shi ,&nbsp;Yumao Ma ,&nbsp;Yongcai Tao ,&nbsp;Haowen Liu ,&nbsp;Lin Wei ,&nbsp;Yucheng Shi ,&nbsp;Yufei Gao\",\"doi\":\"10.1016/j.neucom.2025.130367\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Visible-infrared person re-identification (VI-ReID) presents a significant challenge due to the substantial modal differences between infrared (IR) and visible (VIS) images, primarily resulting from their distinct color distributions and textural characteristics. One effective strategy for reducing this modal gap is to utilize feature projection to create a shared embedded space for the modal features. However, a key research question remains: how to effectively align cross-modal features during projection while minimizing the loss of information. To address this challenge, this paper proposed a Cross-Modal Feature Complementation and Feature Projection Network (FCFPN). Specifically, a modal complementation strategy was introduced to bridge the discrepancies between cross-modal features and facilitate their alignment. Additionally, a cross-modal feature projection mechanism was employed to embed modality-correlated features into the shared feature space, thereby mitigating feature loss caused by modality differences. 
Furthermore, multi-channel and multi-level features were extracted from the shared space to enhance the overall feature representation. Extensive experimental results demonstrated that the proposed FCFPN model effectively mitigated the modal discrepancy, achieving 84.7% Rank-1 accuracy and 86.9% mAP in the indoor test mode of the SYSU-MM01 dataset, and 93.0% Rank-1 accuracy and 87.3% mAP in the VIS-to-IR test mode of the RegDB dataset, thereby outperforming several state-of-the-art methods.</div></div>\",\"PeriodicalId\":19268,\"journal\":{\"name\":\"Neurocomputing\",\"volume\":\"642 \",\"pages\":\"Article 130367\"},\"PeriodicalIF\":5.5000,\"publicationDate\":\"2025-05-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Neurocomputing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0925231225010392\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neurocomputing","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0925231225010392","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

Visible-infrared person re-identification (VI-ReID) presents a significant challenge due to the substantial modal differences between infrared (IR) and visible (VIS) images, primarily resulting from their distinct color distributions and textural characteristics. One effective strategy for reducing this modal gap is to utilize feature projection to create a shared embedded space for the modal features. However, a key research question remains: how to effectively align cross-modal features during projection while minimizing the loss of information. To address this challenge, this paper proposed a Cross-Modal Feature Complementation and Feature Projection Network (FCFPN). Specifically, a modal complementation strategy was introduced to bridge the discrepancies between cross-modal features and facilitate their alignment. Additionally, a cross-modal feature projection mechanism was employed to embed modality-correlated features into the shared feature space, thereby mitigating feature loss caused by modality differences. Furthermore, multi-channel and multi-level features were extracted from the shared space to enhance the overall feature representation. Extensive experimental results demonstrated that the proposed FCFPN model effectively mitigated the modal discrepancy, achieving 84.7% Rank-1 accuracy and 86.9% mAP in the indoor test mode of the SYSU-MM01 dataset, and 93.0% Rank-1 accuracy and 87.3% mAP in the VIS-to-IR test mode of the RegDB dataset, thereby outperforming several state-of-the-art methods.
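To make the projection idea above concrete, the following is a minimal PyTorch-style sketch of a projection head shared by both modalities, preceded by a toy complementation step. It is not the authors' FCFPN: the module name `SharedProjection`, the feature dimensions, and the statistic-mixing rule in `complement` are illustrative assumptions inferred only from the abstract.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SharedProjection(nn.Module):
    """Hypothetical sketch: embed VIS and IR features into one shared space.

    Not the authors' FCFPN implementation; names, dimensions, and the
    complementation rule are illustrative assumptions from the abstract.
    """

    def __init__(self, in_dim: int = 2048, embed_dim: int = 512):
        super().__init__()
        # A projection head with weights shared by both modalities defines
        # the common embedding space.
        self.proj = nn.Sequential(
            nn.Linear(in_dim, embed_dim),
            nn.BatchNorm1d(embed_dim),
        )

    @staticmethod
    def complement(x_vis: torch.Tensor, x_ir: torch.Tensor, alpha: float = 0.5):
        # Toy "complementation": each modality borrows part of the other's
        # batch statistics to narrow the gap before projection (assumed rule).
        mixed_vis = x_vis + alpha * (x_ir.mean(dim=0, keepdim=True) - x_vis.mean(dim=0, keepdim=True))
        mixed_ir = x_ir + alpha * (x_vis.mean(dim=0, keepdim=True) - x_ir.mean(dim=0, keepdim=True))
        return mixed_vis, mixed_ir

    def forward(self, x_vis: torch.Tensor, x_ir: torch.Tensor):
        x_vis, x_ir = self.complement(x_vis, x_ir)
        # Projecting both modalities with the same weights places them in
        # one shared space, where a re-ID loss can align identities.
        z_vis = self.proj(x_vis)
        z_ir = self.proj(x_ir)
        return F.normalize(z_vis, dim=1), F.normalize(z_ir, dim=1)


if __name__ == "__main__":
    # Pooled backbone features for 8 VIS and 8 IR images (illustrative shapes).
    vis_feat = torch.randn(8, 2048)
    ir_feat = torch.randn(8, 2048)
    model = SharedProjection()
    z_vis, z_ir = model(vis_feat, ir_feat)
    print(z_vis.shape, z_ir.shape)  # torch.Size([8, 512]) for both
```

In the paper's pipeline, multi-channel and multi-level features would additionally be drawn from this shared space to enrich the representation; that stage is omitted from the sketch.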
Source journal
Neurocomputing
Category: Engineering & Technology - Computer Science: Artificial Intelligence
CiteScore: 13.10
Self-citation rate: 10.00%
Articles published: 1382
Review time: 70 days
Journal description: Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice and applications are the essential topics being covered.