{"title":"Dual-Level Matching With Outlier Filtering for Unsupervised Visible-Infrared Person Re-Identification","authors":"Mang Ye;Zesen Wu;Bo Du","doi":"10.1109/TPAMI.2025.3541053","DOIUrl":null,"url":null,"abstract":"Visible-infrared person re-identification (VI-ReID) is a challenging cross-modality retrieval task due to the large modality gap. While numerous efforts have been devoted to the supervised setting with a large amount of labeled cross-modality correspondences, few studies have tried to mitigate the modality gap by mining cross-modality correspondences in an unsupervised manner. However, existing works failed to capture the intrinsic relations among samples across two modalities, resulting in limited performance outcomes. In this paper, we propose a novel Progressive Graph Matching (PGM) approach to globally model the cross-modality relationships and instance-level affinities. PGM formulates cross-modality correspondence mining as a graph matching procedure, aiming to integrate global information by minimizing global matching costs. Considering that samples in wrong clusters cannot find reliable cross-modality correspondences by PGM, we further introduce a robust Dual-Level Matching (DLM) mechanism, combining the cluster-level PGM and Nearest Instance-Cluster Searching (NICS) with instance-level affinity optimization. Additionally, we design an Outlier Filter Strategy (OFS) to filter out unreliable cross-modality correspondences based on the dual-level relation constraints. To mitigate false accumulation in cross-modal correspondence learning, an Alternate Cross Contrastive Learning (ACCL) module is proposed to alternately adjust the dominated matching, i.e., visible-to-infrared or infrared-to-visible matching. 
Empirical results demonstrate the superiority of our unsupervised solution, achieving comparable performance with supervised counterparts.","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"47 5","pages":"3815-3829"},"PeriodicalIF":0.0000,"publicationDate":"2025-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on pattern analysis and machine intelligence","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10882953/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Visible-infrared person re-identification (VI-ReID) is a challenging cross-modality retrieval task due to the large modality gap. While numerous efforts have been devoted to the supervised setting with large amounts of labeled cross-modality correspondences, few studies have tried to mitigate the modality gap by mining cross-modality correspondences in an unsupervised manner. However, existing works fail to capture the intrinsic relations among samples across the two modalities, limiting their performance. In this paper, we propose a novel Progressive Graph Matching (PGM) approach to globally model the cross-modality relationships and instance-level affinities. PGM formulates cross-modality correspondence mining as a graph matching procedure, integrating global information by minimizing the global matching cost. Considering that samples in wrong clusters cannot find reliable cross-modality correspondences through PGM, we further introduce a robust Dual-Level Matching (DLM) mechanism, combining the cluster-level PGM and Nearest Instance-Cluster Searching (NICS) with instance-level affinity optimization. Additionally, we design an Outlier Filter Strategy (OFS) to filter out unreliable cross-modality correspondences based on the dual-level relation constraints. To mitigate error accumulation in cross-modality correspondence learning, an Alternate Cross Contrastive Learning (ACCL) module is proposed to alternately adjust the dominant matching direction, i.e., visible-to-infrared or infrared-to-visible matching. Empirical results demonstrate the superiority of our unsupervised solution, achieving performance comparable to that of supervised counterparts.
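The core idea behind PGM — choosing cross-modality cluster correspondences by minimizing a global matching cost rather than matching each cluster to its nearest neighbor independently — can be illustrated with a minimal sketch. This is not the paper's implementation: the centroids, function names, and brute-force search over permutations are all illustrative (a real system would use the Hungarian algorithm on learned features).

```python
# Illustrative sketch (NOT the paper's code) of global cluster matching:
# cast cross-modality correspondence mining as an assignment problem and
# pick the visible->infrared matching with the minimum total cost.
# Brute force over permutations for clarity; larger cluster counts would
# call for the Hungarian algorithm instead.
from itertools import permutations
import math


def global_match(vis_centroids, ir_centroids):
    """Return the visible->infrared cluster assignment (as a permutation)
    that minimizes the total Euclidean matching cost."""
    n = len(vis_centroids)
    best_cost, best_assign = math.inf, None
    for perm in permutations(range(n)):
        cost = sum(
            math.dist(vis_centroids[i], ir_centroids[perm[i]])
            for i in range(n)
        )
        if cost < best_cost:
            best_cost, best_assign = cost, perm
    return best_assign, best_cost


# Toy 2-D centroids (hypothetical): greedy nearest-neighbor matching would
# pair visible cluster 0 with infrared cluster 0 (distance 0.4), forcing
# visible cluster 1 onto the far infrared cluster 1 (distance 2.0, total
# 2.4). The global minimum instead swaps the pairs for a total cost of 1.6.
vis = [(0.0, 0.0), (1.0, 0.0)]
ir = [(0.4, 0.0), (-1.0, 0.0)]
assign, total = global_match(vis, ir)
print(assign, total)  # (1, 0) 1.6
```

The toy example shows why the abstract emphasizes integrating global information: a locally optimal (greedy) choice for one cluster can force a much worse choice elsewhere, whereas minimizing the summed cost trades a slightly worse first pair for a much better overall matching.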