Robust Duality Learning for Unsupervised Visible-Infrared Person Re-Identification
Yongxiang Li; Yuan Sun; Yang Qin; Dezhong Peng; Xi Peng; Peng Hu
IEEE Transactions on Information Forensics and Security, vol. 20, pp. 1937-1948. Published 2025-01-30. DOI: 10.1109/TIFS.2025.3536613
Journal impact factor: 6.3 (JCR Q1, Computer Science, Theory & Methods)
https://ieeexplore.ieee.org/document/10858072/
Citations: 0
Abstract
Unsupervised visible-infrared person re-identification (UVI-ReID) aims to retrieve pedestrian images of the same individual across distinct modalities. The task is challenging because of the inherent heterogeneity gap between modalities and the absence of annotations, which are prohibitively expensive to collect. Although existing methods employ self-training with clustering-generated pseudo-labels to bridge this gap, they implicitly assume that these pseudo-labels are predicted correctly. In practice, this assumption cannot be satisfied: training a perfect model is difficult in general, let alone without any ground truth, so pseudo-labeling errors are inevitable. Based on this observation, this study introduces a new learning paradigm for UVI-ReID that accounts for Pseudo-Label Noise (PLN), which encompasses three challenges: noise overfitting, error accumulation, and noisy cluster correspondence. To address these challenges, we propose a novel robust duality learning framework (RoDE) for UVI-ReID that mitigates the adverse impact of noisy pseudo-labels. Specifically, for noise overfitting, we propose a novel Robust Adaptive Learning mechanism (RAL) that dynamically prioritizes clean samples while deprioritizing noisy ones, thus avoiding overemphasizing noise. To circumvent the error accumulation of self-training, where a model tends to confirm its own mistakes, RoDE alternately trains two distinct models using pseudo-labels predicted by their counterparts, thereby maintaining diversity and avoiding collapse into noise. However, this leads to cross-cluster misalignment between the two models, in addition to the misalignment between modalities, resulting in dual noisy cluster correspondence that is difficult to optimize. To address this issue, a Cluster Consistency Matching mechanism (CCM) is presented to ensure reliable alignment across distinct modalities as well as across different models by leveraging cross-cluster similarities. Extensive experiments on three benchmark datasets demonstrate the effectiveness of the proposed RoDE.
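To make the idea of aligning clusters through cross-cluster similarities more concrete, the following is a minimal sketch and not the paper's CCM. It assumes each cluster is summarized by the mean of its features and that a pair of clusters counts as a reliable correspondence only when each is the other's most similar cluster under cosine similarity (a mutual nearest-neighbor rule chosen here purely for illustration). The function names, the toy features, and the cluster ids are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def centroids(features, labels):
    """Mean feature vector per pseudo-label (cluster id)."""
    sums, counts = {}, {}
    for feat, lab in zip(features, labels):
        acc = sums.setdefault(lab, [0.0] * len(feat))
        for i, x in enumerate(feat):
            acc[i] += x
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [x / counts[lab] for x in acc] for lab, acc in sums.items()}


def mutual_matches(cents_a, cents_b):
    """Keep pairs (a, b) where a and b are each other's most similar cluster.

    This mutual nearest-neighbor rule is an illustrative assumption,
    not the matching rule used by the paper.
    """
    if not cents_a or not cents_b:
        return {}
    best_ab = {
        a: max(cents_b, key=lambda b: cosine(ca, cents_b[b]))
        for a, ca in cents_a.items()
    }
    best_ba = {
        b: max(cents_a, key=lambda a: cosine(cents_a[a], cb))
        for b, cb in cents_b.items()
    }
    return {a: b for a, b in best_ab.items() if best_ba[b] == a}


# Toy 2-D features with hypothetical pseudo-labels for two modalities.
visible_feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
visible_labels = [0, 0, 1, 1]
infrared_feats = [[0.1, 1.0], [0.0, 0.8], [1.0, 0.2], [0.6, 0.4]]
infrared_labels = [5, 5, 7, 9]

matches = mutual_matches(
    centroids(visible_feats, visible_labels),
    centroids(infrared_feats, infrared_labels),
)
print(matches)  # {0: 7, 1: 5}
```

In this toy run, infrared cluster 9 is left unmatched because its closest visible cluster prefers cluster 7, which shows how a consistency check can discard an unreliable correspondence rather than force an alignment. The same routine could in principle be applied between the clusters produced by two different models instead of two modalities.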
About the journal:
The IEEE Transactions on Information Forensics and Security covers the sciences, technologies, and applications relating to information forensics, information security, biometrics, surveillance, and systems applications that incorporate these features.