Visible-Infrared Person Re-Identification With Real-World Label Noise

IF 8.3 · Zone 1 (Engineering & Technology) · JCR Q1 (ENGINEERING, ELECTRICAL & ELECTRONIC)

IEEE Transactions on Circuits and Systems for Video Technology. Pub Date: 2025-01-06. DOI: 10.1109/TCSVT.2025.3526449

Ruiheng Zhang; Zhe Cao; Yan Huang; Shuo Yang; Lixin Xu; Min Xu

Volume 35, Issue 5, pages 4857-4869. Publication type: Journal Article. Open access: no. URL: https://ieeexplore.ieee.org/document/10829635/

Citations: 0

Abstract

In recent years, growing demand for advanced security and traffic management has drawn considerable attention to visible-infrared person re-identification (VI-ReID). A critical challenge in VI-ReID is the performance degradation caused by label noise, an issue that is even more pronounced in cross-modal scenarios because data are more easily confused. Although previous methods have achieved notable success, they often overlook the complexities of instance-dependent and real-world noise, which leaves a gap between research and the practical application of person re-identification. To bridge this gap, we analyze the primary sources of label noise in real-world settings: a) instantiated identities, b) blurry infrared images, and c) annotators' errors. In response, we develop a Robust Hybrid Loss (RHL) function that enables targeted optimization of recognition and retrieval through a finer-grained division of the noisy dataset. The proposed method categorizes data into three sets (clean, obviously noisy, and indistinguishably noisy) and computes a bespoke loss for each. The identification loss is structured to address the differing nature of these sets. For the retrieval sub-task, we use an enhanced triplet loss that handles noisy correspondences. Furthermore, to validate our method empirically, we re-annotated a real-world dataset, SYSU-Real. Experiments on SYSU-MM01 and RegDB, conducted under various ratios of random and instance-dependent label noise, demonstrate the generalized robustness and effectiveness of the proposed approach.
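The abstract does not give the partition criteria or the exact form of RHL, so the following is only an illustrative sketch of the general idea: split samples into three sets and apply a different identification loss to each. Everything specific here is an assumption rather than the authors' method: the confidence thresholds `tau_clean` and `tau_noisy`, the use of the model's probability on the given label as the splitting signal, pseudo-labeling for the obviously noisy set, and a bootstrapping-style mix for the indistinguishable set. The helper `inject_random_noise` likewise shows one common way to simulate random label noise at a chosen ratio.

```python
import math
import random

EPS = 1e-12  # guards against log(0)


def inject_random_noise(labels, num_classes, ratio, seed=0):
    """Flip a `ratio` fraction of labels to a different, uniformly chosen class."""
    rng = random.Random(seed)
    noisy = list(labels)
    n_flip = round(ratio * len(labels))
    for i in rng.sample(range(len(labels)), n_flip):
        noisy[i] = rng.choice([c for c in range(num_classes) if c != labels[i]])
    return noisy


def partition(probs, labels, tau_clean=0.5, tau_noisy=0.1):
    """Assign each sample to one of three sets (thresholds are hypothetical)."""
    groups = []
    for p, y in zip(probs, labels):
        if p[y] >= tau_clean:
            groups.append("clean")
        elif p[y] < tau_noisy:
            groups.append("obviously_noisy")
        else:
            groups.append("indistinguishable")
    return groups


def hybrid_identification_loss(probs, labels, groups):
    """Mean per-sample loss, with a different rule for each set (assumed rules)."""
    total = 0.0
    for p, y, g in zip(probs, labels, groups):
        y_hat = max(range(len(p)), key=p.__getitem__)  # model's predicted class
        ce_given = -math.log(max(p[y], EPS))      # cross-entropy on the given label
        ce_pred = -math.log(max(p[y_hat], EPS))   # cross-entropy on the pseudo-label
        if g == "clean":
            total += ce_given                      # trust the annotation
        elif g == "obviously_noisy":
            total += ce_pred                       # replace the label by the prediction
        else:
            w = p[y]                               # soft mix weighted by confidence
            total += w * ce_given + (1.0 - w) * ce_pred
    return total / len(labels)


# Toy example: three samples, three identities, all annotated as identity 0.
probs = [[0.90, 0.05, 0.05], [0.02, 0.90, 0.08], [0.30, 0.40, 0.30]]
labels = [0, 0, 0]
groups = partition(probs, labels)
loss = hybrid_identification_loss(probs, labels, groups)
print(groups, loss)
```

In this toy example the first sample is treated as clean, the second as obviously noisy, and the third as indistinguishable. In practice the probabilities would come from the network's softmax output over identities, and the retrieval branch (the enhanced triplet loss mentioned above) would be handled separately.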

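Source journal

IEEE Transactions on Circuits and Systems for Video Technology (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 13.80

Self-citation rate: 27.40%

Articles published: 660

Review time: 5 months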

Journal description: The IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) is dedicated to covering all aspects of video technologies from a circuits and systems perspective. We encourage submissions of general, theoretical, and application-oriented papers related to image and video acquisition, representation, presentation, and display. Additionally, we welcome contributions in areas such as processing, filtering, and transforms; analysis and synthesis; learning and understanding; compression, transmission, communication, and networking; as well as storage, retrieval, indexing, and search. Furthermore, papers focusing on hardware and software design and implementation are highly valued. Join us in advancing the field of video technology through innovative research and insights.