Precise occlusion-aware and feature-level reconstruction for occluded person re-identification
Xiujun Shu, Hanjun Li, Wei Wen, Ruizhi Qiao, Nannan Li, Weijian Ruan, Hanjing Su, Bo Wang, Shouzhi Chen, Jun Zhou
Neurocomputing, volume 616, Article 128919 (Impact Factor 5.5; JCR Q1, Computer Science, Artificial Intelligence)
Published: 2024-11-16
DOI: 10.1016/j.neucom.2024.128919
URL: https://www.sciencedirect.com/science/article/pii/S0925231224016904
Citation count: 0
Abstract
Occluded person re-IDentification (re-ID) is a challenging task in surveillance scenarios that remains unresolved. To address it, existing methods primarily rely on auxiliary models, e.g., pose estimation, to explore visible parts by detecting human keypoints. However, these approaches inevitably encounter two issues: domain gap and information asymmetry. The former arises because the auxiliary models are pre-trained on different domains, while the latter means that the occluded query has asymmetric valid cues compared to the holistic, fully visible gallery. In this paper, we propose a novel Precise Occlusion-aware and Feature-level Reconstruction (POFR) network for occluded re-ID. POFR addresses the occlusion issue from two perspectives: perceiving the occlusions, as distinct from the visible human body, and reconstructing the occluded parts at the feature level. The first perspective is achieved through occlusion-driven contrastive learning (OCL). OCL incorporates an occlusion generator capable of generating object-specific and person-specific occlusions. Unlike previous coarse occlusions, our generator leverages segmented pedestrians and obstacles to generate realistic occlusions, which are then used for contrastive learning. The second perspective is implemented through an occlusion-guided feature reconstruction (OFR) module. OFR first learns an occlusion predictor to estimate the occlusion mask, which is then used to recover the features corresponding to the occluded regions. Because the occlusion generator provides precise occlusion masks, the occlusion predictor can be supervised effectively, thereby mitigating the domain gap problem. Additionally, the recovered features alleviate information asymmetry when matching an occluded query against a holistic gallery. Extensive experiments conducted on occluded, partial, and holistic datasets demonstrate the superior performance of POFR over state-of-the-art methods. The source code will be made publicly available upon paper acceptance.
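The abstract describes the occlusion generator only at a high level, so the following is a minimal illustrative sketch, not the authors' implementation. It assumes a segmented occluder patch with a binary foreground mask is already available; the function names `paste_occluder` and `mask_to_grid`, the paste coordinates, and the grid pooling step are all hypothetical. It shows why pasting a segmented object yields a pixel-exact occlusion mask for free, which could then serve as a supervision target for an occlusion predictor.

```python
import numpy as np


def paste_occluder(image, occluder, occluder_mask, top, left):
    """Paste a segmented occluder onto a pedestrian crop (hypothetical helper).

    image:         (H, W, 3) uint8 pedestrian crop
    occluder:      (h, w, 3) uint8 patch containing a segmented object or person
    occluder_mask: (h, w) bool, True where the patch is foreground
    Returns the occluded image and an (H, W) bool occlusion mask.
    """
    H, W = image.shape[:2]
    h, w = occluder_mask.shape
    # Clip the paste rectangle to the image bounds.
    y0, x0 = max(top, 0), max(left, 0)
    y1, x1 = min(top + h, H), min(left + w, W)

    out = image.copy()
    occ = np.zeros((H, W), dtype=bool)
    if y0 >= y1 or x0 >= x1:
        return out, occ  # occluder lies fully outside the image

    patch = occluder[y0 - top:y1 - top, x0 - left:x1 - left]
    fg = occluder_mask[y0 - top:y1 - top, x0 - left:x1 - left]
    region = out[y0:y1, x0:x1]   # view into `out`
    region[fg] = patch[fg]       # overwrite only foreground pixels
    occ[y0:y1, x0:x1] = fg       # the mask is exact by construction
    return out, occ


def mask_to_grid(occ, grid_h, grid_w):
    """Average-pool a pixel mask to a coarse grid (assumed supervision format).

    Each cell holds the fraction of its pixels that are occluded.
    """
    H, W = occ.shape
    if H % grid_h or W % grid_w:
        raise ValueError("mask size must be divisible by the grid size")
    cells = occ.astype(np.float32).reshape(grid_h, H // grid_h, grid_w, W // grid_w)
    return cells.mean(axis=(1, 3))


# Usage: occlude the lower half of a blank 8x4 crop with a solid 4x4 patch.
image = np.zeros((8, 4, 3), dtype=np.uint8)
occluder = np.full((4, 4, 3), 255, dtype=np.uint8)
occluder_mask = np.ones((4, 4), dtype=bool)
occluded, occ = paste_occluder(image, occluder, occluder_mask, top=4, left=0)
grid = mask_to_grid(occ, grid_h=4, grid_w=2)
```

In this sketch the mask comes directly from the occluder's segmentation rather than from a pre-trained auxiliary model, which is the property the abstract credits with mitigating the domain gap. How the paper samples occluders, positions them, or uses the mask in contrastive learning and feature reconstruction is not stated in the abstract and is not modeled here.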
About the journal:
Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice and applications are the essential topics being covered.