{"title":"A Semantically Guided and Focused Network for Occluded Person Re-Identification","authors":"Guorong Lin;Shunzhi Yang;Wei-Shi Zheng;Zuoyong Li;Zhenhua Huang","doi":"10.1109/TIFS.2025.3608672","DOIUrl":null,"url":null,"abstract":"Person re-identification (ReID) is vital for surveillance, tracking, and criminal investigations, yet occlusions often lead to partial information loss and noisy features that significantly degrade ReID performance. Recent CLIP-based occluded person ReID methods have demonstrated promising performance by leveraging cross-modal alignment, but still face two limitations: first, generic text prompts fail to capture the fine-grained semantics of specific samples; second, there is a lack of effective enhancement mechanisms for hard local features in occlusion scenarios. To overcome these limitations, we propose a Semantically Guided and Focused Network (SGFNet), which comprises three synergistic modules. First, to tackle the absence of fine-grained textual descriptions, we design a Segmentation and Text Generation (STG) module that segments pedestrian regions and generates sample-specific text features, providing detailed text descriptions and spatial information for local pedestrian regions. In addition, in order to accurately extract fine-grained features, we propose a Dual-guided Feature Refinement (DGFR) module. This module leverages a spatial attention mechanism guided by dual-semantic information to enhance discriminative fine-grained features while effectively suppressing interference from irrelevant regions. Finally, building upon the DGFR module, we further propose a Hardness-aware Semantic Focus (HASF) module. This module leverages segmentation cues to assess the difficulty of distinguishing local regions and employs a carefully designed Semantic-driven Focal Triplet loss to specifically enhance hard local feature learning, thereby improving the model’s robustness in feature extraction under occlusion scenarios. 
Extensive experiments demonstrate the superiority of SGFNet, achieving state-of-the-art performance on three occluded person ReID datasets while maintaining competitive results on three holistic person ReID datasets.","PeriodicalId":13492,"journal":{"name":"IEEE Transactions on Information Forensics and Security","volume":"20 ","pages":"9716-9731"},"PeriodicalIF":8.0000,"publicationDate":"2025-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Information Forensics and Security","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/11159093/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}
Citations: 0
Abstract
Person re-identification (ReID) is vital for surveillance, tracking, and criminal investigations, yet occlusions often cause partial information loss and noisy features that significantly degrade ReID performance. Recent CLIP-based occluded person ReID methods have demonstrated promising performance by leveraging cross-modal alignment, but still face two limitations: first, generic text prompts fail to capture the fine-grained semantics of specific samples; second, effective enhancement mechanisms for hard local features in occlusion scenarios are lacking. To overcome these limitations, we propose a Semantically Guided and Focused Network (SGFNet), which comprises three synergistic modules. First, to address the absence of fine-grained textual descriptions, we design a Segmentation and Text Generation (STG) module that segments pedestrian regions and generates sample-specific text features, providing detailed text descriptions and spatial information for local pedestrian regions. Second, to accurately extract fine-grained features, we propose a Dual-guided Feature Refinement (DGFR) module, which applies a spatial attention mechanism guided by dual-semantic information to enhance discriminative fine-grained features while suppressing interference from irrelevant regions. Finally, building upon the DGFR module, we propose a Hardness-aware Semantic Focus (HASF) module, which uses segmentation cues to assess how difficult each local region is to distinguish and employs a carefully designed Semantic-driven Focal Triplet loss to strengthen hard local feature learning, thereby improving the model's robustness of feature extraction under occlusion. Extensive experiments demonstrate the superiority of SGFNet, achieving state-of-the-art performance on three occluded person ReID datasets while maintaining competitive results on three holistic person ReID datasets.
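The abstract describes a spatial attention mechanism in which semantic (text) features guide the weighting of local visual features. The paper's actual DGFR architecture is not given here, so the following is only a minimal, hypothetical sketch of the general idea: each local (patch) feature is scored by its similarity to a sample-specific text feature, the scores are softmax-normalized into attention weights, and the weighted features are pooled. The function names, the cosine-similarity scoring, and the temperature parameter are all illustrative assumptions, not the authors' design.

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def semantic_spatial_attention(local_feats, text_feat, temperature=0.1):
    """Hypothetical sketch of semantically guided spatial attention:
    score each local visual feature against a sample-specific text
    feature, softmax the scores into attention weights, and pool.
    Regions matching the text semantics get high weight; occluded or
    background regions are suppressed."""
    sims = [cosine(f, text_feat) for f in local_feats]
    weights = softmax([s / temperature for s in sims])
    dim = len(local_feats[0])
    pooled = [sum(w * f[i] for w, f in zip(weights, local_feats))
              for i in range(dim)]
    return pooled, weights
```

In this toy form, a local feature aligned with the text feature dominates the pooled representation, which mirrors the stated goal of "enhancing discriminative fine-grained features while suppressing interference from irrelevant regions."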
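The HASF module's "Semantic-driven Focal Triplet loss" is named but not defined in this abstract. As a rough illustration of how a hardness-aware focal weighting can be grafted onto a standard triplet hinge, here is a minimal sketch under stated assumptions: `hardness` is a per-region score in [0, 1] (in the paper it is derived from segmentation cues; here it is just an input), and the `(1 + hardness) ** gamma` weighting is an assumed monotone focal factor, not the authors' formulation.

```python
def focal_triplet_loss(d_ap, d_an, hardness, margin=0.3, gamma=2.0):
    """Hypothetical hardness-weighted triplet loss sketch.

    d_ap    -- anchor-positive distance
    d_an    -- anchor-negative distance
    hardness -- per-region difficulty score in [0, 1] (assumed input;
                the paper derives it from segmentation cues)
    The standard triplet hinge is scaled by a focal factor that grows
    with hardness, so hard local regions contribute more to the loss.
    """
    base = max(0.0, d_ap - d_an + margin)          # standard triplet hinge
    focal_weight = (1.0 + hardness) ** gamma       # assumption: simple monotone weighting
    return focal_weight * base
```

The intended effect is that easy triplets (already separated by the margin) contribute nothing, while violations in hard-to-distinguish regions are amplified, which matches the abstract's claim of "specifically enhancing hard local feature learning."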
Journal Introduction:
The IEEE Transactions on Information Forensics and Security covers the sciences, technologies, and applications relating to information forensics, information security, biometrics, surveillance, and systems applications that incorporate these features.