{"title":"DIFReID: Detail Information Fusion for Person Re-Identification","authors":"Xuebing Bai , Jichang Guo , Jin Che","doi":"10.1016/j.displa.2025.103189","DOIUrl":null,"url":null,"abstract":"<div><div>Person re-identification (ReID) aims to match person images across different scenes in video surveillance. Despite significant progress, existing methods often overlook the importance of multi-scale information and personal belongings, while failing to fully exploit the relationships between images and attributes. These limitations result in underutilization of detailed information, thereby constraining the completeness and discriminative power of person feature representations. To address these challenges, we propose Detail Information Fusion for Person Re-Identification (DIFReID), a novel framework that aims to enhance feature representation by effectively integrating image information and attribute information. Specifically, DIFReID incorporates a multi-scale attention module that combines multi-scale features with attention mechanisms to highlight salient regions and improve the representation of critical details. Furthermore, a refined semantic parsing module integrates semantic regions of personal belongings with human parsing results, effectively capturing personal belongings often omitted in prior approaches. In addition, a cross-modal graph convolutional network module fuses personal attributes with visual features, establishing deeper relationships between images and attributes to generate robust and discriminative representations. Extensive experiments conducted on two benchmark datasets demonstrate that DIFReID achieves state-of-the-art performance, validating its effectiveness in improving both feature completeness and discriminative capability.</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"91 ","pages":"Article 103189"},"PeriodicalIF":3.4000,"publicationDate":"2025-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Displays","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0141938225002264","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
Citations: 0
Abstract
Person re-identification (ReID) aims to match person images across different scenes in video surveillance. Despite significant progress, existing methods often overlook the importance of multi-scale information and personal belongings, while failing to fully exploit the relationships between images and attributes. These limitations result in underutilization of detailed information, thereby constraining the completeness and discriminative power of person feature representations. To address these challenges, we propose Detail Information Fusion for Person Re-Identification (DIFReID), a novel framework that aims to enhance feature representation by effectively integrating image information and attribute information. Specifically, DIFReID incorporates a multi-scale attention module that combines multi-scale features with attention mechanisms to highlight salient regions and improve the representation of critical details. Furthermore, a refined semantic parsing module integrates semantic regions of personal belongings with human parsing results, effectively capturing personal belongings often omitted in prior approaches. In addition, a cross-modal graph convolutional network module fuses personal attributes with visual features, establishing deeper relationships between images and attributes to generate robust and discriminative representations. Extensive experiments conducted on two benchmark datasets demonstrate that DIFReID achieves state-of-the-art performance, validating its effectiveness in improving both feature completeness and discriminative capability.
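The abstract does not specify how the multi-scale attention module is implemented; the sketch below is only an illustration of the general idea it describes (pooling backbone features at several spatial scales and reweighting the fused result with an attention mechanism). The scale choices, channel sizes, and the squeeze-and-excitation-style attention used here are assumptions, not the authors' design.

```python
# Minimal PyTorch sketch of a multi-scale attention block in the spirit of the
# module described in the abstract. All design choices (scales, 1x1 projections,
# SE-style channel attention) are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiScaleAttention(nn.Module):
    """Fuse features pooled at several spatial scales, then reweight channels."""

    def __init__(self, channels: int, scales=(1, 2, 4)):
        super().__init__()
        self.scales = scales
        # One 1x1 conv per scale to project the pooled map back to `channels`.
        self.projs = nn.ModuleList(
            nn.Conv2d(channels, channels, kernel_size=1) for _ in scales
        )
        # Squeeze-and-excitation style channel attention over the fused map.
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // 4, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, w = x.shape[-2:]
        fused = 0
        for scale, proj in zip(self.scales, self.projs):
            # Pool to a coarser grid, project, and upsample back to (h, w).
            pooled = F.adaptive_avg_pool2d(x, (max(h // scale, 1), max(w // scale, 1)))
            fused = fused + F.interpolate(
                proj(pooled), size=(h, w), mode="bilinear", align_corners=False
            )
        # Highlight salient channels in the multi-scale fusion, keep a residual path.
        return x + fused * self.attn(fused)


if __name__ == "__main__":
    feat = torch.randn(2, 256, 24, 12)   # backbone feature map (B, C, H, W)
    out = MultiScaleAttention(256)(feat)
    print(out.shape)                      # torch.Size([2, 256, 24, 12])
```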
Journal introduction:
Displays is the international journal covering the research and development of display technology, its effective presentation and perception of information, and applications and systems including display-human interface.
Technical papers on practical developments in Displays technology provide an effective channel to promote greater understanding and cross-fertilization across the diverse disciplines of the Displays community. Original research papers solving ergonomics issues at the display-human interface advance the effective presentation of information. Tutorial papers covering fundamentals, intended for display technology and human factors engineers new to the field, will also occasionally be featured.