Wei Sun, Yaqi Wang, Xinbo Gao, Yibao Zhao, Yongchao Song, Zhiqiang Hou, Yanning Zhang
{"title":"基于自适应频率挖掘和嵌入的可见红外人物再识别","authors":"Wei Sun , Yaqi Wang , Xinbo Gao , Yibao Zhao , Yongchao Song , Zhiqiang Hou , Yanning Zhang","doi":"10.1016/j.dsp.2025.105526","DOIUrl":null,"url":null,"abstract":"<div><div>Visible-infrared person re-identification (VI-ReID) is a challenging task in computer vision that aims to match individuals across images captured in visible and infrared modalities. Existing approaches typically focus on either image-level or feature-level alignment, yet often struggle to effectively bridge the modality gap. In this paper, we propose a novel frequency-aware representation learning framework that leverages the complementary properties of visible and infrared images in the frequency domain to generate diverse and informative embeddings, thereby reducing cross-modal discrepancies. Specifically, we first extract low- and high-frequency features from input representations, guided by adaptively decoupled spectral components. These features are then refined via a bidirectional modulation operator that promotes interaction between frequency components. Furthermore, we design a multistage knowledge fusion module to enhance the complementarity between global structures and fine-grained details across multiple frequency scales. Extensive experiments on public benchmark datasets demonstrate that our method significantly outperforms state-of-the-art approaches, validating its effectiveness and generalization capability in complex cross-modal scenarios.</div></div>","PeriodicalId":51011,"journal":{"name":"Digital Signal Processing","volume":"168 ","pages":"Article 105526"},"PeriodicalIF":3.0000,"publicationDate":"2025-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Visible-infrared person re-identification via adaptive frequency mining and embedding\",\"authors\":\"Wei Sun , Yaqi Wang , Xinbo Gao , Yibao Zhao , Yongchao Song , Zhiqiang Hou , Yanning Zhang\",\"doi\":\"10.1016/j.dsp.2025.105526\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Visible-infrared person re-identification (VI-ReID) is a challenging task in computer vision that aims to match individuals across images captured in visible and infrared modalities. Existing approaches typically focus on either image-level or feature-level alignment, yet often struggle to effectively bridge the modality gap. In this paper, we propose a novel frequency-aware representation learning framework that leverages the complementary properties of visible and infrared images in the frequency domain to generate diverse and informative embeddings, thereby reducing cross-modal discrepancies. Specifically, we first extract low- and high-frequency features from input representations, guided by adaptively decoupled spectral components. These features are then refined via a bidirectional modulation operator that promotes interaction between frequency components. Furthermore, we design a multistage knowledge fusion module to enhance the complementarity between global structures and fine-grained details across multiple frequency scales. 
Extensive experiments on public benchmark datasets demonstrate that our method significantly outperforms state-of-the-art approaches, validating its effectiveness and generalization capability in complex cross-modal scenarios.</div></div>\",\"PeriodicalId\":51011,\"journal\":{\"name\":\"Digital Signal Processing\",\"volume\":\"168 \",\"pages\":\"Article 105526\"},\"PeriodicalIF\":3.0000,\"publicationDate\":\"2025-08-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Digital Signal Processing\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S1051200425005482\",\"RegionNum\":3,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Digital Signal Processing","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1051200425005482","RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Visible-infrared person re-identification via adaptive frequency mining and embedding
Visible-infrared person re-identification (VI-ReID) is a challenging task in computer vision that aims to match individuals across images captured in visible and infrared modalities. Existing approaches typically focus on either image-level or feature-level alignment, yet often struggle to effectively bridge the modality gap. In this paper, we propose a novel frequency-aware representation learning framework that leverages the complementary properties of visible and infrared images in the frequency domain to generate diverse and informative embeddings, thereby reducing cross-modal discrepancies. Specifically, we first extract low- and high-frequency features from input representations, guided by adaptively decoupled spectral components. These features are then refined via a bidirectional modulation operator that promotes interaction between frequency components. Furthermore, we design a multistage knowledge fusion module to enhance the complementarity between global structures and fine-grained details across multiple frequency scales. Extensive experiments on public benchmark datasets demonstrate that our method significantly outperforms state-of-the-art approaches, validating its effectiveness and generalization capability in complex cross-modal scenarios.
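The abstract's first step, extracting low- and high-frequency features guided by adaptively decoupled spectral components, can be pictured with a short sketch. The PyTorch module below is only an illustration under assumed design choices: a single learnable radial cutoff and a soft sigmoid mask in the 2-D Fourier domain. The class name, parameters, and mask formulation are hypothetical and are not taken from the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.fft


class AdaptiveFrequencyDecomposition(nn.Module):
    """Illustrative sketch: split a feature map into low- and high-frequency
    components using a learnable radial cutoff in the 2-D Fourier domain.
    The soft-mask formulation and all hyperparameters are assumptions for
    illustration, not the authors' published method."""

    def __init__(self, init_cutoff: float = 0.25, sharpness: float = 10.0):
        super().__init__()
        # Learnable cutoff radius, expressed as a fraction of the Nyquist frequency.
        self.cutoff = nn.Parameter(torch.tensor(init_cutoff))
        self.sharpness = sharpness

    def forward(self, x: torch.Tensor):
        # x: (B, C, H, W) real-valued feature map from a backbone.
        B, C, H, W = x.shape
        spec = torch.fft.fftshift(torch.fft.fft2(x, norm="ortho"), dim=(-2, -1))

        # Normalized radial frequency grid over the (H, W) spectrum.
        fy = torch.fft.fftshift(torch.fft.fftfreq(H, device=x.device))
        fx = torch.fft.fftshift(torch.fft.fftfreq(W, device=x.device))
        radius = torch.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2)

        # Soft low-pass mask: close to 1 near DC, close to 0 at high frequencies.
        low_mask = torch.sigmoid(self.sharpness * (self.cutoff - radius))
        low_mask = low_mask[None, None]  # broadcast over batch and channels

        low_spec = spec * low_mask
        high_spec = spec * (1.0 - low_mask)

        # Back to the spatial domain; imaginary parts are numerically negligible.
        low = torch.fft.ifft2(torch.fft.ifftshift(low_spec, dim=(-2, -1)), norm="ortho").real
        high = torch.fft.ifft2(torch.fft.ifftshift(high_spec, dim=(-2, -1)), norm="ortho").real
        return low, high


if __name__ == "__main__":
    feats = torch.randn(2, 64, 32, 16)                    # dummy backbone features
    low, high = AdaptiveFrequencyDecomposition()(feats)
    print(low.shape, high.shape)                          # both (2, 64, 32, 16)
    print(torch.allclose(low + high, feats, atol=1e-5))   # masks sum to 1, so the split is exact
```

Because the two soft masks sum to one, the low- and high-frequency branches reconstruct the input exactly, which is a convenient sanity check for any decomposition of this kind before the components are passed to downstream interaction or fusion modules.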
Journal Introduction:
Digital Signal Processing: A Review Journal is one of the oldest and most established journals in the field of signal processing, yet it aims to be the most innovative. The Journal invites top-quality articles at the frontiers of research in all aspects of signal processing. Our objective is to provide a platform for the publication of ground-breaking research in signal processing with both academic and industrial appeal.
The journal has a special emphasis on statistical signal processing methodology such as Bayesian signal processing, and encourages articles on emerging applications of signal processing such as:
• big data
• machine learning
• internet of things
• information security
• systems biology and computational biology
• financial time series analysis
• autonomous vehicles
• quantum computing
• neuromorphic engineering
• human-computer interaction and intelligent user interfaces
• environmental signal processing
• geophysical signal processing including seismic signal processing
• chemoinformatics and bioinformatics
• audio, visual and performance arts
• disaster management and prevention
• renewable energy