Wei Sun, Yaqi Wang, Xinbo Gao, Yibao Zhao, Yongchao Song, Zhiqiang Hou, Yanning Zhang
{"title":"基于自适应频率挖掘和嵌入的可见红外人物再识别","authors":"Wei Sun , Yaqi Wang , Xinbo Gao , Yibao Zhao , Yongchao Song , Zhiqiang Hou , Yanning Zhang","doi":"10.1016/j.dsp.2025.105526","DOIUrl":null,"url":null,"abstract":"<div><div>Visible-infrared person re-identification (VI-ReID) is a challenging task in computer vision that aims to match individuals across images captured in visible and infrared modalities. Existing approaches typically focus on either image-level or feature-level alignment, yet often struggle to effectively bridge the modality gap. In this paper, we propose a novel frequency-aware representation learning framework that leverages the complementary properties of visible and infrared images in the frequency domain to generate diverse and informative embeddings, thereby reducing cross-modal discrepancies. Specifically, we first extract low- and high-frequency features from input representations, guided by adaptively decoupled spectral components. These features are then refined via a bidirectional modulation operator that promotes interaction between frequency components. Furthermore, we design a multistage knowledge fusion module to enhance the complementarity between global structures and fine-grained details across multiple frequency scales. Extensive experiments on public benchmark datasets demonstrate that our method significantly outperforms state-of-the-art approaches, validating its effectiveness and generalization capability in complex cross-modal scenarios.</div></div>","PeriodicalId":51011,"journal":{"name":"Digital Signal Processing","volume":"168 ","pages":"Article 105526"},"PeriodicalIF":3.0000,"publicationDate":"2025-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Visible-infrared person re-identification via adaptive frequency mining and embedding\",\"authors\":\"Wei Sun , Yaqi Wang , Xinbo Gao , Yibao Zhao , Yongchao Song , Zhiqiang Hou , Yanning Zhang\",\"doi\":\"10.1016/j.dsp.2025.105526\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Visible-infrared person re-identification (VI-ReID) is a challenging task in computer vision that aims to match individuals across images captured in visible and infrared modalities. Existing approaches typically focus on either image-level or feature-level alignment, yet often struggle to effectively bridge the modality gap. In this paper, we propose a novel frequency-aware representation learning framework that leverages the complementary properties of visible and infrared images in the frequency domain to generate diverse and informative embeddings, thereby reducing cross-modal discrepancies. Specifically, we first extract low- and high-frequency features from input representations, guided by adaptively decoupled spectral components. These features are then refined via a bidirectional modulation operator that promotes interaction between frequency components. Furthermore, we design a multistage knowledge fusion module to enhance the complementarity between global structures and fine-grained details across multiple frequency scales. 
Extensive experiments on public benchmark datasets demonstrate that our method significantly outperforms state-of-the-art approaches, validating its effectiveness and generalization capability in complex cross-modal scenarios.</div></div>\",\"PeriodicalId\":51011,\"journal\":{\"name\":\"Digital Signal Processing\",\"volume\":\"168 \",\"pages\":\"Article 105526\"},\"PeriodicalIF\":3.0000,\"publicationDate\":\"2025-08-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Digital Signal Processing\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S1051200425005482\",\"RegionNum\":3,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Digital Signal Processing","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1051200425005482","RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Visible-infrared person re-identification via adaptive frequency mining and embedding
Visible-infrared person re-identification (VI-ReID) is a challenging task in computer vision that aims to match individuals across images captured in visible and infrared modalities. Existing approaches typically focus on either image-level or feature-level alignment, yet often struggle to effectively bridge the modality gap. In this paper, we propose a novel frequency-aware representation learning framework that leverages the complementary properties of visible and infrared images in the frequency domain to generate diverse and informative embeddings, thereby reducing cross-modal discrepancies. Specifically, we first extract low- and high-frequency features from input representations, guided by adaptively decoupled spectral components. These features are then refined via a bidirectional modulation operator that promotes interaction between frequency components. Furthermore, we design a multistage knowledge fusion module to enhance the complementarity between global structures and fine-grained details across multiple frequency scales. Extensive experiments on public benchmark datasets demonstrate that our method significantly outperforms state-of-the-art approaches, validating its effectiveness and generalization capability in complex cross-modal scenarios.
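The abstract's first step, extracting low- and high-frequency features guided by adaptively decoupled spectral components, can be pictured with a short sketch. The PyTorch module below is only an illustration under assumed design choices: a single learnable radial cutoff and a soft sigmoid mask in the 2-D Fourier domain. The class name, parameters, and mask formulation are hypothetical and are not taken from the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.fft


class AdaptiveFrequencyDecomposition(nn.Module):
    """Illustrative sketch: split a feature map into low- and high-frequency
    components using a learnable radial cutoff in the 2-D Fourier domain.
    The soft-mask formulation and all hyperparameters are assumptions for
    illustration, not the authors' published method."""

    def __init__(self, init_cutoff: float = 0.25, sharpness: float = 10.0):
        super().__init__()
        # Learnable cutoff radius, expressed as a fraction of the Nyquist frequency.
        self.cutoff = nn.Parameter(torch.tensor(init_cutoff))
        self.sharpness = sharpness

    def forward(self, x: torch.Tensor):
        # x: (B, C, H, W) real-valued feature map from a backbone.
        B, C, H, W = x.shape
        spec = torch.fft.fftshift(torch.fft.fft2(x, norm="ortho"), dim=(-2, -1))

        # Normalized radial frequency grid over the (H, W) spectrum.
        fy = torch.fft.fftshift(torch.fft.fftfreq(H, device=x.device))
        fx = torch.fft.fftshift(torch.fft.fftfreq(W, device=x.device))
        radius = torch.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2)

        # Soft low-pass mask: close to 1 near DC, close to 0 at high frequencies.
        low_mask = torch.sigmoid(self.sharpness * (self.cutoff - radius))
        low_mask = low_mask[None, None]  # broadcast over batch and channels

        low_spec = spec * low_mask
        high_spec = spec * (1.0 - low_mask)

        # Back to the spatial domain; imaginary parts are numerically negligible.
        low = torch.fft.ifft2(torch.fft.ifftshift(low_spec, dim=(-2, -1)), norm="ortho").real
        high = torch.fft.ifft2(torch.fft.ifftshift(high_spec, dim=(-2, -1)), norm="ortho").real
        return low, high


if __name__ == "__main__":
    feats = torch.randn(2, 64, 32, 16)                    # dummy backbone features
    low, high = AdaptiveFrequencyDecomposition()(feats)
    print(low.shape, high.shape)                          # both (2, 64, 32, 16)
    print(torch.allclose(low + high, feats, atol=1e-5))   # masks sum to 1, so the split is exact
```

Because the two soft masks sum to one, the low- and high-frequency branches reconstruct the input exactly, which is a convenient sanity check for any decomposition of this kind before the components are passed to downstream interaction or fusion modules.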
Journal Introduction:
Digital Signal Processing: A Review Journal is one of the oldest and most established journals in the field of signal processing, yet it aims to be the most innovative. The Journal invites top-quality articles at the frontiers of research in all aspects of signal processing. Our objective is to provide a platform for the publication of ground-breaking research in signal processing with both academic and industrial appeal.
The journal has a special emphasis on statistical signal processing methodology such as Bayesian signal processing, and encourages articles on emerging applications of signal processing such as:
• big data
• machine learning
• internet of things
• information security
• systems biology and computational biology
• financial time series analysis
• autonomous vehicles
• quantum computing
• neuromorphic engineering
• human-computer interaction and intelligent user interfaces
• environmental signal processing
• geophysical signal processing including seismic signal processing
• chemoinformatics and bioinformatics
• audio, visual and performance arts
• disaster management and prevention
• renewable energy