Relational representation learning network for cross-spectral image patch matching

IF 15.5 | Zone 1 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Information Fusion Pub Date : 2025-09-15 DOI:10.1016/j.inffus.2025.103749

Chuang Yu, Yunpeng Liu, Jinmiao Zhao, Dou Quan, Zelin Shi, Xiangyu Yue

Information Fusion, Volume 127, Article 103749. Publisher page: https://www.sciencedirect.com/science/article/pii/S1566253525008115

Citations: 0

Abstract

Feature relation learning has recently drawn widespread attention in cross-spectral image patch matching. However, existing work focuses on capturing diverse feature relations between image patches and neglects the intrinsic feature representations of individual image patches. To address this limitation, we propose a relational representation learning approach that mines both the intrinsic features of individual image patches and the feature relations between image patches. Based on this, we construct a Relational Representation Learning Network (RRL-Net). Specifically, we construct an autoencoder to characterize the individual intrinsic features and introduce a feature interaction learning (FIL) module to extract deep-level feature relations. To further mine individual intrinsic features, a lightweight multi-dimensional global-to-local attention (MGLA) module is constructed to enhance the global feature extraction of individual image patches and capture local dependencies within global features. Building on the MGLA module, we further explore the feature extraction network and construct an attention-based lightweight feature extraction (ALFE) network. Furthermore, a multi-loss post-pruning (MLPP) optimization strategy is proposed, which facilitates network optimization without increasing parameters or inference time. Extensive experiments demonstrate that RRL-Net achieves state-of-the-art (SOTA) performance on multiple public datasets. Our code is available at https://github.com/YuChuang1205/RRL-Net.
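The abstract does not spell out which loss terms or branches the MLPP strategy uses, so the following is only a rough illustration of the general idea it names: train with several losses attached to extra branches, then remove those branches so the deployed model gains no parameters or inference cost. Everything here is an assumption made for illustration, not the authors' implementation: the function names, the choice of binary cross-entropy for matching, a mean-squared reconstruction term for the autoencoder, the loss weights, and the parameter-name prefixes.

```python
import math


def bce(prob, label):
    """Binary cross-entropy for one predicted match probability (assumed matching loss)."""
    eps = 1e-12
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def mse(recon, target):
    """Mean squared error between a reconstruction and its target (assumed autoencoder loss)."""
    return sum((r - t) ** 2 for r, t in zip(recon, target)) / len(recon)


def training_loss(main_prob, aux_probs, recon_pairs, label,
                  aux_weight=1.0, recon_weight=1.0):
    """Hypothetical multi-loss objective: main head + auxiliary heads + reconstruction."""
    loss = bce(main_prob, label)
    loss += aux_weight * sum(bce(p, label) for p in aux_probs)
    loss += recon_weight * sum(mse(r, t) for r, t in recon_pairs)
    return loss


def prune_for_inference(params, keep_prefixes=("backbone.", "main_head.")):
    """Keep only parameters needed at inference; training-only branches are dropped."""
    return {name: value for name, value in params.items()
            if name.startswith(keep_prefixes)}


# Usage with made-up parameter names: auxiliary heads and the decoder exist
# only during training and are removed before deployment.
params = {
    "backbone.w": [0.1, 0.2],
    "main_head.w": [0.3],
    "aux_head_1.w": [0.4],
    "decoder.w": [0.5, 0.6],
}
deployed = prune_for_inference(params)
```

In this sketch the auxiliary terms influence only the gradients seen during training; `deployed` contains just the backbone and main-head entries, which is one simple way to read "post-pruning" without adding inference cost.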


Source journal

Information Fusion (Engineering & Technology: Computer Science, Theory & Methods)

CiteScore: 33.20

Self-citation rate: 4.30%

Articles published: 161

Review time: 7.9 months

Journal description: Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses as well as those demonstrating their application to real-world problems will be welcome.