Chuang Yu, Yunpeng Liu, Jinmiao Zhao, Dou Quan, Zelin Shi, Xiangyu Yue
Relational representation learning network for cross-spectral image patch matching
Information Fusion, vol. 127, Article 103749. Published 2025-09-15. DOI: 10.1016/j.inffus.2025.103749
https://www.sciencedirect.com/science/article/pii/S1566253525008115
Impact factor: 15.5. JCR: Q1 (Computer Science, Artificial Intelligence).
Citations: 0
Abstract
Feature relation learning has recently drawn widespread attention in cross-spectral image patch matching. However, existing work focuses on capturing diverse feature relations between image patches and neglects to learn sufficient intrinsic feature representations of the individual patches. To address this limitation, we propose relational representation learning, which simultaneously mines the intrinsic features of individual image patches and the feature relations between them. Based on this, we construct a Relational Representation Learning Network (RRL-Net). Specifically, we construct an autoencoder to characterize the individual intrinsic features and introduce a feature interaction learning (FIL) module to extract deep-level feature relations. To mine individual intrinsic features more fully, we construct a lightweight multi-dimensional global-to-local attention (MGLA) module that enhances the global feature extraction of individual image patches and captures local dependencies within global features. Building on the MGLA module, we further explore the feature extraction network and construct an attention-based lightweight feature extraction (ALFE) network. Furthermore, we propose a multi-loss post-pruning (MLPP) optimization strategy, which facilitates network optimization without increasing the parameter count or inference time. Extensive experiments demonstrate that RRL-Net achieves state-of-the-art (SOTA) performance on multiple public datasets. Our code is available at https://github.com/YuChuang1205/RRL-Net.
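The abstract does not spell out how MLPP works, so the following is only a toy illustration of one plausible reading: extra loss branches contribute to the training objective and are then removed, leaving the inference path and its parameter count unchanged. Everything here (the `ToyMatcher` class, the head names, the weights, and `aux_weight`) is hypothetical and is not taken from the RRL-Net code.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def bce(p, y):
    """Binary cross-entropy for one match probability p and label y (0 or 1)."""
    eps = 1e-12
    return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))


class ToyMatcher:
    """Hypothetical stand-in for a matching network with training-only heads."""

    def __init__(self):
        # Main matching head: the only part used at inference time.
        self.main_w = [0.8, -0.3]
        # Auxiliary heads (assumed): they only add extra loss terms in training.
        self.aux_heads = {"aux_a": [0.5, 0.1], "aux_b": [-0.2, 0.9]}

    def _score(self, w, feats):
        return sigmoid(sum(wi * fi for wi, fi in zip(w, feats)))

    def predict(self, feats):
        # Inference never touches the auxiliary heads.
        return self._score(self.main_w, feats)

    def training_loss(self, feats, label, aux_weight=0.5):
        # Multi-loss objective: main loss plus weighted auxiliary losses.
        loss = bce(self.predict(feats), label)
        for w in self.aux_heads.values():
            loss += aux_weight * bce(self._score(w, feats), label)
        return loss

    def prune(self):
        # Post-pruning: drop every training-only branch.
        self.aux_heads = {}

    def num_params(self):
        return len(self.main_w) + sum(len(w) for w in self.aux_heads.values())
```

In this sketch, calling `prune()` after training reduces `num_params()` to that of the main head alone, while `predict()` returns exactly the same value as before, which is the property the abstract attributes to MLPP (no added parameters or inference time).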
About the journal:
Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses as well as those demonstrating their application to real-world problems will be welcome.