MemoryFusion: A novel architecture for infrared and visible image fusion based on memory unit
Jiachen He, Xiaoqing Luo, Zhancheng Zhang, Xiao-jun Wu
Pattern Recognition, Volume 170, Article 112004 (published 2025-06-21)
DOI: 10.1016/j.patcog.2025.112004
https://www.sciencedirect.com/science/article/pii/S0031320325006648
Citations: 0
Abstract
Existing image fusion methods use elaborate encoders to sequentially extract shallow and deep features from the source images. However, most methods lack long-term dependency; that is, shallow details are inevitably lost as the network encodes deeper features. To address this, some methods employ skip connections or dense connections to pass shallow features directly into deeper layers, which can introduce redundant information and increase computational load. To overcome these drawbacks and improve generalization in low-quality scenarios, a novel fusion architecture based on the Gated Recurrent Unit (GRU), termed MemoryFusion, is proposed. First, the Input Extension Encoder (IEE) transforms each source image into a feature sequence. Then, a Recurrent Fusion Encoder (RFE) built from Recurrent Memory Fusion Units (RMFUs) learns the intrinsic correlation between the multi-modality feature sequences and generates the fused feature sequence. The memory fusion unit uses a gating mechanism to combine historical information with the current input, adaptively retaining valuable content while forgetting redundant information; this design also relieves computational pressure. Finally, because modality information is distributed across different sequence depths and varying illumination intensities, the Multi-hierarchical Aggregation Module (MHAM) produces a corresponding weight sequence, and the aggregated fusion feature is obtained by integrating the fused feature sequence with these weights. Extensive experiments demonstrate that MemoryFusion outperforms state-of-the-art fusion methods on multiple datasets. Even on low-quality images, such as low-light or foggy scenes, the method maintains strong fusion performance and scene fidelity.
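The gating mechanism the abstract describes follows the standard GRU update: an update gate decides how much historical fused content to keep, and a reset gate suppresses redundant history when forming the candidate state. The sketch below is an illustrative, simplified NumPy version of that recurrence applied to a feature sequence, with a softmax weighting standing in for the learned MHAM weights; all names, dimensions, and the aggregation scoring are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_fusion_step(h_prev, x_t, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU-style memory-fusion step (standard GRU equations).

    z: update gate -- how much historical fused content to carry forward.
    r: reset gate  -- damps redundant history in the candidate state.
    """
    z = sigmoid(Wz @ x_t + Uz @ h_prev)               # update gate
    r = sigmoid(Wr @ x_t + Ur @ h_prev)               # reset gate
    h_tilde = np.tanh(Wh @ x_t + Uh @ (r * h_prev))   # candidate fused feature
    return (1 - z) * h_prev + z * h_tilde             # keep valuable, forget redundant

# Toy run: fuse a short feature sequence (stand-in for the IEE output).
rng = np.random.default_rng(0)
d = 8
params = [rng.standard_normal((d, d)) * 0.1 for _ in range(6)]  # Wz, Uz, Wr, Ur, Wh, Uh
h = np.zeros(d)
seq = [rng.standard_normal(d) for _ in range(4)]

fused_seq = []
for x in seq:
    h = gru_fusion_step(h, x, *params)
    fused_seq.append(h)

# Illustrative aggregation: one softmax weight per sequence depth,
# a placeholder for the weight sequence a learned MHAM would produce.
scores = np.array([f.mean() for f in fused_seq])
w = np.exp(scores) / np.exp(scores).sum()
aggregated = sum(wi * fi for wi, fi in zip(w, fused_seq))
print(aggregated.shape)
```

Because each state is a convex combination of the previous state and a tanh candidate, the fused features stay bounded regardless of sequence length, which is the sense in which the recurrence retains long-term information without accumulating redundancy.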
About the journal:
The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.