MKDFusion: modality knowledge decoupled for infrared and visible image fusion

IF 3.4 | Zone 2 (Computer Science) | JCR Q2: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Applied Intelligence, vol. 55, no. 7. Pub Date: 2025-04-14. DOI: 10.1007/s10489-025-06470-w

Yucheng Zhang, You Ma, Lin Chai


Citations: 0

Abstract

Infrared and visible image fusion aims to integrate useful information from an infrared image and a visible image into a single image that retains the rich texture details and salient target information of both. Current image fusion algorithms face two main limitations: 1) Modality-agnostic and modality-specific knowledge are not decoupled during the feature extraction stage, which hinders the alignment of modality-agnostic knowledge and the differentiation of modality-specific knowledge. 2) The interaction between modality features is not sufficiently explored in the feature fusion stage, which limits the exploitation of complementary information. To address these challenges, we propose a Modality Knowledge Decoupled (MKD) module for the feature extraction stage and a Cross-Modality Mamba Fusion (CMF) module for the feature fusion stage. In MKD, we first use a dual-branch network to extract modality-agnostic and modality-specific knowledge separately. A pair of Knowledge Discriminators (KD) is then constructed to minimize inter-modality irrelevant knowledge and maximize inter-modality relevant knowledge. In CMF, the interactions between the knowledge of different modalities are learned in a hidden state space, which both reduces inter-modality knowledge differences and enhances the texture information of the image. Experiments on three datasets show that our method outperforms existing methods and more effectively highlights less salient targets and texture information. In addition, MKDFusion demonstrates excellent generalization performance in medical image fusion and considerable potential for high-level vision tasks such as object detection. The code is available at https://github.com/SEU-ZYC/MKDFusion.
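
The two ideas in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the abstract does not specify how the Knowledge Discriminators or the CMF module are built, so the sketch substitutes a simple correlation-based objective for the discriminators and a scalar linear state-space recurrence for the Mamba-style fusion. All function names and parameters (`pearson`, `decoupling_loss`, `ssm_scan`, `cross_modal_scan`, `a`, `b`, `c`) are hypothetical and chosen only for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length feature vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sx = math.sqrt(sum((p - mx) ** 2 for p in x))
    sy = math.sqrt(sum((q - my) ** 2 for q in y))
    # A constant vector has no defined correlation; treat it as 0 here.
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def decoupling_loss(shared_ir, shared_vis, specific_ir, specific_vis):
    """Assumed stand-in objective (lower is better).

    Shared (modality-agnostic) features should agree across modalities,
    while modality-specific features should be uncorrelated.
    """
    align = 1.0 - pearson(shared_ir, shared_vis)        # pull shared features together
    separate = abs(pearson(specific_ir, specific_vis))  # push specific features apart
    return align + separate


def ssm_scan(tokens, a=0.5, b=1.0, c=1.0):
    """Scalar linear state-space recurrence.

    h_t = a * h_{t-1} + b * x_t,  y_t = c * h_t
    """
    h, out = 0.0, []
    for x in tokens:
        h = a * h + b * x
        out.append(c * h)
    return out


def cross_modal_scan(ir_tokens, vis_tokens, a=0.5):
    """Interleave both modalities so a single hidden state sees each of them."""
    interleaved = [t for pair in zip(ir_tokens, vis_tokens) for t in pair]
    return ssm_scan(interleaved, a=a)
```

In this sketch, `decoupling_loss` approaches zero when the shared features of the two modalities are perfectly correlated and the specific features are uncorrelated, and `cross_modal_scan` shows how one recurrent hidden state can carry information from infrared tokens into the outputs at visible-token positions. A real model would use learned, input-dependent parameters and multi-channel features rather than fixed scalars.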

Source journal

Applied Intelligence (Engineering & Technology, Computer Science: Artificial Intelligence)

CiteScore: 6.60

Self-citation rate: 20.80%

Articles published: 1361

Review time: 5.9 months

Journal description: With a focus on research in artificial intelligence and neural networks, this journal addresses solutions to real-life manufacturing, defense, management, government and industrial problems that are too complex to be solved through conventional approaches and that require the simulation of intelligent thought processes, heuristics, applications of knowledge, and distributed and parallel processing. The integration of these multiple approaches in solving complex problems is of particular importance. The journal presents new and original research and technological developments that address real, complex, and difficult problems. It provides a medium for exchanging scientific research and technological achievements accomplished by the international community.