{"title":"MKDFusion: modality knowledge decoupled for infrared and visible image fusion","authors":"Yucheng Zhang, You Ma, Lin Chai","doi":"10.1007/s10489-025-06470-w","DOIUrl":null,"url":null,"abstract":"<div><p>The purpose of infrared and visible fusion is to integrate useful information from both infrared and visible images into a single image. The fused image should possess rich texture details and salient target information of the two images. Current image fusion algorithms primarily face two limitations: 1) The lack of decoupling between modality-agnostic and modality-specific knowledge during the feature extraction stage hinders the alignment of modality-agnostic knowledge and the differentiation of modality-specific knowledge. 2) The interaction between modality features is not sufficiently explored in the feature fusion stage, which inhibits the exploitation of complementary information. To address the above challenges, we propose a Modality Knowledge Decoupled (MKD) module in the feature extraction stage and a Cross-Modality Mamba Fusion (CMF) module in the feature fusion stage. In MKD, we first utilize a dual-branch network to extract modality-agnostic and modality-specific knowledge separately. Then, a pair of Knowledge Discriminators (KD) is constructed to minimize inter-modality irrelevant knowledge and maximize inter-modality relevant knowledge. In CMF, the interactions between different modality knowledge are learnt in a hidden state space, which not only reduces the inter-modality knowledge differences but also enhances the texture information of the image. Experiments on three datasets demonstrate that our method outperforms existing methods, highlighting less salient targets and texture information more effectively. In addition, MKDFusion has demonstrated excellent generalization performance and enormous potential in high-level vision tasks in medical image fusion and object detection applications. The code is available at https://github.com/SEU-ZYC/MKDFusion.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 7","pages":""},"PeriodicalIF":3.4000,"publicationDate":"2025-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Applied Intelligence","FirstCategoryId":"94","ListUrlMain":"https://link.springer.com/article/10.1007/s10489-025-06470-w","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Abstract
The purpose of infrared and visible image fusion is to integrate useful information from both infrared and visible images into a single image. The fused image should possess the rich texture details and salient target information of the two source images. Current image fusion algorithms face two main limitations: 1) The lack of decoupling between modality-agnostic and modality-specific knowledge during the feature extraction stage hinders the alignment of modality-agnostic knowledge and the differentiation of modality-specific knowledge. 2) The interaction between modality features is not sufficiently explored in the feature fusion stage, which inhibits the exploitation of complementary information. To address these challenges, we propose a Modality Knowledge Decoupled (MKD) module for the feature extraction stage and a Cross-Modality Mamba Fusion (CMF) module for the feature fusion stage. In MKD, we first use a dual-branch network to extract modality-agnostic and modality-specific knowledge separately. Then, a pair of Knowledge Discriminators (KD) is constructed to minimize inter-modality irrelevant knowledge and maximize inter-modality relevant knowledge. In CMF, the interactions between the knowledge of different modalities are learnt in a hidden state space, which not only reduces inter-modality knowledge differences but also enhances the texture information of the image. Experiments on three datasets demonstrate that our method outperforms existing methods, highlighting less-salient targets and texture information more effectively. In addition, MKDFusion shows excellent generalization in medical image fusion and strong potential in high-level vision tasks such as object detection. The code is available at https://github.com/SEU-ZYC/MKDFusion.
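To make the decoupling objective concrete, below is a minimal PyTorch sketch of the idea the abstract describes: a shared (modality-agnostic) encoder plus per-modality (modality-specific) encoders, trained so that agnostic features agree across modalities while specific features diverge. All names here (ConvEncoder, knowledge_decoupling_loss, the cosine-similarity objective) are illustrative assumptions, not the paper's actual architecture or Knowledge Discriminator design; see the linked repository for the authors' implementation.

```python
# A minimal sketch, assuming a cosine-similarity proxy for the paper's
# Knowledge Discriminators. Hypothetical names; not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvEncoder(nn.Module):
    """Small convolutional encoder reused for each branch."""

    def __init__(self, in_ch: int = 1, feat_ch: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


def knowledge_decoupling_loss(agnostic_ir, agnostic_vi, specific_ir, specific_vi):
    """Align modality-agnostic features across modalities and push
    modality-specific features apart, via cosine similarity."""
    def cos(a, b):
        return F.cosine_similarity(a.flatten(1), b.flatten(1), dim=1).mean()

    # Maximize inter-modality relevant knowledge: agnostic features should agree...
    align = 1.0 - cos(agnostic_ir, agnostic_vi)
    # ...and minimize inter-modality irrelevant knowledge: specific features should differ.
    differ = cos(specific_ir, specific_vi).clamp(min=0.0)
    return align + differ


# Dual-branch extraction: one shared-knowledge branch, one private branch per modality.
shared_enc = ConvEncoder()        # modality-agnostic branch
private_ir = ConvEncoder()        # infrared-specific branch
private_vi = ConvEncoder()        # visible-specific branch

ir = torch.randn(2, 1, 128, 128)  # dummy infrared batch
vi = torch.randn(2, 1, 128, 128)  # dummy visible batch (grayscale for simplicity)

loss = knowledge_decoupling_loss(
    shared_enc(ir), shared_enc(vi), private_ir(ir), private_vi(vi)
)
loss.backward()
```

In the full method, the CMF module would then fuse these decoupled features; a faithful sketch of that step is omitted here because it depends on a selective state-space (Mamba) block and the authors' specific cross-modality interaction scheme.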
About the journal
With a focus on research in artificial intelligence and neural networks, this journal addresses solutions to real-life manufacturing, defense, management, government and industrial problems that are too complex to be solved through conventional approaches and that require the simulation of intelligent thought processes, heuristics, applications of knowledge, and distributed and parallel processing. The integration of these multiple approaches in solving complex problems is of particular importance.
The journal presents new and original research and technological developments, addressing real and complex issues applicable to difficult problems. It provides a medium for exchanging scientific research and technological achievements accomplished by the international community.