A Dual-branch Progressive Network with spatial-frequency constraint for image fusion

Zenghui Wang, Wenhao Song, Xuening Xing, Lina Liu, Xianxun Zhu, Mingliang Gao

Image and Vision Computing, Volume 162, Article 105709 (published 30 August 2025). DOI: 10.1016/j.imavis.2025.105709
Image fusion aims to integrate complementary information from source images to enhance the quality of the fused representation. Most existing methods impose only pixel-level constraints in the spatial domain, which limits their ability to preserve frequency-domain information. Furthermore, single-branch networks typically process features from all source images uniformly, which hinders the consideration of cross-modal feature differences. To address these challenges, we propose a Dual-branch Progressive Network (DPNet) for image fusion. First, a global feature fusion branch is constructed to enhance the extraction of long-range dependencies; this branch promotes global feature interaction through a Global Context Awareness (GCA) module. Second, a local feature fusion branch is designed to extract local information from the source images; it comprises multiple Local Feature Attention (LFA) modules that capture valuable local features. Additionally, to preserve both frequency- and spatial-domain information, we combine two loss functions that jointly optimize feature retention in both domains. Experimental results on five datasets demonstrate that DPNet surpasses state-of-the-art fusion models both qualitatively and quantitatively. These findings validate its effectiveness for practical applications in military surveillance, environmental monitoring, and medical imaging. The code is available at https://github.com/zenghui11/DPNet.
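The abstract does not give the exact form of the two loss terms, but the idea of jointly constraining the spatial and frequency domains can be sketched as follows. This is a minimal illustration under stated assumptions: an L1 pixel loss in the spatial domain, an L1 loss on FFT amplitude spectra in the frequency domain, and an element-wise maximum of the two sources as the fusion reference (the reference choice and the weights `alpha`/`beta` are common heuristics, not the paper's specification).

```python
import numpy as np

def spatial_loss(fused, ref):
    # Pixel-level L1 constraint in the spatial domain.
    return np.mean(np.abs(fused - ref))

def frequency_loss(fused, ref):
    # L1 constraint on FFT amplitude spectra, preserving
    # frequency-domain content that pixel losses can miss.
    amp_fused = np.abs(np.fft.fft2(fused))
    amp_ref = np.abs(np.fft.fft2(ref))
    return np.mean(np.abs(amp_fused - amp_ref))

def joint_loss(fused, src_a, src_b, alpha=1.0, beta=0.1):
    # Hypothetical reference: element-wise max of the two sources,
    # a common heuristic in fusion losses (an assumption here).
    ref = np.maximum(src_a, src_b)
    return alpha * spatial_loss(fused, ref) + beta * frequency_loss(fused, ref)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.random((8, 8))
    b = rng.random((8, 8))
    # A fused image equal to the reference incurs zero loss;
    # any deviation is penalized in both domains.
    print(joint_loss(np.maximum(a, b), a, b))
    print(joint_loss(np.zeros((8, 8)), a, b))
```

In training, both terms would be applied to the network output, so gradients flow through the FFT as well as the pixel values; the relative weight `beta` balances how strongly spectral fidelity is enforced against pixel fidelity.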
About the journal:
The primary aim of Image and Vision Computing is to provide an effective medium of interchange for the results of high-quality theoretical and applied research fundamental to all aspects of image interpretation and computer vision. The journal publishes work that proposes new image interpretation and computer vision methodology or addresses the application of such methods to real-world scenes. It seeks to foster a deeper understanding of the discipline by encouraging quantitative comparison and performance evaluation of the proposed methodology. Coverage includes: image interpretation, scene modelling, object recognition and tracking, shape analysis, monitoring and surveillance, active vision and robotic systems, SLAM, biologically inspired computer vision, motion analysis, stereo vision, document image understanding, character and handwritten text recognition, face and gesture recognition, biometrics, vision-based human-computer interaction, human activity and behavior understanding, data fusion from multiple sensor inputs, and image databases.