Multi-focus image fusion via multi-scale attention and Siamese networks

IF 3 · Zone 3 (Engineering & Technology) · JCR Q2 · ENGINEERING, ELECTRICAL & ELECTRONIC

Digital Signal Processing · Pub Date: 2025-07-19 · DOI: 10.1016/j.dsp.2025.105493

Hao Zhai, Nannan Luo, You Yang, Zhendong Xu, Bo Lin

Volume 168, Article 105493 · Journal Article · https://www.sciencedirect.com/science/article/pii/S1051200425005159

Citations: 0

Abstract

Multi-focus image fusion (MFIF) aims to generate a full-focus image with an extended focus range by combining multiple images captured at different focal depths, with applications in fields such as image restoration and medical imaging. This paper proposes a deep-learning-based MFIF method that uses multi-scale attention and a Siamese network structure to efficiently extract local depth features from images and improve fusion quality. The Siamese structure lets the model process paired multi-focus images and share the feature extraction process in the deeper layers of the network. This enhances the model's expressive capability and improves its ability to distinguish images with different focal depths, so the network can effectively capture local depth features that provide rich information for subsequent fusion. A multi-scale dilated convolution attention module dynamically adapts the receptive field size to cover more pixels, which lets information be aggregated over a wider area and improves feature reconstruction. Furthermore, binary segmentation and small-area filtering are employed to enhance the consistency of the fused image. Experimental results show that the proposed method surpasses existing multi-focus image fusion methods in both subjective visual quality and objective evaluation metrics.
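The abstract names binary segmentation and small-area filtering as the consistency step but gives no parameters. The following is a minimal sketch of what such a post-processing stage might look like, not the authors' implementation. The function names (`binarize`, `filter_small_regions`, `fuse`), the 0.5 threshold, the 4-connectivity, and the `min_area` value are all assumptions made for illustration. The input is taken to be a per-pixel focus score in [0, 1], where higher values mean source A is sharper.

```python
from collections import deque


def binarize(focus_map, threshold=0.5):
    """Binary segmentation: 1 where source A is judged sharper, else 0.

    The 0.5 threshold is an assumed value, not taken from the paper.
    """
    return [[1 if v >= threshold else 0 for v in row] for row in focus_map]


def filter_small_regions(mask, min_area):
    """Flip every 4-connected region smaller than min_area to the other label.

    Regions are found on the original mask with a breadth-first search,
    so isolated specks and small holes are both removed.
    """
    h, w = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    seen = [[False] * w for _ in range(h)]
    for sy in range(h):
        for sx in range(w):
            if seen[sy][sx]:
                continue
            label = mask[sy][sx]
            seen[sy][sx] = True
            region = [(sy, sx)]
            queue = deque(region)
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx] and mask[ny][nx] == label):
                        seen[ny][nx] = True
                        region.append((ny, nx))
                        queue.append((ny, nx))
            if len(region) < min_area:
                for y, x in region:
                    out[y][x] = 1 - label
    return out


def fuse(img_a, img_b, mask):
    """Pick each pixel from img_a where mask is 1, otherwise from img_b."""
    return [[a if m else b for a, b, m in zip(ra, rb, rm)]
            for ra, rb, rm in zip(img_a, img_b, mask)]


# Hypothetical 4x4 focus scores; the 0.9 at row 2, column 3 is an isolated speck.
scores = [
    [0.9, 0.8, 0.2, 0.1],
    [0.9, 0.1, 0.2, 0.1],
    [0.8, 0.9, 0.3, 0.9],
    [0.7, 0.9, 0.2, 0.1],
]
cleaned = filter_small_regions(binarize(scores), min_area=2)
```

In this toy input the isolated pixel is relabelled to match its surroundings, so the decision map splits cleanly into a left (source A) and right (source B) region before `fuse` selects pixels. A real pipeline would more likely use a library routine for connected-component labelling and an area threshold scaled to the image size.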




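Source journal

Digital Signal Processing (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 5.30
Self-citation rate: 17.20%
Articles published: 435
Review time: 66 days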

About the journal: Digital Signal Processing: A Review Journal is one of the oldest and most established journals in the field of signal processing, yet it aims to be the most innovative. The Journal invites top-quality research articles at the frontiers of research in all aspects of signal processing. Our objective is to provide a platform for the publication of ground-breaking research in signal processing with both academic and industrial appeal. The journal has a special emphasis on statistical signal processing methodology such as Bayesian signal processing, and encourages articles on emerging applications of signal processing such as:
• big data
• machine learning
• internet of things
• information security
• systems biology and computational biology
• financial time series analysis
• autonomous vehicles
• quantum computing
• neuromorphic engineering
• human-computer interaction and intelligent user interfaces
• environmental signal processing
• geophysical signal processing including seismic signal processing
• cheminformatics and bioinformatics
• audio, visual and performance arts
• disaster management and prevention
• renewable energy,