Multimodal medical image fusion based on dilated convolution and attention-based graph convolutional network

Impact Factor 4.0 · CAS Tier 3 (Computer Science) · JCR Q1, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE
Kaixin Jin, Xiwen Wang, Lifang Wang, Wei Guo, Qiang Han, Xiaoqing Yu
{"title":"Multimodal medical image fusion based on dilated convolution and attention-based graph convolutional network","authors":"Kaixin Jin ,&nbsp;Xiwen Wang ,&nbsp;Lifang Wang ,&nbsp;Wei Guo ,&nbsp;Qiang Han ,&nbsp;Xiaoqing Yu","doi":"10.1016/j.compeleceng.2025.110359","DOIUrl":null,"url":null,"abstract":"<div><div>To address the limitations of current deep learning-based multimodal medical image fusion methods, which include insufficient representation of low- and high-level semantic information and weak robustness against noisy data, this paper proposes a multimodal medical image fusion method based on dilated convolution and attention-based graph convolutional networks, termed CGCN-Fusion. The framework comprises three main components: an encoder, a fusion module, and a decoder. The encoder integrates a low-level CNN encoder with a high-level GCN encoder to comprehensively enhance the representation of semantic features. Specifically, the low-level CNN encoder, through the introduction of dilated convolution technology, significantly enhances its ability to represent low-level semantic information, enabling the model to capture fine-grained details of images better. Meanwhile, the high-level GCN encoder leverages the unique strengths of graph convolutional networks to improve robustness against noise while integrating self-attention and multi-head attention mechanisms for a richer high-level semantic representation. The self-attention mechanism captures semantic information of key nodes, while the multi-head attention integrates structural encoding to model spatial-semantic relationships among nodes. The fusion module systematically integrates the extracted features, and the decoder reconstructs the fused image. Through comparisons with nine state-of-the-art methods, CGCN-Fusion demonstrates superior visual quality and objective performance, validating its effectiveness and clinical applicability in multimodal medical image fusion.</div></div>","PeriodicalId":50630,"journal":{"name":"Computers & Electrical Engineering","volume":"124 ","pages":"Article 110359"},"PeriodicalIF":4.0000,"publicationDate":"2025-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computers & Electrical Engineering","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0045790625003027","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
Citations: 0

Abstract

To address the limitations of current deep learning-based multimodal medical image fusion methods, namely insufficient representation of low- and high-level semantic information and weak robustness to noisy data, this paper proposes CGCN-Fusion, a multimodal medical image fusion method based on dilated convolution and an attention-based graph convolutional network. The framework comprises three main components: an encoder, a fusion module, and a decoder. The encoder pairs a low-level CNN encoder with a high-level GCN encoder to enhance the representation of semantic features at both levels. Specifically, the low-level CNN encoder introduces dilated convolutions, markedly strengthening its representation of low-level semantic information and enabling the model to better capture fine-grained image details. Meanwhile, the high-level GCN encoder exploits the strengths of graph convolutional networks to improve robustness to noise, while integrating self-attention and multi-head attention mechanisms for a richer high-level semantic representation. The self-attention mechanism captures the semantic information of key nodes, while the multi-head attention incorporates structural encoding to model spatial-semantic relationships among nodes. The fusion module integrates the extracted features, and the decoder reconstructs the fused image. In comparisons with nine state-of-the-art methods, CGCN-Fusion delivers superior visual quality and objective performance, validating its effectiveness and clinical applicability in multimodal medical image fusion.
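The abstract only sketches the architecture, so the following is a minimal PyTorch sketch of the two encoder ideas it names: a low-level CNN branch built from dilated convolutions, and a high-level graph branch that treats feature-map positions as graph nodes and combines self-attention with multi-head attention over a structural encoding. All module names, channel sizes, dilation rates, and the use of the attention map as a learned adjacency are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DilatedCNNEncoder(nn.Module):
    """Low-level encoder sketch: stacked dilated convolutions enlarge the
    receptive field without downsampling, preserving fine-grained detail."""
    def __init__(self, in_ch=1, ch=32):
        super().__init__()
        # Increasing dilation rates (1, 2, 4) widen spatial context per layer
        # while keeping the feature-map resolution unchanged.
        self.convs = nn.ModuleList([
            nn.Conv2d(in_ch if d == 1 else ch, ch, 3, padding=d, dilation=d)
            for d in (1, 2, 4)
        ])

    def forward(self, x):
        for conv in self.convs:
            x = F.relu(conv(x))
        return x  # (B, ch, H, W) low-level feature map

class AttentionGCNEncoder(nn.Module):
    """High-level encoder sketch: feature-map positions become graph nodes;
    self-attention highlights key-node semantics, and multi-head attention
    over nodes plus a learned structural encoding models spatial-semantic
    relations. The attention map is reused as a dense, learned adjacency."""
    def __init__(self, ch=32, heads=4, max_nodes=64 * 64):
        super().__init__()
        self.struct_enc = nn.Parameter(torch.zeros(1, max_nodes, ch))
        self.self_attn = nn.MultiheadAttention(ch, 1, batch_first=True)
        self.multi_head = nn.MultiheadAttention(ch, heads, batch_first=True)
        self.gcn_weight = nn.Linear(ch, ch)  # shared GCN node transform

    def forward(self, feat):
        b, c, h, w = feat.shape
        nodes = feat.flatten(2).transpose(1, 2)            # (B, N, C) nodes
        nodes = nodes + self.struct_enc[:, : h * w]        # structural encoding
        key_info, _ = self.self_attn(nodes, nodes, nodes)  # key-node semantics
        rel, adj = self.multi_head(nodes, nodes, nodes)    # relations + (B,N,N) weights
        # One GCN-style propagation step: aggregate neighbors via the learned
        # adjacency, then apply the shared linear transform and nonlinearity.
        out = F.relu(self.gcn_weight(torch.bmm(adj, key_info + rel)))
        return out.transpose(1, 2).reshape(b, c, h, w)

if __name__ == "__main__":
    x = torch.randn(1, 1, 64, 64)        # e.g. a single-channel MRI slice
    low = DilatedCNNEncoder()(x)
    high = AttentionGCNEncoder()(low)
    print(low.shape, high.shape)          # both torch.Size([1, 32, 64, 64])
```

The fusion module and decoder are not specified in the abstract; under the same assumptions they would combine the CNN and GCN feature maps from each modality (for example, by concatenation or weighted summation) and reconstruct the fused image with a convolutional decoder.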
Source journal

Computers & Electrical Engineering (Engineering Technology – Electrical & Electronic Engineering)

CiteScore: 9.20
Self-citation rate: 7.00%
Annual publications: 661
Review time: 47 days
Aims and scope: The impact of computers has nowhere been more revolutionary than in electrical engineering. The design, analysis, and operation of electrical and electronic systems are now dominated by computers, a transformation motivated by the natural ease of interface between computers and electrical systems and by the promise of spectacular improvements in speed and efficiency. Published since 1973, Computers & Electrical Engineering provides rapid publication of topical research into the integration of computer technology and computational techniques with electrical and electronic systems. The journal publishes papers featuring novel implementations of computers and computational techniques in areas such as signal and image processing, high-performance computing, parallel processing, and communications. Special attention is paid to papers describing innovative architectures, algorithms, and software tools.