Multimodal medical image fusion based on dilated convolution and attention-based graph convolutional network
Kaixin Jin, Xiwen Wang, Lifang Wang, Wei Guo, Qiang Han, Xiaoqing Yu
Computers & Electrical Engineering, Volume 124, Article 110359, May 2025
DOI: 10.1016/j.compeleceng.2025.110359
URL: https://www.sciencedirect.com/science/article/pii/S0045790625003027
Abstract
Current deep learning-based multimodal medical image fusion methods suffer from insufficient representation of low- and high-level semantic information and weak robustness to noisy data. To address these limitations, this paper proposes a multimodal medical image fusion method based on dilated convolution and an attention-based graph convolutional network, termed CGCN-Fusion. The framework comprises three main components: an encoder, a fusion module, and a decoder. The encoder pairs a low-level CNN encoder with a high-level GCN encoder to enhance the representation of semantic features at both levels. Specifically, the low-level CNN encoder uses dilated convolutions to strengthen its representation of low-level semantic information, enabling the model to better capture fine-grained image details. Meanwhile, the high-level GCN encoder exploits the strengths of graph convolutional networks to improve robustness to noise and integrates self-attention and multi-head attention mechanisms for a richer high-level semantic representation: the self-attention mechanism captures the semantic information of key nodes, while the multi-head attention incorporates structural encoding to model spatial-semantic relationships among nodes. The fusion module integrates the extracted features, and the decoder reconstructs the fused image. In comparisons with nine state-of-the-art methods, CGCN-Fusion delivers superior visual quality and objective performance, validating its effectiveness and clinical applicability in multimodal medical image fusion.
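The abstract describes the architecture but not its implementation. Below is a minimal PyTorch sketch of how the described pipeline could be organized: a low-level CNN encoder built from parallel dilated convolutions, a high-level GCN encoder that treats image patches as graph nodes and applies a graph-convolution step followed by multi-head self-attention with a learned structural encoding, and then a fusion module and convolutional decoder. All module names, channel widths, the patch-graph construction, and the similarity-based adjacency are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn


class DilatedCNNEncoder(nn.Module):
    """Low-level encoder sketch: parallel 3x3 convolutions with dilation
    rates 1, 2, and 4 enlarge the receptive field without losing
    resolution; a 1x1 convolution merges the branches."""

    def __init__(self, in_ch=1, ch=32):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv2d(in_ch, ch, 3, padding=d, dilation=d) for d in (1, 2, 4)]
        )
        self.merge = nn.Conv2d(3 * ch, ch, 1)

    def forward(self, x):
        feats = torch.cat([torch.relu(b(x)) for b in self.branches], dim=1)
        return torch.relu(self.merge(feats))


class GCNAttentionEncoder(nn.Module):
    """High-level encoder sketch: non-overlapping patches become graph
    nodes, one graph-convolution step runs over a feature-similarity
    adjacency (an assumption), then multi-head self-attention with a
    learned structural encoding models relationships among nodes."""

    def __init__(self, ch=32, img_size=256, patch=16, heads=4):
        super().__init__()
        n = (img_size // patch) ** 2              # number of graph nodes
        self.unfold = nn.Unfold(patch, stride=patch)
        self.fold = nn.Fold(img_size, patch, stride=patch)
        self.node_proj = nn.Linear(ch * patch * patch, ch)
        self.gcn_weight = nn.Linear(ch, ch)
        self.struct_enc = nn.Parameter(torch.zeros(1, n, ch))
        self.attn = nn.MultiheadAttention(ch, heads, batch_first=True)
        self.out_proj = nn.Linear(ch, ch * patch * patch)

    def forward(self, x):
        h = self.node_proj(self.unfold(x).transpose(1, 2))   # (B, n, ch)
        # one GCN step: row-normalized similarity adjacency, then H' = A H W
        adj = torch.softmax(h @ h.transpose(1, 2) / h.size(-1) ** 0.5, dim=-1)
        h = torch.relu(adj @ self.gcn_weight(h))
        # multi-head self-attention, with structural encoding added first
        h = h + self.struct_enc
        h, _ = self.attn(h, h, h)
        h = self.out_proj(h).transpose(1, 2)                 # (B, ch*p*p, n)
        return self.fold(h)                      # back to a feature map


class CGCNFusion(nn.Module):
    """Encoder (CNN + GCN streams) -> fusion module -> decoder."""

    def __init__(self, img_size=256, patch=16, ch=32):
        super().__init__()
        self.cnn = DilatedCNNEncoder(1, ch)
        self.gcn = GCNAttentionEncoder(ch, img_size, patch)
        self.fuse = nn.Conv2d(2 * ch, ch, 1)     # fusion-module placeholder
        self.decode = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, 1, 3, padding=1), nn.Sigmoid(),
        )

    def encode(self, x):
        low = self.cnn(x)                # low-level semantic features
        return low + self.gcn(low)       # add high-level graph features

    def forward(self, a, b):             # a, b: two registered modalities
        fused = self.fuse(torch.cat([self.encode(a), self.encode(b)], dim=1))
        return self.decode(fused)


# Usage: fuse two registered 256x256 single-channel images.
model = CGCNFusion()
mri, pet = torch.rand(1, 1, 256, 256), torch.rand(1, 1, 256, 256)
out = model(mri, pet)                    # fused image, shape (1, 1, 256, 256)
```

Combining the two encoder streams by addition and fusing modalities with a 1x1 convolution are placeholders; the abstract does not specify the paper's actual fusion module or decoder design.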
Journal introduction:
The impact of computers has nowhere been more revolutionary than in electrical engineering. The design, analysis, and operation of electrical and electronic systems are now dominated by computers, a transformation that has been motivated by the natural ease of interface between computers and electrical systems, and the promise of spectacular improvements in speed and efficiency.
Published since 1973, Computers & Electrical Engineering provides rapid publication of topical research into the integration of computer technology and computational techniques with electrical and electronic systems. The journal publishes papers featuring novel implementations of computers and computational techniques in areas like signal and image processing, high-performance computing, parallel processing, and communications. Special attention will be paid to papers describing innovative architectures, algorithms, and software tools.