Spatial-frequency dual-domain Kolmogorov–Arnold networks for multimodal medical image fusion

Lewu Lin, Jiaxin Xie, Yingying Wang, Jialing Huang, Rongjin Zhuang, Xiaotong Tu, Xinghao Ding, Na Shen, Qing Lu

Neurocomputing, Volume 648, Article 130661. Published 2025-06-16. DOI: 10.1016/j.neucom.2025.130661
Citations: 0
Abstract
Multimodal Medical Image Fusion (MMIF) can significantly enhance the efficiency and accuracy of clinical diagnosis and treatment by integrating medical images from different modalities into a single, information-rich image. Recent advances in Kolmogorov–Arnold Networks (KAN) have demonstrated significant potential for nonlinear fitting, owing to their ability to decompose complex multivariate functions into simpler univariate functions while maintaining high accuracy and interpretability. While most existing methods focus on developing increasingly complex architectures, addressing MMIF from a frequency-analysis perspective, leveraging both the spatial and frequency domains for interpretable and effective cross-modality fusion through KAN, remains underexplored. To address this gap, we introduce the Spatial-Frequency Dual-domain KAN (SFDKAN), a novel framework for MMIF. First, we apply a Hierarchical Wavelet Decomposition strategy to split each input modality into different frequency bands and bring the powerful nonlinear mapping capability of KAN to the sub-bands at each frequency. This refines unimodal feature extraction and improves the retention of high-frequency details and structural integrity. Next, we design a Spatial-Frequency Integration KAN (SFIKAN) that leverages complementary information from the spatial and frequency domains to enable effective cross-modality feature interaction and fusion. The Spatial KAN focuses on critical regions of the fusion result while ignoring irrelevant areas and suppressing redundant information. Meanwhile, the Frequency KAN overcomes the locality of the spatial domain, effectively handling long-range dependencies and enhancing global feature representation, thereby enabling more efficient cross-modality feature fusion.
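The hierarchical wavelet decomposition stage described above can be illustrated with a one-level 2D Haar transform, which splits an image into a low-frequency approximation and three high-frequency detail sub-bands; recursing on the approximation yields the hierarchy. This is a minimal NumPy sketch under stated assumptions, not the paper's implementation: the function name `haar_dwt2` and the averaging normalization are illustrative choices.

```python
import numpy as np

def haar_dwt2(x):
    """One level of 2D Haar decomposition: split an image into a low-frequency
    approximation (LL) and high-frequency detail sub-bands (LH, HL, HH)."""
    a = (x[0::2, :] + x[1::2, :]) / 2.0   # row-pair averages (low-pass along rows)
    d = (x[0::2, :] - x[1::2, :]) / 2.0   # row-pair differences (high-pass along rows)
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0  # low-low: coarse approximation
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0  # horizontal details
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0  # vertical details
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0  # diagonal details
    return ll, (lh, hl, hh)

img = np.random.default_rng(0).random((64, 64))
ll, (lh, hl, hh) = haar_dwt2(img)
# Hierarchical decomposition: recurse on the approximation band for deeper levels
ll2, _ = haar_dwt2(ll)
print(ll.shape, ll2.shape)  # (32, 32) (16, 16)
```

With this normalization the even-indexed samples are exactly recoverable as `ll + lh + hl + hh`, which is a quick sanity check that no information is lost across the sub-bands.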
Extensive experiments on CT-MRI, PET-MRI, and SPECT-MRI datasets demonstrate the superiority of our method over state-of-the-art (SOTA) medical image fusion algorithms in both quantitative metrics and visual quality. The code will be available at https://github.com/xiejiaaax/SFDKAN.
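The KAN building block the abstract relies on composes multivariate maps entirely from learnable univariate functions, one per edge. The NumPy sketch below conveys only that structural idea and is not the paper's model: it uses a Gaussian-bump basis where typical KAN implementations use B-splines, and `KANLayer`/`edge_fn` are hypothetical names introduced here for illustration.

```python
import numpy as np

def edge_fn(x, coeffs, centers, width=0.5):
    """Learnable univariate function phi(x): a weighted sum of Gaussian bumps
    (a stand-in for the spline bases used in actual KAN implementations)."""
    return float(np.dot(coeffs, np.exp(-((x - centers) ** 2) / (2 * width ** 2))))

class KANLayer:
    """Computes y_q = sum_p phi_{q,p}(x_p): every input-output edge carries its
    own univariate function, so the multivariate map is built purely from
    univariate ones, in the spirit of the Kolmogorov-Arnold representation."""
    def __init__(self, n_in, n_out, n_basis=8, seed=0):
        rng = np.random.default_rng(seed)
        self.centers = np.linspace(-2.0, 2.0, n_basis)       # shared basis grid
        self.coeffs = rng.normal(0.0, 0.1, (n_out, n_in, n_basis))

    def __call__(self, x):
        n_out, n_in, _ = self.coeffs.shape
        return np.array([
            sum(edge_fn(x[p], self.coeffs[q, p], self.centers) for p in range(n_in))
            for q in range(n_out)
        ])

layer = KANLayer(n_in=3, n_out=2)
y = layer(np.array([0.1, -0.5, 1.2]))
print(y.shape)  # (2,)
```

In a trained KAN the per-edge coefficients are fit by gradient descent; each learned univariate curve can be plotted directly, which is the source of the interpretability the abstract mentions.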
Journal overview:
Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing, covering theory, practice, and applications.