FE-MAANet: A frequency-enhanced multi-scale adaptive attention network for medical image segmentation

IF 4.9 · Zone 2, Medicine · JCR Q1, ENGINEERING, BIOMEDICAL

Biomedical Signal Processing and Control · Pub Date: 2025-10-10 · DOI: 10.1016/j.bspc.2025.108884

Ke Peng, Bi Hong, YiYang Liu, Chunyu Chen, Yu Pan, Yu Jiang, XiaoJuan Liu

Volume 113, Article 108884 · Journal Article · Not open access
Full text: https://www.sciencedirect.com/science/article/pii/S1746809425013953

Citations: 0

Abstract



Medical image segmentation is vital for clinical diagnostic support. In recent years, convolutional neural networks (CNNs), particularly U-Net, have made significant advances in this field. However, most existing methods rely on fixed-size convolution operations for feature extraction, which restricts their adaptability in capturing multi-scale features. These methods also face challenges in effectively modeling global contextual information. To address these limitations, we propose a frequency-enhanced multi-scale adaptive attention network (FE-MAANet) based on a U-shaped architecture. Specifically, we propose a novel Multi-scale Adaptive Large Kernel (MSALK) module. MSALK extracts multi-scale features through cascaded depthwise separable convolutions of different types and sizes, along with a two-step feature calibration strategy that progressively integrates features from different receptive fields, thus optimizing feature representations and improving the model's adaptability to multi-scale features. Moreover, we design a Frequency-Spatial Parallel Attention (FSPA) module integrated within the skip connections. FSPA adopts a dual-branch strategy to jointly leverage global information in the frequency domain and spatial detail information, avoiding the loss of local fine-grained details while enhancing the capability for global contextual modeling. We evaluate the effectiveness of our method on three challenging public datasets: the MICCAI 2015 Multi-Atlas Abdominal Labeling Challenge (Synapse) dataset, the Automated Cardiac Diagnosis (ACDC) dataset, and the Aortic Vessel Tracing (AVT) dataset. Extensive experiments demonstrate that, compared to previous state-of-the-art methods, our approach achieves superior segmentation performance with fewer parameters. Additionally, we conduct ablation studies showing that optimizing two key parameters of the MSALK module (kernel size and dilation rate) significantly improves segmentation.
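The abstract names kernel size and dilation rate as the two key MSALK parameters and reports fewer parameters overall, but it gives no concrete values. As a rough illustration only, the sketch below uses standard convolution arithmetic to show how those two settings determine the receptive field of a cascade of stride-1 convolutions, and how much a depthwise separable convolution saves over a standard one. The function names, the 5x5 and dilated 7x7 cascade, and the 64-channel width are assumptions chosen for the example; they are not taken from the paper.

```python
from typing import Iterable, Tuple


def receptive_field(layers: Iterable[Tuple[int, int]]) -> int:
    """Receptive field of stride-1 convolutions applied in sequence.

    Each layer is (kernel_size, dilation). A dilated kernel spans
    dilation * (kernel_size - 1) + 1 input positions, so each stride-1
    layer widens the receptive field by dilation * (kernel_size - 1).
    """
    rf = 1
    for kernel_size, dilation in layers:
        rf += dilation * (kernel_size - 1)
    return rf


def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a k x k depthwise conv plus a 1 x 1 pointwise conv."""
    return k * k * c_in + c_in * c_out


if __name__ == "__main__":
    # Hypothetical cascade (not from the paper): a 5x5 depthwise conv
    # followed by a 7x7 depthwise conv with dilation 3.
    cascade = [(5, 1), (7, 3)]
    print(receptive_field(cascade))               # 1 + 4 + 18 = 23
    print(standard_conv_params(64, 64, 7))        # 49 * 64 * 64 = 200704
    print(depthwise_separable_params(64, 64, 7))  # 49 * 64 + 64 * 64 = 7232
```

Under these assumed settings the cascade covers a 23-pixel receptive field, and the depthwise separable form needs about 28 times fewer weights than a standard 7x7 convolution at the same channel width. This is the general reason such large-kernel designs can stay lightweight.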


Source journal

Biomedical Signal Processing and Control (Engineering & Technology - Engineering: Biomedical)

CiteScore: 9.80
Self-citation rate: 13.70%
Articles published: 822
Review time: 4 months

About the journal: Biomedical Signal Processing and Control aims to provide a cross-disciplinary international forum for the interchange of information on research in the measurement and analysis of signals and images in clinical medicine and the biological sciences. Emphasis is placed on contributions dealing with practical, applications-led research on the use of methods and devices in clinical diagnosis, patient monitoring and management. Biomedical Signal Processing and Control reflects the main areas in which these methods are being used and developed at the interface of engineering and clinical science. The scope of the journal is defined to include relevant review papers, technical notes, short communications and letters. Tutorial papers and special issues will also be published.