Ke Peng, Bi Hong, YiYang Liu, Chunyu Chen, Yu Pan, Yu Jiang, XiaoJuan Liu
Biomedical Signal Processing and Control, Volume 113, Article 108884. Published 2025-10-10. DOI: 10.1016/j.bspc.2025.108884
https://www.sciencedirect.com/science/article/pii/S1746809425013953
Impact factor: 4.9; JCR: Q1 (Engineering, Biomedical)
FE-MAANet: A frequency-enhanced multi-scale adaptive attention network for medical image segmentation
Medical image segmentation is vital for clinical diagnostic support. In recent years, convolutional neural networks (CNNs), particularly U-Net, have made significant advancements in this field. However, most existing methods rely on fixed-size convolution operations for feature extraction, which restricts their adaptability in capturing multi-scale features. These methods also face challenges in effectively modeling global contextual information. To address these limitations, we propose a frequency-enhanced multi-scale adaptive attention network (FE-MAANet) based on a U-shaped architecture. Specifically, we propose a novel Multi-scale Adaptive Large Kernel (MSALK) module. MSALK extracts multi-scale features through cascaded depthwise separable convolutions of different types and sizes, along with a two-step feature calibration strategy that progressively integrates features from different receptive fields, thus optimizing feature representations and improving the model’s adaptability to multi-scale features. Moreover, we design a Frequency-Spatial Parallel Attention (FSPA) module integrated within the skip connections. FSPA adopts a dual-branch strategy to collaboratively leverage global information in the frequency domain and spatial detail information, avoiding the loss of local fine-grained details while enhancing the capability for global contextual modeling. We evaluate the effectiveness of our method on three challenging public datasets: the MICCAI 2015 Multi-Atlas Abdominal Labeling Challenge (Synapse) dataset, the Automated Cardiac Diagnosis (ACDC) dataset, and the Aortic Vessel Tracing (AVT) dataset. Extensive experiments demonstrate that, compared to previous state-of-the-art methods, our approach achieves superior segmentation performance with fewer parameters. Additionally, we conduct ablation studies showing that tuning two key parameters of the MSALK module (kernel size and dilation rate) yields significant segmentation improvement.
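The abstract names kernel size and dilation rate as the two key MSALK parameters and credits depthwise separable convolutions for part of the parameter savings. The sketch below is only standard convolution arithmetic, not the authors' implementation. It shows how a cascade of stride-1 convolutions grows the receptive field, and how the weight count of a depthwise separable layer compares with a standard one. The kernel sizes, dilation and channel width are hypothetical values chosen for illustration; the abstract does not state the paper's actual settings.

```python
def receptive_field(layers):
    """Receptive field of cascaded stride-1 convolutions.

    layers: iterable of (kernel_size, dilation) pairs.
    """
    rf = 1
    for k, d in layers:
        rf += d * (k - 1)  # each layer widens the field by d * (k - 1)
    return rf


def conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def dw_separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k conv plus a pointwise 1 x 1 conv (bias ignored)."""
    return k * k * c_in + c_in * c_out


# Hypothetical cascade (assumed values, not from the paper):
# a 5x5 depthwise conv, then a 7x7 depthwise conv with dilation 3.
cascade = [(5, 1), (7, 3)]
channels = 64  # assumed channel width

rf = receptive_field(cascade)
cascade_weights = sum(dw_separable_params(k, channels, channels) for k, _ in cascade)
dense_weights = conv_params(rf, channels, channels)

print(rf)               # 23
print(cascade_weights)  # 12928
print(dense_weights)    # 2166784
```

Under these assumed settings the cascade covers a 23 x 23 window with about 13 thousand weights, whereas a single dense 23 x 23 convolution over the same channels would need over 2 million. Dilation enlarges the receptive field without adding weights, which is one reason both parameters are natural targets for an ablation study.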
About the journal:
Biomedical Signal Processing and Control aims to provide a cross-disciplinary international forum for the interchange of information on research in the measurement and analysis of signals and images in clinical medicine and the biological sciences. Emphasis is placed on contributions dealing with the practical, applications-led research on the use of methods and devices in clinical diagnosis, patient monitoring and management.
Biomedical Signal Processing and Control reflects the main areas in which these methods are being used and developed at the interface of both engineering and clinical science. The scope of the journal is defined to include relevant review papers, technical notes, short communications and letters. Tutorial papers and special issues will also be published.