SAMP-Net: a medical image segmentation network with split attention and multi-layer perceptron
Authors: Xiaoxuan Ma, Sihan Shan, Dong Sui
Journal: Medical & Biological Engineering & Computing (impact factor 2.6; JCR Q2, Computer Science, Interdisciplinary Applications)
Published: 2025-03-11
DOI: https://doi.org/10.1007/s11517-025-03331-z
Abstract
Convolutional neural networks (CNNs) have achieved remarkable success in computer vision, particularly in medical image segmentation. U-Net, a prominent architecture, marked a major breakthrough and remains widely used in practice. However, its uniform downsampling strategy and simple stacking of convolutional layers in the encoder limit its ability to capture rich features at multiple depths and reduce its efficiency for rapid image processing. To address these limitations, this paper proposes a novel segmentation network that integrates attention mechanisms with multilayer perceptrons (MLPs) and progressively captures and refines features at different levels. In the low-level layers, the primary feature conservation (PFC) block preserves essential spatial details and reduces the loss of primary features during downsampling. In the mid-level layers, the compact attention block (CAB) enhances feature interaction through a multi-path attention structure, improving the network's ability to capture diverse semantic information. In the high-level layers, Shift MLP and Tokenized MLP blocks are incorporated. The Shift MLP block shifts feature channels along different axes, which enhances local feature modeling by focusing on specific regions of the convolutional features. The Tokenized MLP block converts these features into abstract tokens and uses MLPs to model their representations in the latent space, reducing the number of parameters and computational complexity while improving segmentation performance. Experiments on the colorectal cancer tumor dataset CCI and the public dataset ISIC-2018 show that the proposed method significantly outperforms U-Net, U-Net++, Swin-U-Net, Attention U-Net, and RA-U-Net, with average improvements of 6.67%, 5.53%, 10.18%, 4.78%, and 3.55%, respectively.
The code is available at the following link: https://github.com/QingTianer/SAMP-Net.git.
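The channel shifting described for the Shift MLP block can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that); it only shows the shift step, without the MLP that would follow it. The function names, the choice of three channel groups, the offsets of -1, 0, and +1, and the zero padding at the borders are all assumptions made for illustration. Plain Python lists are used instead of a tensor library so the indexing is explicit.

```python
from typing import List

Plane = List[List[float]]   # one channel, indexed as [row][col]
Features = List[Plane]      # feature map, indexed as [channel][row][col]


def shift_plane(plane: Plane, offset: int, axis: str) -> Plane:
    """Shift one channel by `offset` positions along `axis`, zero-filling the gap."""
    h, w = len(plane), len(plane[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            # Source position that lands on (r, c) after the shift.
            src_r = r - offset if axis == "height" else r
            src_c = c - offset if axis == "width" else c
            if 0 <= src_r < h and 0 <= src_c < w:
                out[r][c] = plane[src_r][src_c]
    return out


def axial_shift(features: Features, axis: str, num_groups: int = 3) -> Features:
    """Split channels into groups and shift each group by a different offset.

    Assumption: group g gets offset g - num_groups // 2, so three groups
    receive offsets -1, 0, +1. The paper's actual offsets may differ.
    """
    if axis not in ("height", "width"):
        raise ValueError("axis must be 'height' or 'width'")
    channels = len(features)
    group_size = -(-channels // num_groups)  # ceiling division
    shifted = []
    for ch, plane in enumerate(features):
        offset = ch // group_size - num_groups // 2
        shifted.append(shift_plane(plane, offset, axis))
    return shifted


if __name__ == "__main__":
    plane = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    features = [plane, plane, plane]  # three identical channels
    for ch in axial_shift(features, axis="width"):
        print(ch)
```

After the shift, each spatial position holds values from neighbouring positions in different channel groups, so an MLP applied across channels at a single position mixes information from a small local window along the shifted axis. Applying the shift once along the width and once along the height covers both axes.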
About the journal:
Founded in 1963, Medical & Biological Engineering & Computing (MBEC) continues to serve the biomedical engineering community, covering the entire spectrum of biomedical and clinical engineering. The journal presents exciting and vital experimental and theoretical developments in biomedical science and technology, and reports on advances in computer-based methodologies in these multidisciplinary subjects. The journal also incorporates new and evolving technologies including cellular engineering and molecular imaging.
MBEC publishes original research articles as well as reviews and technical notes. Its Rapid Communications category focuses on material of immediate value to the readership, while the Controversies section provides a forum to exchange views on selected issues, stimulating a vigorous and informed debate in this exciting and high-profile field.
MBEC is an official journal of the International Federation of Medical and Biological Engineering (IFMBE).