MFCPNet: Real-time medical image segmentation network via multi-scale feature fusion and channel pruning

IF 4.9 | Zone 2 (Medicine) | JCR Q1, ENGINEERING, BIOMEDICAL

Biomedical Signal Processing and Control | Pub Date: 2024-10-29 | DOI: 10.1016/j.bspc.2024.107074

Linlin Hou, Zishen Yan, Christian Desrosiers, Hui Liu

Volume 100, Article 107074 | Source: https://www.sciencedirect.com/science/article/pii/S1746809424011327

Citations: 0

Abstract

Real-time medical image segmentation not only enhances the interactivity and feasibility of applications but also supports a wider range of medical application scenarios. Local feature extraction methods that rely on Convolutional Neural Networks (CNNs) are hampered by restricted receptive fields, which weakens their ability to capture comprehensive information. Conversely, global feature extraction methods based on Transformers generally struggle in real-time tasks because of their extensive computational demands. To address these challenges and explore accurate, real-time medical image segmentation models, we introduce MFCPNet. First, MFCPNet uses a Multi-Scale Multi-Channel Convolution (MSMC Conv) to extract local features across various levels and scales. This design extracts richer local information without unduly burdening the model. Second, to enlarge the receptive field of convolution and improve the model’s generalization capability, we introduce an Attention Block (Attn Block) with rotation invariance. This block, inspired by lightweight Bi-Level Routing Attention (BRA) and MLP-Mixer, mitigates the constraints of convolutional structures and achieves superior contextual modeling. Finally, the channel count within MFCPNet is pruned to strike a trade-off between segmentation accuracy and efficiency. To evaluate the proposed method, we compare it with several classic approaches on three different types of datasets: retinal images, brain scans, and colon polyps. Across these datasets, MFCPNet achieves segmentation performance comparable to existing methods, with a computational cost of 2.2G FLOPs and 0.49M parameters. Furthermore, it reaches a processing speed of 79.54 FPS, meeting the requirements for real-time applications.
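The efficiency figures in the abstract can be made more concrete with a little arithmetic. The sketch below is not the authors' MSMC Conv. It only counts parameters for a generic set of parallel convolution branches with different kernel sizes, and shows why reducing channel width shrinks a convolutional layer roughly quadratically. The kernel sizes (1, 3, 5), the channel widths (64 and 32), and all function names are illustrative assumptions that do not come from the paper. The last helper converts the reported 79.54 FPS into a per-frame time budget.

```python
def conv_params(c_in: int, c_out: int, k: int, bias: bool = True) -> int:
    """Learnable parameters of a standard k x k 2D convolution."""
    return k * k * c_in * c_out + (c_out if bias else 0)


def multi_scale_params(c_in: int, c_out_per_branch: int, kernel_sizes) -> int:
    """Total parameters of parallel conv branches, one per kernel size.

    Hypothetical layout for illustration only; the paper's actual
    MSMC Conv structure is not described in the abstract.
    """
    return sum(conv_params(c_in, c_out_per_branch, k) for k in kernel_sizes)


def frame_latency_ms(fps: float) -> float:
    """Average time available per frame, in milliseconds, at a given FPS."""
    return 1000.0 / fps


if __name__ == "__main__":
    kernels = (1, 3, 5)                           # assumed kernel sizes
    full = multi_scale_params(64, 64, kernels)    # assumed unpruned width
    pruned = multi_scale_params(32, 32, kernels)  # assumed width after halving
    print(f"full width:   {full} parameters")
    print(f"halved width: {pruned} parameters ({pruned / full:.1%} of full)")
    print(f"79.54 FPS -> {frame_latency_ms(79.54):.2f} ms per frame")
```

Halving both input and output channels leaves about a quarter of the weights, which is why channel-count reduction is a common lever when trading accuracy for speed. At 79.54 FPS the network has roughly 12.6 ms to process each frame.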

Source journal

Biomedical Signal Processing and Control (Engineering & Technology: Engineering, Biomedical)

CiteScore: 9.80
Self-citation rate: 13.70%
Articles published: 822
Review time: 4 months

About the journal: Biomedical Signal Processing and Control aims to provide a cross-disciplinary international forum for the interchange of information on research in the measurement and analysis of signals and images in clinical medicine and the biological sciences. Emphasis is placed on contributions dealing with practical, applications-led research on the use of methods and devices in clinical diagnosis, patient monitoring and management. Biomedical Signal Processing and Control reflects the main areas in which these methods are being used and developed at the interface of engineering and clinical science. The scope of the journal includes relevant review papers, technical notes, short communications and letters. Tutorial papers and special issues will also be published.