DFEDC: Dual fusion with enhanced deformable convolution for medical image segmentation

IF 4.2 · Zone 3 (Computer Science) · JCR Q2 (Computer Science, Artificial Intelligence)

Image and Vision Computing · Pub Date: 2024-09-13 · DOI: 10.1016/j.imavis.2024.105277

Xian Fang, Yueqian Pan, Qiaohong Chen

Volume 151, Article 105277 · https://www.sciencedirect.com/science/article/pii/S0262885624003822

Citations: 0

Abstract

Given the complexity of lesion regions in medical images, current CNN-based research typically employs large-kernel convolutions to expand the receptive field and improve segmentation quality. However, these convolutions carry substantial computational cost and have limited capacity to extract contextual and multi-scale information, which makes it difficult to segment complex regions efficiently. To address this issue, we propose a dual fusion with enhanced deformable convolution network, DFEDC, which dynamically adjusts the receptive field while integrating multi-scale feature information, so that it can segment complex lesion areas and handle boundaries effectively. First, we combine global channel and spatial fusion serially, integrating and reusing global channel attention and fully connected layers to extract channel and spatial information in a lightweight manner. Second, we design a structured deformable convolution (SDC) that structures deformable convolution with inception-style structures and large kernel attention, and enhances offset learning through parallel fusion to extract multi-scale feature information efficiently. To compensate for the spatial information lost by SDC, we introduce a hybrid 2D and 3D feature extraction module that moves feature extraction from a single dimension to a fusion of 2D and 3D. Extensive experiments on the Synapse, ACDC, and ISIC-2018 datasets demonstrate that the proposed DFEDC achieves superior results.
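
Deformable convolution, the building block the abstract extends, samples the input at kernel positions shifted by per-location offsets rather than on a fixed grid. The sketch below is only an illustration of that basic idea under stated assumptions: it is not the authors' SDC (its inception structures, large kernel attention, and parallel offset fusion are not specified in this listing), it handles a single channel, and the offsets are supplied by hand, whereas a real network would predict them with a learned layer. The names `bilinear_sample`, `deformable_conv2d`, and `uniform_offsets` are hypothetical.

```python
import math
from typing import List, Tuple

Grid = List[List[float]]
# offsets[i][j] holds one (dy, dx) pair per kernel tap, in row-major tap order.
Offsets = List[List[List[Tuple[float, float]]]]


def bilinear_sample(image: Grid, y: float, x: float) -> float:
    """Bilinearly interpolate image at fractional (y, x); out-of-range pixels count as 0."""
    h, w = len(image), len(image[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for yi, wy in ((y0, 1.0 - (y - y0)), (y0 + 1, y - y0)):
        for xi, wx in ((x0, 1.0 - (x - x0)), (x0 + 1, x - x0)):
            if 0 <= yi < h and 0 <= xi < w:
                value += wy * wx * image[yi][xi]
    return value


def deformable_conv2d(image: Grid, kernel: Grid, offsets: Offsets) -> Grid:
    """Single-channel deformable convolution (stride 1, zero padding, odd square kernel).

    Each kernel tap reads the input at its regular grid position plus the
    (dy, dx) offset given for that output location and tap.
    """
    h, w = len(image), len(image[0])
    k = len(kernel)
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for t in range(k * k):
                ky, kx = divmod(t, k)
                dy, dx = offsets[i][j][t]
                acc += kernel[ky][kx] * bilinear_sample(
                    image, i + ky - r + dy, j + kx - r + dx
                )
            out[i][j] = acc
    return out


def uniform_offsets(h: int, w: int, k: int, dy: float, dx: float) -> Offsets:
    """Illustrative helper: the same (dy, dx) offset for every location and tap."""
    return [[[(dy, dx)] * (k * k) for _ in range(w)] for _ in range(h)]


image = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
identity = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]

# Zero offsets reduce to an ordinary convolution; the identity kernel returns the input.
plain = deformable_conv2d(image, identity, uniform_offsets(3, 3, 3, 0.0, 0.0))
# A one-pixel horizontal offset makes every output read its right-hand neighbour.
shifted = deformable_conv2d(image, identity, uniform_offsets(3, 3, 3, 0.0, 1.0))
# A half-pixel offset blends two neighbouring pixels through bilinear interpolation.
half = deformable_conv2d(image, identity, uniform_offsets(3, 3, 3, 0.0, 0.5))
```

With all offsets set to zero the function behaves like a standard zero-padded convolution, which is a convenient sanity check; fractional offsets are what let the effective receptive field adapt smoothly to irregular lesion shapes.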


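Source journal

Image and Vision Computing (Engineering & Technology - Engineering: Electrical & Electronic)

CiteScore: 8.50
Self-citation rate: 8.50%
Articles published: 143
Review time: 7.8 months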

Journal description: The primary aim of Image and Vision Computing is to provide an effective medium of interchange for the results of high-quality theoretical and applied research fundamental to all aspects of image interpretation and computer vision. The journal publishes work that proposes new image interpretation and computer vision methodology or addresses the application of such methods to real-world scenes. It seeks to deepen understanding in the discipline by encouraging quantitative comparison and performance evaluation of the proposed methodology. The coverage includes: image interpretation, scene modelling, object recognition and tracking, shape analysis, monitoring and surveillance, active vision and robotic systems, SLAM, biologically inspired computer vision, motion analysis, stereo vision, document image understanding, character and handwritten text recognition, face and gesture recognition, biometrics, vision-based human-computer interaction, human activity and behavior understanding, data fusion from multiple sensor inputs, and image databases.