A Bi-Directionally Fused Boundary Aware Network for Skin Lesion Segmentation

IEEE Transactions on Image Processing: a publication of the IEEE Signal Processing Society, vol. 33, pp. 6340-6353. Pub Date: 2024-10-23. DOI: 10.1109/TIP.2024.3482864

Feiniu Yuan;Yuhuan Peng;Qinghua Huang;Xuelong Li



Abstract

Skin lesions with irregular shapes, blurred boundaries, and large variations in scale are difficult to identify visually. Convolutional Neural Networks (CNNs) extract local features rich in spatial information, whereas Transformers capture global context well but retain fewer spatial details. To make small or blurred skin lesions easier to discriminate, we propose a Bi-directionally Fused Boundary Aware Network (BiFBA-Net). To exploit the complementary features produced by CNNs and Transformers, we design a dual-encoder structure. Unlike existing dual encoders, our method uses a Bi-directional Attention Gate (Bi-AG) with two inputs and two outputs for crosswise feature fusion. The Bi-AG accepts features from both the CNN encoder and the Transformer encoder, and its two attention gates generate two attention outputs that are sent back to the two encoders. In this way, the CNN and Transformer encoders exchange multi-scale information bi-directionally through attention. To restore feature maps, we propose a progressive, boundary-aware decoding structure containing three decoders trained with six supervised losses. The first decoder is a CNN that produces more spatial details. The second is a Partial Decoder (PD) that aggregates high-level features carrying more semantics. The third is a Boundary Aware Decoder (BAD), proposed to progressively improve boundary accuracy. The BAD uses residual structures and Reverse Attention (RA) at different scales to mine structural and spatial details for refining lesion boundaries. Extensive experiments on public datasets show that BiFBA-Net achieves higher segmentation accuracy and much better boundary perception than the compared methods. It also alleviates both over-segmentation of small lesions and under-segmentation of large ones.
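The abstract does not give formulas for the Bi-AG or for Reverse Attention, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes an additive attention gate with scalar weights (the names `attention_gate`, `bi_directional_gate`, `w_x`, `w_g`, and `psi` are hypothetical), and it uses a commonly seen form of reverse attention in which features are weighted by one minus the sigmoid of a coarse prediction. Which gated feature is routed to which encoder is also an assumption here. Feature maps are plain nested lists so that the example needs only the standard library.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def attention_gate(x, g, w_x=1.0, w_g=1.0, psi=1.0, bias=0.0):
    """Gate feature map x using guidance g (assumed additive attention).

    x and g are 2-D lists of equal shape; the weights are scalars
    purely for illustration (a real model would use learned convolutions).
    """
    gated = []
    for row_x, row_g in zip(x, g):
        out_row = []
        for xv, gv in zip(row_x, row_g):
            hidden = max(0.0, w_x * xv + w_g * gv + bias)  # ReLU
            alpha = sigmoid(psi * hidden)  # attention coefficient in (0, 1)
            out_row.append(xv * alpha)
        gated.append(out_row)
    return gated


def bi_directional_gate(cnn_feat, trans_feat):
    """Two inputs, two outputs: each branch receives the other's gated features.

    The routing below is an assumption; the abstract only says that two
    attention outputs are sent back to the two encoders.
    """
    to_cnn = attention_gate(trans_feat, cnn_feat)    # for the CNN encoder
    to_trans = attention_gate(cnn_feat, trans_feat)  # for the Transformer encoder
    return to_cnn, to_trans


def reverse_attention(feat, coarse_logit):
    """Weight features by 1 - sigmoid(coarse prediction).

    Confidently predicted foreground is suppressed, so the remaining
    response concentrates on uncertain regions such as boundaries.
    """
    return [
        [fv * (1.0 - sigmoid(lv)) for fv, lv in zip(row_f, row_l)]
        for row_f, row_l in zip(feat, coarse_logit)
    ]


if __name__ == "__main__":
    cnn = [[1.0, 0.0], [0.5, 2.0]]
    trans = [[0.0, 1.0], [0.5, -3.0]]
    to_cnn, to_trans = bi_directional_gate(cnn, trans)
    print("to CNN encoder:        ", to_cnn)
    print("to Transformer encoder:", to_trans)
    print("reverse attention:     ", reverse_attention([[2.0, 2.0]], [[0.0, 10.0]]))
```

In this sketch a coarse logit of 0 (maximal uncertainty) keeps half of the feature value, while a large positive logit drives the weight toward zero, which is the behaviour that lets a boundary-focused decoder concentrate on regions the coarse prediction has not yet resolved.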
