Anwar Jimi, Nabila Zrira, Oumaima Guendoul, Ibtissam Benmiloud, Haris Ahmad Khan, Shah Nawaz
Intelligence-Based Medicine, Volume 12, Article 100257, 2025. DOI: 10.1016/j.ibmed.2025.100257. https://www.sciencedirect.com/science/article/pii/S2666521225000614
ESC-UNET: A hybrid CNN and Swin Transformers for skin lesion segmentation
Automatic segmentation of skin lesions is one of the most important tasks in computer-aided diagnostics and plays an essential role in the early diagnosis and treatment of skin cancer. In recent years, Convolutional Neural Networks (CNNs) have largely replaced traditional methods for segmenting skin lesions. However, skin lesion segmentation remains challenging because the available information is often insufficient and lesion regions are not clearly delineated. In this paper, we propose a novel deep medical image segmentation approach named “ESC-UNET”, which combines the advantages of CNNs and Transformers to leverage both local information and long-range dependencies. For local information, we use a CNN-based encoder-decoder framework: the CNN branch extracts local features from medical images by exploiting the locality of convolution and a pre-trained EfficientNetB5 network. For long-range dependencies, we build a Transformer branch that emphasizes global context. In addition, we employ Atrous Spatial Pyramid Pooling (ASPP) to gather relevant information across the network, and we add the Convolutional Block Attention Module (CBAM) to promote effective features and suppress ineffective ones during segmentation. We evaluated our network on the ISIC 2016, ISIC 2017, and ISIC 2018 datasets. The results demonstrate the efficiency of the proposed model in segmenting skin lesions.
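To make the ASPP idea concrete, the following is a minimal one-dimensional sketch in plain Python, not the authors' implementation. It applies one kernel at several atrous (dilation) rates in parallel and stacks the branch outputs per position. The function names, the rates (1, 2, 4), and the shared kernel are assumptions chosen for illustration. A real ASPP module works on 2D feature maps, learns separate weights per branch, and typically fuses the branches with a further convolution.

```python
from typing import List, Sequence


def effective_kernel_size(kernel_size: int, rate: int) -> int:
    """Span covered by a kernel whose taps sit `rate` samples apart."""
    return kernel_size + (kernel_size - 1) * (rate - 1)


def atrous_conv1d(signal: Sequence[float], kernel: Sequence[float],
                  rate: int) -> List[float]:
    """Same-size 1D atrous (dilated) convolution with zero padding.

    Computed as cross-correlation, as deep-learning libraries usually do.
    Assumes an odd-length kernel so it can be centred on each sample.
    """
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = i + (k - half) * rate  # tap position, spread out by `rate`
            if 0 <= j < len(signal):   # taps outside the signal count as zero
                acc += weight * signal[j]
        out.append(acc)
    return out


def aspp1d(signal: Sequence[float], kernel: Sequence[float],
           rates: Sequence[int] = (1, 2, 4)) -> List[List[float]]:
    """Hypothetical ASPP-style block: one branch per rate, stacked per position.

    Simplification: every branch reuses the same kernel, and no fusion
    layer follows the stacking step.
    """
    branches = [atrous_conv1d(signal, kernel, r) for r in rates]
    return [list(values) for values in zip(*branches)]


if __name__ == "__main__":
    impulse = [0.0] * 4 + [1.0] + [0.0] * 4
    for row in aspp1d(impulse, [1.0, 1.0, 1.0]):
        print(row)
```

With a unit impulse and an all-ones 3-tap kernel, the three branches respond at offsets of 1, 2, and 4 samples from the impulse. These offsets correspond to effective kernel sizes of 3, 5, and 9, so larger rates widen the context seen at each position without adding weights.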