Shun Zou, Mingya Zhang, Bingjian Fan, Zhengyi Zhou, Xiuguo Zou
arXiv - EE - Image and Video Processing, 2024-09-17. DOI: arxiv-2409.10890
SkinMamba: A Precision Skin Lesion Segmentation Architecture with Cross-Scale Global State Modeling and Frequency Boundary Guidance
Skin lesion segmentation is a crucial method for identifying early skin
cancer. In recent years, methods based on convolutional neural networks (CNNs)
and Transformers have been widely applied. Combining CNNs and
Transformers effectively integrates global and local relationships, but remains
limited by the quadratic complexity of the Transformer. To address this, we propose
a hybrid architecture based on Mamba and CNN, called SkinMamba. It maintains
linear complexity while offering powerful long-range dependency modeling and
local feature extraction capabilities. Specifically, we introduce the Scale
Residual State Space Block (SRSSB), which captures global contextual
relationships and cross-scale information exchange at a macro level, enabling
expert communication in a global state. This effectively addresses challenges
in skin lesion segmentation related to varying lesion sizes and inconspicuous
target areas. Additionally, to mitigate boundary blurring and information loss
during model downsampling, we introduce the Frequency Boundary Guided Module
(FBGM), which provides boundary priors to guide precise boundary
segmentation and uses the retained information to assist the decoder.
Finally, we conduct comparative and ablation
experiments on two public lesion segmentation datasets (ISIC2017 and ISIC2018),
and the results show that SkinMamba is highly competitive in skin
lesion segmentation tasks. The code is available at
https://github.com/zs1314/SkinMamba.
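
To illustrate the linear-complexity claim, the following is a minimal sketch of a discrete state-space recurrence of the kind that Mamba-style models build on. It is not the SRSSB or any part of the SkinMamba code: the function names `ssm_scan` and `ssm_pairwise` and the fixed scalar parameters `a`, `b`, `c` are assumptions made for illustration, whereas Mamba itself uses input-dependent parameters and multi-dimensional states. The sketch only shows that a single pass over the sequence can produce the same output as an explicit sum over all pairs of positions.

```python
from typing import List


def ssm_scan(x: List[float], a: float, b: float, c: float) -> List[float]:
    """Recurrence h_t = a*h_{t-1} + b*x_t, y_t = c*h_t.

    One pass over the sequence, so the cost grows linearly with its length.
    Parameters a, b, c are fixed scalars here (an illustrative assumption).
    """
    h = 0.0
    y = []
    for x_t in x:
        h = a * h + b * x_t  # the state summarises all earlier inputs
        y.append(c * h)
    return y


def ssm_pairwise(x: List[float], a: float, b: float, c: float) -> List[float]:
    """Same output written as y_t = sum_{k<=t} c * a**(t-k) * b * x_k.

    Every output position visits every earlier position, so the cost grows
    quadratically with sequence length. Included for comparison only.
    """
    y = []
    for t in range(len(x)):
        y.append(sum(c * (a ** (t - k)) * b * x[k] for k in range(t + 1)))
    return y


if __name__ == "__main__":
    seq = [1.0, 0.0, 0.0, 2.0]
    print(ssm_scan(seq, a=0.5, b=1.0, c=1.0))      # [1.0, 0.5, 0.25, 2.125]
    print(ssm_pairwise(seq, a=0.5, b=1.0, c=1.0))  # same values
```

Both functions return the same sequence. The scan reaches it with one state update per position, which is the property that lets state-space models relate distant positions without forming all pairwise interactions.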