SET: Superpixel Embedded Transformer for skin lesion segmentation
Authors: Zhonghua Wang, Junyan Lyu, Xiaoying Tang
Journal: Medical Image Analysis, volume 105, Article 103738 (JCR Q1, Computer Science, Artificial Intelligence; impact factor 11.8)
Published: 2025-07-24
DOI: 10.1016/j.media.2025.103738
URL: https://www.sciencedirect.com/science/article/pii/S1361841525002853
Accurate skin lesion segmentation is crucial for the early detection and treatment of skin cancer. Despite significant advances in deep learning, current segmentation methods often struggle to fully capture global contextual information and maintain the structural integrity of skin lesions. To address these challenges, this paper introduces the Superpixel Embedded Transformer (SET), which integrates superpixels into the Transformer framework for skin lesion segmentation. Instead of embedding non-overlapping patches as tokens, SET employs an Association Embedded Merging & Dispatching (AEM&D) module to treat superpixels as the fundamental units during both the down-sampling and up-sampling phases. To better capture multi-scale lesion information, we propose a superpixel bank that stores multiple superpixel maps with distinct compactness values. An Ensemble Fusion and Refinery (EFR) module then fuses and refines the results obtained from each map in the superpixel bank. Using several superpixel maps lets the model selectively focus on different features, which improves segmentation performance. Extensive experiments are conducted on multiple skin lesion segmentation datasets, including ISIC 2016, ISIC 2017, and ISIC 2018. Comparative analyses with state-of-the-art methods show SET's superior performance, and ablation studies confirm the effectiveness of the proposed modules that incorporate superpixels into the Vision Transformer. The source code of SET will be available at https://github.com/Wzhjerry/SET.
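The idea of treating superpixels as tokens, then fusing results across a bank of superpixel maps, can be illustrated with a minimal sketch. It assumes hard pixel-to-superpixel assignments and plain averaging, which is a deliberate simplification and not the paper's AEM&D or EFR modules. The function names `merge`, `dispatch`, and `fuse`, and the hand-written label maps, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def merge(features, labels):
    """Pool per-pixel feature vectors into one mean vector per superpixel.

    Assumption: each pixel belongs to exactly one superpixel (hard assignment).
    """
    sums = {}
    counts = defaultdict(int)
    for feat, lab in zip(features, labels):
        acc = sums.setdefault(lab, [0.0] * len(feat))
        for i, value in enumerate(feat):
            acc[i] += value
        counts[lab] += 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}


def dispatch(tokens, labels):
    """Copy each superpixel token back to every pixel that belongs to it."""
    return [list(tokens[lab]) for lab in labels]


def fuse(per_map_outputs):
    """Average the per-pixel outputs obtained from several superpixel maps.

    Assumption: simple averaging stands in for a learned fusion step.
    """
    n_maps = len(per_map_outputs)
    n_pixels = len(per_map_outputs[0])
    dim = len(per_map_outputs[0][0])
    return [
        [sum(out[p][d] for out in per_map_outputs) / n_maps for d in range(dim)]
        for p in range(n_pixels)
    ]


# Four pixels with 1-D features, flattened into a list.
features = [[1.0], [3.0], [5.0], [7.0]]

# Hypothetical "bank" of two label maps with different granularity.
bank = [
    [0, 0, 1, 1],  # coarser map: two superpixels
    [0, 1, 1, 2],  # finer map: three superpixels
]

# Merge to superpixel tokens, dispatch back to pixels, once per map.
outputs = [dispatch(merge(features, labels), labels) for labels in bank]
fused = fuse(outputs)
print(fused)
```

In practice the label maps would come from a superpixel algorithm run with different compactness settings, and the tokens would pass through Transformer layers between the merge and dispatch steps; both are omitted here to keep the sketch self-contained.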
About the journal:
Medical Image Analysis serves as a platform for sharing new research findings in the realm of medical and biological image analysis, with a focus on applications of computer vision, virtual reality, and robotics to biomedical imaging challenges. The journal prioritizes the publication of high-quality, original papers contributing to the fundamental science of processing, analyzing, and utilizing medical and biological images. It welcomes approaches utilizing biomedical image datasets across all spatial scales, from molecular/cellular imaging to tissue/organ imaging.