MSDUNet: A Model Based on Feature Multi-Scale and Dual-Input Dynamic Enhancement for Skin Lesion Segmentation
Authors: Xiaosen Li; Linli Li; Xinlong Xing; Huixian Liao; Wenji Wang; Qiutong Dong; Xiao Qin; Chang’an Yuan
Journal: IEEE Transactions on Medical Imaging, vol. 44, no. 7, pp. 2819-2830
Published: 2025-03-07
DOI: 10.1109/TMI.2025.3549011
URL: https://ieeexplore.ieee.org/document/10916741/
Melanoma is a malignant tumor that originates from lesions of skin cells. Segmenting skin lesions in medical images plays a crucial role in quantitative analysis, yet achieving precise and efficient segmentation remains a significant challenge for medical practitioners. We therefore propose MSDUNet, a skin lesion segmentation model that incorporates a multi-scale deformable block (MSD Block) and a dual-input dynamic enhancement module (D2M). First, the model employs a hybrid-architecture encoder that better integrates global and local features. Second, to better exploit macroscopic and microscopic multi-scale information, the skip connections and decoder blocks are improved by introducing the D2M and the MSD Block. The D2M applies large-kernel dilated convolution to the decoder features to derive an attention bias matrix, which supplements and enhances the semantic features of the decoder’s lower layers that are transmitted through the skip connections, thereby compensating for semantic gaps. The MSD Block uses channel-wise splitting and deformable convolutions with varying receptive fields to extract and integrate multi-scale information while controlling model size, enabling the decoder to focus more on task-relevant regions and edge details. MSDUNet achieves Dice scores of 93.08% and 91.68% on the ISIC-2016 and ISIC-2018 datasets, respectively, and 95.40% on the HAM10000 dataset. In external validation on the PH2 dataset, the weights trained on ISIC-2016, ISIC-2018, and HAM10000 yield Dice scores of 92.67%, 92.31%, and 93.46%, respectively, indicating that MSDUNet generalizes well. Our code implementation is publicly available on GitHub.
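All of the results above are reported as Dice scores, which the abstract does not define. As a reminder of what the metric measures, the following is a minimal sketch of the standard Dice coefficient, 2|P ∩ T| / (|P| + |T|), for a predicted binary mask P and a ground-truth mask T. The function name `dice_score`, the smoothing term `eps`, and the toy masks are illustrative assumptions, not the authors' evaluation code; in particular, how the paper aggregates scores across images is not stated here.

```python
from typing import Sequence


def dice_score(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
    eps: float = 1e-7,
) -> float:
    """Dice coefficient 2|P & T| / (|P| + |T|) for two binary masks of equal shape."""
    if len(pred) != len(target) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, target)
    ):
        raise ValueError("masks must have the same shape")

    intersection = 0
    pred_area = 0
    target_area = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p, t = int(bool(p)), int(bool(t))  # treat any non-zero value as lesion
            intersection += p * t
            pred_area += p
            target_area += t

    # eps keeps the ratio defined when both masks are empty
    # (assumed convention: two empty masks score 1.0).
    return (2 * intersection + eps) / (pred_area + target_area + eps)


# Toy 4x4 masks with made-up values (not data from the paper).
predicted = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
ground_truth = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 0],
]

toy_dice = dice_score(predicted, ground_truth)
print(f"Dice = {toy_dice:.2%}")  # 5 shared pixels, 6 + 6 total: prints "Dice = 83.33%"
```

In this toy case a single misplaced pixel on each side of a six-pixel lesion already lowers the score to about 83%, which gives some intuition for how closely the predicted and reference masks must agree to reach scores above 90%.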