Skin lesion classification with multi-level fusion of Swin-T and ConvNeXt
Authors: Zetong Wang, Junhua Zhang, Xiao Wang
Journal: 生物医学工程学杂志 (Journal of Biomedical Engineering), vol. 41, no. 3, pp. 544-551
Published: 2024-06-25
DOI: https://doi.org/10.7507/1001-5515.202305025
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11208655/pdf/
Citation count: 0
Abstract
Skin cancer is a significant public health issue, and computer-aided diagnosis can help alleviate this burden. Accurate identification of skin lesion types is crucial for such diagnosis. This study proposes a multi-level attention cascaded fusion model based on Swin-T and ConvNeXt. The model uses hierarchical Swin-T and ConvNeXt backbones to extract global and local features, respectively, and introduces residual channel attention and spatial attention modules for further feature extraction. Multi-level attention mechanisms process the multi-scale global and local features. Because shallow features lie far from the classifier and tend to be lost, a hierarchical inverted residual fusion module is proposed to dynamically adjust the extracted feature information. Balanced sampling and focal loss are used to address the class imbalance among skin lesion categories. On the ISIC2018 dataset, the model achieved accuracy, precision, recall, and F1-score of 96.01%, 93.67%, 92.65%, and 93.11%, respectively; on ISIC2019, the corresponding values were 92.79%, 91.52%, 88.90%, and 90.15%. Compared with Swin-T, the proposed method improved accuracy by 3.60% and 1.66% on the two datasets, respectively; compared with ConvNeXt, it improved accuracy by 2.87% and 3.45%. The experiments demonstrate that the proposed method accurately classifies skin lesion images, providing a new solution for skin cancer diagnosis.
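The abstract names balanced sampling and focal loss as its remedies for class imbalance but gives no formulas or settings. The following is a minimal sketch of the standard forms of both techniques, not the authors' implementation. The function names, the `gamma=2.0` and `alpha=1.0` defaults, and the inverse-frequency weighting scheme are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def softmax(logits):
    """Numerically stable softmax over a list of raw class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0, alpha=1.0):
    """Standard focal loss for one sample: -alpha * (1 - p_t)^gamma * log(p_t).

    gamma and alpha are assumed defaults, not values from the paper.
    With gamma = 0 and alpha = 1 this reduces to plain cross-entropy.
    """
    p_t = softmax(logits)[target]
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


def balanced_sample_weights(labels):
    """Per-sample sampling probabilities that give every class equal total mass.

    Hypothetical helper: each sample of class c gets weight 1 / (K * n_c),
    where K is the number of classes and n_c the count of class c.
    """
    counts = Counter(labels)
    num_classes = len(counts)
    return [1.0 / (num_classes * counts[y]) for y in labels]


if __name__ == "__main__":
    # A confidently correct prediction is down-weighted far more than a wrong one.
    print(focal_loss([5.0, 0.0], target=0))
    print(focal_loss([0.0, 5.0], target=0))
    # Three samples of class 0 and one of class 1: the rare class is drawn more often.
    print(balanced_sample_weights([0, 0, 0, 1]))
```

The factor (1 - p_t)^gamma shrinks the loss on samples the model already classifies confidently, so training concentrates on hard and rare lesion classes. The sampling weights work on the data side instead, making each class equally likely to be drawn for a batch.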