A deep feature refinement self-supervised learning algorithm for medical image annotation
Jiyong Zhang, Deguang Li, Yan Wu, Zhengwei Zhao, Yanlei Wang, Yang Li, Binqing Zhang
Applied Intelligence, vol. 55, no. 13, published 2025-08-18. DOI: 10.1007/s10489-025-06737-2
Abstract
Biomedical image segmentation models heavily rely on large-scale annotated data for training, yet manual annotation is notoriously labor-intensive, error-prone, and cost-prohibitive, especially in medical domains requiring expert knowledge. To address this limitation, we propose a self-supervised learning (SSL) framework that leverages unlabeled data to automatically extract discriminative features, thereby reducing dependence on human annotations. In this study, self-supervision refers to a learning paradigm where supervisory signals are generated directly from the data itself (e.g., spatial context, channel correlations) without external labels. Our goal is to design an SSL method tailored for medical image annotation tasks, enabling robust feature representation even with limited labeled data. This paper introduces a novel self-supervised learning algorithm for refining deep features in the context of medical image annotation tasks. By leveraging the self-supervised ability to learn from unlabeled data, the proposed approach aims to enhance feature representation. With the help of spatial and channel attention blocks, our method focuses on intricate feature details within medical images. The spatial attention component enables the network to selectively attend to relevant regions, while the channel attention mechanism fine-tunes feature maps for improved annotation accuracy. Both strengthen the model's ability to capture intricate details and fine-grained information in medical images. To verify the effectiveness of the proposed model, we conducted extensive experiments on four benchmark datasets. The experimental results show that our approach achieves competitive performance compared with other state-of-the-art annotation methods. On the KDSB18 (20%) dataset, the Precision, Dice, and mIoU values are 0.964, 0.888, and 0.880 without the Barlow Twins (BT) strategy, and 0.965, 0.888, and 0.880 with it. On the BUSIS dataset with 20% labeled data, the proposed framework achieves a Dice score of 0.861 and an mIoU of 0.869, surpassing the baseline U-Net by 36.5% and 23.4%, respectively. For BraTS18 brain tumor segmentation under 10% supervision, our method attains a boundary localization accuracy (Dice) of 0.853, outperforming state-of-the-art models (e.g., RCA-IUNet) by 3.7%. This study develops a novel model that integrates spatial and channel attention, spatial information compression, and dilated convolutions. By leveraging a self-supervised pre-training network with the BT strategy, the model optimizes its parameters for improved accuracy and stability on test data. Experimental results on four datasets demonstrate that our framework consistently improves Dice scores by 12.8–29.8% compared with vanilla self-supervised methods (e.g., Barlow Twins) on medical image segmentation tasks with ≤ 20% annotations. The proposed lesion-aware contrastive loss reduces false positives by 18.5% (from 0.23 to 0.19) in small lesion detection, as validated on the ISIC18 dataset. In summary, the proposed model shows competitive annotation performance compared with other models across multiple datasets, with potential for further work on accuracy, attention mechanisms, and deployment on resource-constrained platforms.
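As a rough illustration of the building blocks named in the abstract, the following PyTorch sketch shows a CBAM-style channel/spatial attention pair and a Barlow Twins redundancy-reduction loss. This is not the authors' implementation; the module names, reduction ratio, kernel size, and the lambda weight are assumptions chosen for illustration only.

```python
# Hypothetical sketch of spatial/channel attention and a Barlow Twins loss.
# Assumes PyTorch; hyperparameters and names are illustrative, not the paper's.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ChannelAttention(nn.Module):
    """Re-weights feature channels using pooled global descriptors."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        avg = self.mlp(F.adaptive_avg_pool2d(x, 1).view(b, c))
        mx = self.mlp(F.adaptive_max_pool2d(x, 1).view(b, c))
        scale = torch.sigmoid(avg + mx).view(b, c, 1, 1)
        return x * scale


class SpatialAttention(nn.Module):
    """Highlights informative spatial locations with a single-channel mask."""

    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = x.mean(dim=1, keepdim=True)       # channel-wise average map
        mx, _ = x.max(dim=1, keepdim=True)      # channel-wise max map
        mask = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * mask


def barlow_twins_loss(z1: torch.Tensor, z2: torch.Tensor,
                      lam: float = 5e-3) -> torch.Tensor:
    """Decorrelates embeddings of two augmented views (Zbontar et al., 2021)."""
    n, _ = z1.shape
    z1 = (z1 - z1.mean(0)) / (z1.std(0) + 1e-6)
    z2 = (z2 - z2.mean(0)) / (z2.std(0) + 1e-6)
    c = (z1.T @ z2) / n                         # d x d cross-correlation matrix
    on_diag = (torch.diagonal(c) - 1).pow(2).sum()
    off_diag = (c - torch.diag_embed(torch.diagonal(c))).pow(2).sum()
    return on_diag + lam * off_diag
```

In a U-Net-style encoder such attention blocks would typically be applied in sequence to each feature map, while a loss of the Barlow Twins kind would drive the self-supervised pre-training stage on two augmented views of the same unlabeled image before fine-tuning on the small labeled subset.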
Journal Introduction:
With a focus on research in artificial intelligence and neural networks, this journal addresses issues involving solutions of real-life manufacturing, defense, management, government and industrial problems which are too complex to be solved through conventional approaches and require the simulation of intelligent thought processes, heuristics, applications of knowledge, and distributed and parallel processing. The integration of these multiple approaches in solving complex problems is of particular importance.
The journal presents new and original research and technological developments, addressing real and complex issues applicable to difficult problems. It provides a medium for exchanging scientific research and technological achievements accomplished by the international community.