A deep feature refinement self-supervised learning algorithm for medical image annotation

IF 3.5 | Zone 2 (Computer Science) | JCR Q2 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)

Applied Intelligence, vol. 55, no. 13 | Pub Date: 2025-08-18 | DOI: 10.1007/s10489-025-06737-2

Jiyong Zhang, Deguang Li, Yan Wu, Zhengwei Zhao, Yanlei Wang, Yang Li, Binqing Zhang


Citations: 0

Abstract



Biomedical image segmentation models rely heavily on large-scale annotated data for training, yet manual annotation is notoriously labor-intensive, error-prone, and cost-prohibitive, especially in medical domains that require expert knowledge. To address this limitation, we propose a self-supervised learning (SSL) framework that leverages unlabeled data to automatically extract discriminative features, thereby reducing dependence on human annotations. In this study, self-supervision refers to a learning paradigm in which supervisory signals are generated directly from the data itself (e.g., spatial context, channel correlations) without external labels. Our goal is to design an SSL method tailored to medical image annotation tasks that yields robust feature representations even with limited labeled data. To this end, this paper introduces a novel self-supervised learning algorithm for refining deep features in medical image annotation tasks. Spatial and channel attention blocks focus the method on intricate feature details within medical images: the spatial attention component enables the network to selectively attend to relevant regions, while the channel attention mechanism fine-tunes feature maps for improved annotation accuracy. Both strengthen the model’s ability to capture fine-grained information in medical images. To verify the effectiveness of the proposed model, we conducted extensive experiments on four benchmark datasets. The results show that our approach achieves competitive performance compared with other state-of-the-art annotation methods. On the KDSB18 (20%) dataset, the values of Precision, Dice, and mIoU are 0.964, 0.888, and 0.880 without the Barlow Twins (BT) strategy, and 0.965, 0.888, and 0.880 with it. On the BUSIS dataset with 20% labeled data, the proposed framework achieves a Dice score of 0.861 and an mIoU of 0.869, surpassing the baseline U-Net by 36.5% and 23.4%, respectively. For BraTS18 brain tumor segmentation under 10% supervision, our method attains a boundary localization accuracy (Dice) of 0.853, outperforming state-of-the-art models (e.g., RCA-IUNet) by 3.7%. The model integrates spatial and channel attention, spatial information compression, and dilated convolutions. By leveraging a self-supervised pre-training network with the BT strategy, the model optimizes its parameters for improved accuracy and stability on testing data. Experimental results on four datasets demonstrate that our framework consistently improves Dice scores by 12.8–29.8% compared to vanilla self-supervised methods (e.g., Barlow Twins) on medical image segmentation tasks with ≤ 20% annotations. The proposed lesion-aware contrastive loss reduces false positives by 18.5% (from 0.23 to 0.19) in small lesion detection, as validated on the ISIC18 dataset. In summary, the proposed model shows competitive annotation performance compared with other models across multiple datasets, and leaves room for future study on improving accuracy, refining the attention mechanisms, and deploying on resource-constrained platforms.
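The abstract reports Precision, Dice, and mIoU without defining them. The following is a minimal sketch of the standard definitions for binary segmentation masks, not the authors' evaluation code. The function names are hypothetical, masks are assumed to be flattened into sequences of 0/1 values, and mIoU is assumed here to be the mean of the foreground and background IoU; the paper may average differently (for example, per image or over more classes).

```python
from typing import Sequence, Tuple


def confusion_counts(pred: Sequence[int], target: Sequence[int]) -> Tuple[int, int, int]:
    """Return (true positives, false positives, false negatives) for flat binary masks."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    tp = sum(1 for p, t in zip(pred, target) if p and t)
    fp = sum(1 for p, t in zip(pred, target) if p and not t)
    fn = sum(1 for p, t in zip(pred, target) if not p and t)
    return tp, fp, fn


def _ratio(numerator: float, denominator: float) -> float:
    # Assumption: an empty denominator (nothing predicted, nothing to find)
    # is scored as a perfect 1.0; other conventions exist.
    return 1.0 if denominator == 0 else numerator / denominator


def precision(pred: Sequence[int], target: Sequence[int]) -> float:
    """Precision = TP / (TP + FP)."""
    tp, fp, _ = confusion_counts(pred, target)
    return _ratio(tp, tp + fp)


def dice_score(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice = 2 TP / (2 TP + FP + FN)."""
    tp, fp, fn = confusion_counts(pred, target)
    return _ratio(2 * tp, 2 * tp + fp + fn)


def iou(pred: Sequence[int], target: Sequence[int]) -> float:
    """Intersection over union = TP / (TP + FP + FN)."""
    tp, fp, fn = confusion_counts(pred, target)
    return _ratio(tp, tp + fp + fn)


def mean_iou(pred: Sequence[int], target: Sequence[int]) -> float:
    """Mean of foreground IoU and background IoU (two-class assumption)."""
    pred_bg = [1 - int(bool(p)) for p in pred]
    target_bg = [1 - int(bool(t)) for t in target]
    return 0.5 * (iou(pred, target) + iou(pred_bg, target_bg))


if __name__ == "__main__":
    # Toy masks, invented for illustration only.
    predicted = [1, 1, 0, 0, 1, 0]
    reference = [1, 0, 0, 0, 1, 1]
    print(precision(predicted, reference))
    print(dice_score(predicted, reference))
    print(mean_iou(predicted, reference))
```

Because the background class enters the two-class average, mIoU computed this way can exceed Dice when the foreground is small, which is one possible reading of results such as Dice 0.861 alongside mIoU 0.869.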


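Source journal

Applied Intelligence (Engineering & Technology, Computer Science: Artificial Intelligence)

CiteScore: 6.60

Self-citation rate: 20.80%

Articles published: 1361

Review time: 5.9 months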

Journal description: With a focus on research in artificial intelligence and neural networks, this journal addresses issues involving solutions of real-life manufacturing, defense, management, government and industrial problems which are too complex to be solved through conventional approaches and require the simulation of intelligent thought processes, heuristics, applications of knowledge, and distributed and parallel processing. The integration of these multiple approaches in solving complex problems is of particular importance. The journal presents new and original research and technological developments, addressing real and complex issues applicable to difficult problems. It provides a medium for exchanging scientific research and technological achievements accomplished by the international community.