Feature distance-weighted adaptive decoupled knowledge distillation for medical image segmentation.

IF 2.3 · CAS Tier 3 (Medicine) · JCR Q3 · ENGINEERING, BIOMEDICAL
Xiangchun Yu, Ziyun Xiong, Miaomiao Liang, Lingjuan Yu, Jian Zheng
DOI: 10.1007/s11548-025-03346-9 · International Journal of Computer Assisted Radiology and Surgery · Journal Article · Published 2025-04-22
Citations: 0

Abstract

Feature distance-weighted adaptive decoupled knowledge distillation for medical image segmentation.

Purpose: This paper aims to apply decoupled knowledge distillation (DKD) to medical image segmentation, focusing on transferring knowledge from a high-performance teacher network to a lightweight student network, thereby facilitating model deployment on embedded devices.

Methods: We initially decouple the distillation loss into pixel-wise target class knowledge distillation (PTCKD) and pixel-wise non-target class knowledge distillation (PNCKD). Subsequently, to address the limitations of the fixed weight paradigm in PTCKD, we propose a novel feature distance-weighted adaptive decoupled knowledge distillation (FDWA-DKD) method. FDWA-DKD quantifies the feature disparity between student and teacher, generating instance-level adaptive weights for PTCKD. We design a feature distance weighting (FDW) module that dynamically calculates feature distance to obtain adaptive weights, integrating feature space distance information into logit distillation. Lastly, we introduce a class-wise feature probability distribution loss to encourage the student to mimic the teacher's spatial distribution.
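The abstract does not give the exact formulation, so the following is a minimal NumPy sketch of a decoupled, feature-distance-weighted distillation loss along the lines described: PTCKD as a binary target/non-target KL per pixel, PNCKD as a KL over the renormalized non-target classes, and an instance-level weight derived from the student-teacher feature distance. All names and the weight normalization (`fdwa_dkd_loss`, `alpha`) are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def fdwa_dkd_loss(student_logits, teacher_logits,
                  student_feat, teacher_feat, target, beta=1.0):
    """Sketch of a feature-distance-weighted decoupled KD loss.

    student_logits, teacher_logits: (P, C) per-pixel class logits
    student_feat, teacher_feat:     (P, D) per-pixel feature vectors
    target:                         (P,) integer ground-truth labels
    """
    eps = 1e-8
    P, C = student_logits.shape
    ps, pt = softmax(student_logits), softmax(teacher_logits)

    tgt_mask = np.zeros((P, C), dtype=bool)
    tgt_mask[np.arange(P), target] = True

    # PTCKD: binary KL between (target, non-target) probability pairs.
    ps_t, pt_t = ps[tgt_mask], pt[tgt_mask]
    ps_nt, pt_nt = 1.0 - ps_t, 1.0 - pt_t
    ptckd = (pt_t * np.log((pt_t + eps) / (ps_t + eps))
             + pt_nt * np.log((pt_nt + eps) / (ps_nt + eps)))

    # PNCKD: KL over non-target classes, renormalized without the target.
    ps_hat = softmax(np.where(tgt_mask, -np.inf, student_logits))
    pt_hat = softmax(np.where(tgt_mask, -np.inf, teacher_logits))
    pnckd = np.where(tgt_mask, 0.0,
                     pt_hat * np.log((pt_hat + eps) / (ps_hat + eps))).sum(axis=1)

    # FDW: instance-level adaptive weight from student-teacher feature
    # distance (hypothetical normalization by the batch mean distance).
    dist = np.linalg.norm(student_feat - teacher_feat, axis=1)
    alpha = dist / (dist.mean() + eps)

    return (alpha * ptckd + beta * pnckd).mean()
```

When student and teacher agree exactly, both KL terms and the feature distance vanish, so the loss is zero; a larger feature gap on a pixel up-weights its target-class term, which is the adaptive behavior the Methods paragraph describes.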

Results: Extensive experiments conducted on the Synapse and FLARE22 datasets demonstrate that our proposed FDWA-DKD achieves satisfactory performance, yielding optimal Dice scores and, in some instances, surpassing the performance of the teacher network. Ablation studies further validate the effectiveness of each module within our proposed method.

Conclusion: Our method overcomes the constraints of traditional distillation methods by offering instance-level adaptive learning weights tailored to PTCKD. By quantifying student-teacher feature disparity and minimizing class-wise feature probability distribution loss, our method outperforms other distillation methods.

Source journal
International Journal of Computer Assisted Radiology and Surgery
Categories: ENGINEERING, BIOMEDICAL · RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING
CiteScore: 5.90
Self-citation rate: 6.70%
Articles per year: 243
Review time: 6-12 weeks
About the journal: The International Journal for Computer Assisted Radiology and Surgery (IJCARS) is a peer-reviewed journal that provides a platform for closing the gap between medical and technical disciplines, and encourages interdisciplinary research and development activities in an international environment.