SRMA-KD: Structured relational multi-scale attention knowledge distillation for effective lightweight cardiac image segmentation

IF 4.2 | Zone 3, Computer Science | Q2, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Image and Vision Computing | Pub Date: 2025-04-30 | DOI: 10.1016/j.imavis.2025.105577

Bo Chen, Youhao Huang, Yufan Liu, Dong Sui, Fei Yang

Volume 159, Article 105577. Open access: no. Publisher page: https://www.sciencedirect.com/science/article/pii/S0262885625001659

Citations: 0


SRMA-KD: Structured relational multi-scale attention knowledge distillation for effective lightweight cardiac image segmentation

Cardiac image segmentation is essential for accurately extracting structural information of the heart, aiding in precise diagnosis and personalized treatment planning. However, real-time segmentation on medical devices demands computational efficiency that often conflicts with the intensive processing and storage requirements of deep learning algorithms. These algorithms are frequently hindered by their complex models and extensive parameter sets, which limit their feasibility in clinical settings with constrained resources. Meanwhile, the performance of lightweight heart segmentation models still requires enhancement. This study introduces the SRMA-KD framework, a knowledge distillation approach for cardiac image segmentation designed to achieve high accuracy with lightweight models. The framework efficiently transfers semantic feature information and structural knowledge from a teacher model to a student model, ensuring effective segmentation within clinical resource limitations. The SRMA-KD framework includes three key modules: the Global Structural Relational Block (GSRB), the Multi-scale Feature Attention Block (MFAB), and the Prediction Difference Transfer Block (PDTB). The GSRB correlates the outputs of the teacher and student networks with the ground truth, transferring structural correlations to enhance the student network's global feature learning. The MFAB enables the student network to learn multi-scale feature extraction from the teacher network, focusing on relevant semantic regions. The PDTB minimizes pixel-level differences between the segmentation images of the teacher and student networks. Our experiments demonstrate that the SRMA-KD framework significantly improves the segmentation accuracy of the student network compared to other medical imaging knowledge distillation methods, highlighting its potential as an effective solution for cardiac image segmentation in resource-limited clinical environments.
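
The abstract does not say how the PDTB measures pixel-level differences, so the following is only an illustrative sketch of a generic pixel-wise distillation loss, not the authors' formulation. It assumes the teacher and student each output per-pixel class logits and compares their temperature-softened distributions with a KL divergence averaged over pixels. The function names, the `temperature` parameter, and the nested-list layout (height x width x classes) are all assumptions made for this example.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax over one pixel's class logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in scaled))
    return [z - log_norm for z in scaled]


def pixelwise_kd_loss(teacher_logits, student_logits, temperature=2.0):
    """Mean KL(teacher || student) over all pixels.

    Both arguments are nested lists shaped height x width x classes
    (a hypothetical layout chosen for this sketch).
    """
    total = 0.0
    count = 0
    for teacher_row, student_row in zip(teacher_logits, student_logits):
        for teacher_pixel, student_pixel in zip(teacher_row, student_row):
            log_p = log_softmax(teacher_pixel, temperature)
            log_q = log_softmax(student_pixel, temperature)
            # KL divergence for this pixel: sum_c p_c * (log p_c - log q_c)
            total += sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
            count += 1
    return total / count


# Toy 1 x 2 image with two classes per pixel.
teacher = [[[2.0, 0.0], [0.0, 2.0]]]
student = [[[0.0, 0.0], [0.0, 0.0]]]
loss = pixelwise_kd_loss(teacher, student)
```

Under these assumptions the loss is zero when the student reproduces the teacher's logits exactly and grows as the two prediction maps diverge. In practice such a term would typically be combined with a supervised segmentation loss against the ground truth.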


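Source journal

Image and Vision Computing (Engineering & Technology - Engineering: Electrical & Electronic)

CiteScore: 8.50

Self-citation rate: 8.50%

Publication volume: 143

Review time: 7.8 months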

Journal description: Image and Vision Computing has as a primary aim the provision of an effective medium of interchange for the results of high quality theoretical and applied research fundamental to all aspects of image interpretation and computer vision. The journal publishes work that proposes new image interpretation and computer vision methodology or addresses the application of such methods to real world scenes. It seeks to strengthen a deeper understanding in the discipline by encouraging the quantitative comparison and performance evaluation of the proposed methodology. The coverage includes: image interpretation, scene modelling, object recognition and tracking, shape analysis, monitoring and surveillance, active vision and robotic systems, SLAM, biologically-inspired computer vision, motion analysis, stereo vision, document image understanding, character and handwritten text recognition, face and gesture recognition, biometrics, vision-based human-computer interaction, human activity and behavior understanding, data fusion from multiple sensor inputs, image databases.