Multi-scale contextual learning for medical image segmentation via dual distillation
Authors: Ruize Cui, Lanqing Liu, Youyi Song, Ge Ren, Xiaowei Hu, Jing Qin
Journal: Medical Physics, volume 52, issue 2, pages 787-800 (JCR Q1, Radiology, Nuclear Medicine & Medical Imaging; impact factor 3.2)
Publication type: Journal Article
Published: 2024-11-11
DOI: 10.1002/mp.17506
URL: https://onlinelibrary.wiley.com/doi/10.1002/mp.17506
Background
Recently, many studies have explored fusing features extracted from convolutional neural networks (CNNs) and transformers to integrate multi-scale representations for better performance in medical image segmentation tasks. Although these hybrid models have achieved better results than previous CNN-based and transformer-based methods, they suffer from high computational and space complexity.
Purpose
This research addresses the prohibitive computational and space complexity of hybrid models, which limits their application in clinical practice, where computational resources are usually constrained.
Methods
We propose a novel model equipped with a dual distillation scheme that harnesses the complementary advantages of CNNs and transformers without compromising model efficiency. We further propose a multi-scale prior-knowledge distillation (MPD) module to distill multi-scale knowledge from transformer-extracted features. In addition, to complement the knowledge distillation scheme, we propose an efficient and robust Selective Fusion module in the student network.
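The abstract does not spell out the loss functions, so the following is only a minimal sketch of how a teacher-to-student distillation objective is commonly assembled, not the authors' formulation. It assumes one plausible reading of a two-part scheme: a temperature-softened logit term plus a feature-matching term added to the ordinary segmentation loss. The function names, the weights `alpha` and `beta`, and the temperature value are all hypothetical.

```python
import math

def log_softmax(logits, temperature=1.0):
    """Temperature-scaled log-softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    log_total = math.log(sum(math.exp(s - m) for s in scaled))
    return [s - m - log_total for s in scaled]

def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened class distributions, scaled by T^2."""
    log_p = log_softmax(teacher_logits, temperature)  # teacher
    log_q = log_softmax(student_logits, temperature)  # student
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return kl * temperature ** 2

def feature_loss(student_feat, teacher_feat):
    """Mean squared error between equally sized feature vectors."""
    diffs = [(s - t) ** 2 for s, t in zip(student_feat, teacher_feat)]
    return sum(diffs) / len(diffs)

def total_loss(seg_loss, student_logits, teacher_logits,
               student_feat, teacher_feat, alpha=0.5, beta=0.5):
    """Segmentation loss plus two weighted distillation terms (assumed form)."""
    return (seg_loss
            + alpha * kd_loss(student_logits, teacher_logits)
            + beta * feature_loss(student_feat, teacher_feat))

# Per-pixel toy example: two classes, two feature channels.
loss = total_loss(0.3, [1.0, 0.0], [2.0, -1.0], [0.5, 0.5], [1.0, 0.0])
```

In a real pipeline these terms would be computed over whole feature maps at several scales; the scalar version here only shows how the pieces combine.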
Results
We evaluate the proposed model against fourteen network frameworks on two representative datasets: SipakMed and ISIC 2017. For SipakMed, 3037 Pap smear images are used for training and 1012 for testing. For ISIC 2017, 2000 dermoscopic images are used for training, 150 for validation, and 600 for testing. Experimental results show that our method not only surpasses existing methods by a considerable margin in mean Intersection over Union, mean Dice coefficient, and mean average symmetric surface distance, but also requires fewer computational resources in terms of model parameters and floating-point operations (FLOPs).
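For readers unfamiliar with the overlap metrics named above, here is a small sketch of Intersection over Union and the Dice coefficient for binary masks. It assumes masks are flattened to sequences of 0/1 labels and that "mean" refers to averaging over images; the abstract does not state the averaging convention, and the function names are illustrative.

```python
def iou_and_dice(pred, target):
    """Return (IoU, Dice) for two equal-length sequences of 0/1 labels."""
    inter = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    pred_sum = sum(pred)
    target_sum = sum(target)
    union = pred_sum + target_sum - inter
    if union == 0:
        # Both masks are empty; treating this as a perfect match is a convention.
        return 1.0, 1.0
    iou = inter / union
    dice = 2 * inter / (pred_sum + target_sum)
    return iou, dice

def mean_scores(pairs):
    """Average IoU and Dice over a list of (pred, target) mask pairs."""
    scores = [iou_and_dice(p, t) for p, t in pairs]
    n = len(scores)
    return sum(s[0] for s in scores) / n, sum(s[1] for s in scores) / n

# One overlapping pixel out of three labelled pixels in total.
iou, dice = iou_and_dice([1, 1, 0, 0], [1, 0, 1, 0])
```

For a single mask pair the two metrics are related by Dice = 2·IoU / (1 + IoU), so they rank individual predictions identically even though their numeric values differ.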
Conclusions
Comprehensive comparisons in terms of segmentation accuracy and computational complexity unequivocally confirm that our method effectively and efficiently integrates the advantages of both CNNs and transformers, showing its suitability and significance for clinical applications.
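To make the two complexity measures concrete, the sketch below counts parameters and operations for a single 2D convolution layer. The layer sizes are arbitrary illustrations, not values from the paper, and counting one multiply-accumulate as two floating-point operations is a common convention rather than a universal one.

```python
def conv2d_cost(c_in, c_out, kernel, height, width, stride=1, padding=0, bias=True):
    """Return (parameters, multiply-accumulates, FLOPs) for one conv layer."""
    # Each output channel has a kernel x kernel x c_in filter, plus an optional bias.
    params = kernel * kernel * c_in * c_out + (c_out if bias else 0)
    # Standard output-size formula for a strided, zero-padded convolution.
    h_out = (height + 2 * padding - kernel) // stride + 1
    w_out = (width + 2 * padding - kernel) // stride + 1
    # One filter application per output position and output channel.
    macs = kernel * kernel * c_in * c_out * h_out * w_out
    flops = 2 * macs  # assumed convention: 1 MAC = 1 multiply + 1 add
    return params, macs, flops

# A 3x3 convolution from 3 to 16 channels on a 32x32 input, padding 1.
params, macs, flops = conv2d_cost(3, 16, 3, 32, 32, stride=1, padding=1)
```

Summing these counts over all layers gives the model-level parameter and FLOP totals that efficiency comparisons typically report.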
About the journal:
Medical Physics publishes original, high-impact physics, imaging science, and engineering research that advances patient diagnosis and therapy through contributions in: 1) basic science developments with high potential for clinical translation; 2) clinical applications of cutting-edge engineering and physics innovations; and 3) broadly applicable and innovative clinical physics developments.
Medical Physics is a journal of global scope and reach. By publishing in Medical Physics, your research will reach an international, multidisciplinary audience including practicing medical physicists as well as physics- and engineering-based translational scientists. We work closely with authors of promising articles to improve their quality.