Layer-wise correlation and attention discrepancy distillation for semantic segmentation

Jianping Gou, Kaijie Chen, Cheng Chen, Weihua Ou, Xin Luo, Zhang Yi

Pattern Recognition, Volume 172, Article 112438 (2025). DOI: 10.1016/j.patcog.2025.112438. Published 13 September 2025.
Knowledge distillation (KD) has recently garnered increased attention in segmentation tasks due to its effective balance between accuracy and computational efficiency. Nonetheless, existing methods rely mainly on structured knowledge from a single layer, overlooking the valuable discrepant knowledge that captures the diversity and distinctiveness of features across layers, which is essential for the KD process. To tackle this issue, we present Layer-wise Correlation and Attention Discrepancy Distillation (LCADD), which trains compact yet accurate semantic segmentation networks by exploiting layer-wise discrepancy knowledge. Specifically, we employ two distillation schemes: (i) correlation discrepancy distillation, which constructs a pixel-wise correlation discrepancy matrix across layers to capture finer-grained spatial dependencies, and (ii) attention discrepancy self-distillation, which guides the shallower layers of the student network to emulate the attention discrepancy maps of its deeper layers, enabling self-learning of attention discrepancy knowledge within the student. The two schemes work collaboratively in learning discrepancy knowledge, allowing the student network to better imitate the teacher from the perspective of layer-wise discrepancy. Compared with recent knowledge distillation techniques, our method demonstrates superior performance on several semantic segmentation datasets, including Cityscapes, Pascal VOC 2012, and CamVid, validating its effectiveness.
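The abstract does not give the exact loss formulations, but a minimal PyTorch-style sketch of the general idea can make the two schemes concrete: per-layer pixel-wise correlation maps whose layer-to-layer discrepancy is matched between student and teacher, and student attention maps whose shallow-layer discrepancies are pulled toward the deepest-layer discrepancy. Everything below (function names, cosine-similarity correlations, channel-averaged squared-activation attention maps, MSE penalties, bilinear resizing to a common spatial size) is an assumption for illustration, not the paper's LCADD formulation.

```python
# Illustrative sketch only; NOT the paper's exact LCADD losses.
import torch
import torch.nn.functional as F


def pixel_correlation(feat: torch.Tensor, size=(32, 32)) -> torch.Tensor:
    """Pixel-wise correlation of a feature map (B, C, H, W) -> (B, HW, HW)."""
    feat = F.interpolate(feat, size=size, mode="bilinear", align_corners=False)
    f = F.normalize(feat.flatten(2), dim=1)      # unit-norm channel vector per pixel
    return torch.bmm(f.transpose(1, 2), f)       # cosine similarity between all pixel pairs


def correlation_discrepancy_loss(student_feats, teacher_feats):
    """Match the student's layer-to-layer correlation discrepancies to the teacher's."""
    loss = 0.0
    for i in range(len(student_feats) - 1):
        d_s = pixel_correlation(student_feats[i + 1]) - pixel_correlation(student_feats[i])
        d_t = pixel_correlation(teacher_feats[i + 1]) - pixel_correlation(teacher_feats[i])
        loss = loss + F.mse_loss(d_s, d_t.detach())
    return loss


def attention_map(feat: torch.Tensor, size=(64, 64)) -> torch.Tensor:
    """Spatial attention map: mean of squared activations over channels, L2-normalized."""
    a = feat.pow(2).mean(dim=1, keepdim=True)    # (B, 1, H, W)
    a = F.interpolate(a, size=size, mode="bilinear", align_corners=False)
    return F.normalize(a.flatten(1), dim=1)      # (B, HW)


def attention_discrepancy_self_loss(student_feats):
    """Shallower student layers mimic the attention discrepancy of the deepest layer pair."""
    atts = [attention_map(f) for f in student_feats]
    discrepancies = [atts[i + 1] - atts[i] for i in range(len(atts) - 1)]
    target = discrepancies[-1].detach()          # deepest discrepancy acts as the teacher signal
    return sum(F.mse_loss(d, target) for d in discrepancies[:-1])
```

In a full training loop these terms would presumably be added to the standard cross-entropy segmentation loss with weighting coefficients, e.g. `loss = ce + alpha * correlation_discrepancy_loss(s_feats, t_feats) + beta * attention_discrepancy_self_loss(s_feats)`; the weights and the specific layers paired are likewise assumptions.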
Journal introduction:
The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.