Self-distillation salient object detection via generalized diversity loss
Yunfei Zheng, Jibin Yang, Haijun Tao, Yong Wang, Lei Chen, Yang Wang, Tieyong Cao
Pattern Recognition, Volume 168, Article 111804. DOI: 10.1016/j.patcog.2025.111804. Published 2025-05-23.
Classic knowledge distillation (KD) via the Kullback–Leibler loss can effectively improve the performance of small deep classification models, but it is difficult to apply to salient object detection (SOD) models because the logit layer lacks the necessary multi-dimensional knowledge representations. In this paper, a generalized diversity (GD) loss, inspired by ensemble learning, is proposed to constrain the student and teacher models to maintain low diversity. This drives the student to mimic the teacher's salient knowledge representations while enhancing the student's generalization ability. Second, a salient self-distillation (SD) framework based on a shared backbone and a salient SD loss is proposed. On the shared backbone, a lightweight student sub-network and a large-parameter teacher sub-network are constructed to perform coarse but rapid feature extraction and refined but slower feature extraction, respectively. The SD loss transfers refined salient knowledge from the teacher sub-network to the student sub-network, improving the student sub-network's performance. Extensive experimental results on five benchmark datasets demonstrate that the proposed GD loss achieves effective salient knowledge transfer and outperforms six recent KD methods, and that the proposed student network outperforms eleven recent SOD networks in both performance and efficiency.
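The abstract does not give the exact form of the GD loss or the shared-backbone architecture, but the described layout — one backbone feeding a lightweight student head and a heavier teacher head, with a loss that pushes the two saliency predictions toward low diversity — can be illustrated with a minimal PyTorch sketch. Everything below (the names SharedBackboneSOD and gd_loss, the use of a per-pixel disagreement term as the "low diversity" constraint, and the toy head designs) is a hypothetical reading of the abstract, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def gd_loss(student_map, teacher_map, eps=1e-6):
    """Hypothetical diversity-style loss: penalize per-pixel disagreement between
    the student and teacher saliency maps so the pair holds low diversity.
    The authors' exact GD formulation may differ."""
    s = student_map.clamp(eps, 1 - eps)
    t = teacher_map.clamp(eps, 1 - eps).detach()  # teacher output used as the target
    # probability that exactly one of the two predictions fires on a pixel,
    # a simple disagreement (diversity) measure borrowed from ensemble learning
    disagreement = s * (1 - t) + (1 - s) * t
    return disagreement.mean()

class SharedBackboneSOD(nn.Module):
    """Sketch of the shared-backbone self-distillation layout: one backbone feeds a
    lightweight student head (coarse but fast) and a larger teacher head (refined
    but slower). The real backbone and heads would be far deeper."""
    def __init__(self, in_ch=3, feat_ch=64):
        super().__init__()
        self.backbone = nn.Sequential(  # stand-in for a real SOD backbone
            nn.Conv2d(in_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
        )
        self.student_head = nn.Conv2d(feat_ch, 1, 1)  # lightweight head
        self.teacher_head = nn.Sequential(            # larger, refined head
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, 1, 1),
        )

    def forward(self, x):
        f = self.backbone(x)
        return torch.sigmoid(self.student_head(f)), torch.sigmoid(self.teacher_head(f))

# Training-step sketch: supervise both heads with the ground-truth mask and
# distill the teacher's prediction into the student via the diversity-style loss.
model = SharedBackboneSOD()
image = torch.rand(2, 3, 128, 128)
gt = (torch.rand(2, 1, 128, 128) > 0.5).float()
student_pred, teacher_pred = model(image)
loss = (F.binary_cross_entropy(student_pred, gt)
        + F.binary_cross_entropy(teacher_pred, gt)
        + gd_loss(student_pred, teacher_pred))
loss.backward()
```

In this sketch only the student head is kept at inference time, which is what would make the deployed model coarse but fast; the teacher head and the distillation term exist only during training. Again, this is an interpretation of the abstract rather than the paper's actual loss or architecture.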
Journal introduction:
The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.