Flexible Temperature Parallel Distillation for Dense Object Detection: Make Response-Based Knowledge Distillation Great Again
Authors: Yaoye Song; Peng Zhang; Wei Huang; Yufei Zha; Yanning Zhang
Journal: IEEE Transactions on Circuits and Systems for Video Technology, vol. 35, no. 5, pp. 4963-4975 (JCR Q1, Engineering, Electrical & Electronic; impact factor 8.3)
Published: 2025-01-03
DOI: 10.1109/TCSVT.2024.3525051
URL: https://ieeexplore.ieee.org/document/10820962/
Citations: 0
Abstract
Previous research on knowledge distillation (KD) for dense object detection has focused on feature-based approaches, which rely on feature imitation and achieve competitive performance. Response-based KD methods, although they achieve comparable performance in image recognition, cannot reach the same level in dense object detection. To improve distillation performance in two key aspects, where to distill and how to distill, this paper introduces parallel distillation (PD), which fully utilizes the sophisticated detection head and efficiently transfers all output responses from the teacher to the student. In particular, PD gives careful consideration to the specific location of distillation, which is crucial for effective knowledge transfer. To address the discrepancies in output responses between the localization branch and the classification branch, we propose a novel Dynamic Localization Temperature (DLT) module that enhances the precision of distilling localization information. For the classification branch, a Classification Temperature-Free (CTF) module is designed to increase the robustness of distillation in heterogeneous networks. Incorporating DLT and CTF into the PD framework removes the need to set temperature values manually and yields Flexible Temperature Parallel Distillation (FTPD), which achieves state-of-the-art (SOTA) performance and can be further combined with mainstream feature-based methods for better results. Extensive experiments show that FTPD outperforms other KD methods in accuracy and robustness on dense object detection.
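For context, the manually set temperature that the abstract refers to appears in the conventional response-based KD loss, where teacher and student logits are softened by a fixed temperature before being compared. The following is a minimal sketch of that standard loss only; it is not the paper's DLT or CTF module, whose details the abstract does not give. The function names and the example logits are illustrative assumptions.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Log of the temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    log_norm = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_norm for s in scaled]


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a flatter distribution."""
    return [math.exp(lp) for lp in log_softmax(logits, temperature)]


def kd_loss(student_logits, teacher_logits, temperature=1.0):
    """KL(teacher || student) on softened outputs, scaled by T^2.

    This is the standard fixed-temperature formulation, shown only as a
    baseline; the temperature here is a hand-picked hyperparameter.
    """
    log_p = log_softmax(teacher_logits, temperature)  # teacher
    log_q = log_softmax(student_logits, temperature)  # student
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return temperature ** 2 * kl


# Hypothetical logits for a single prediction (assumed values, not from the paper).
teacher = [4.0, 1.0, 0.5]
student = [2.5, 1.5, 0.2]

for t in (1.0, 2.0, 4.0):
    print(f"T={t}: loss={kd_loss(student, teacher, t):.4f}")
```

The loss is zero when the student reproduces the teacher's logits and positive otherwise. Its value depends on the chosen temperature, which is the hyperparameter the abstract says FTPD avoids setting by hand.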
About the journal:
The IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) is dedicated to covering all aspects of video technologies from a circuits and systems perspective. We encourage submissions of general, theoretical, and application-oriented papers related to image and video acquisition, representation, presentation, and display. Additionally, we welcome contributions in areas such as processing, filtering, and transforms; analysis and synthesis; learning and understanding; compression, transmission, communication, and networking; as well as storage, retrieval, indexing, and search. Furthermore, papers focusing on hardware and software design and implementation are highly valued. Join us in advancing the field of video technology through innovative research and insights.