Regional filtering distillation for object detection
Machine Vision and Applications (Q3, Computer Science, Artificial Intelligence; impact factor 2.4)
Published: 2024-01-31 · DOI: 10.1007/s00138-023-01503-1
Citations: 0
Abstract
Knowledge distillation is a common and effective model-compression method that trains a compact student model to mimic a large teacher model and thereby achieve superior generalization. Compared with its success on straightforward classification tasks, previous work on knowledge distillation underperforms on challenging tasks such as object detection. In this paper, we argue that this failure is mainly caused by the imbalance between informative features and invalid background. Not all background noise is redundant: after screening, the valuable background regions encode relations between foreground and background. We therefore propose a novel regional filtering distillation (RFD) algorithm that addresses this problem through two modules: region selection and attention-guided distillation. Region selection first filters out massive invalid background and retains knowledge-dense regions near object anchor locations. Attention-guided distillation then further improves distillation performance on detection tasks by extracting the relations between foreground and background to transfer key features. Extensive experiments on both one-stage and two-stage detectors demonstrate the effectiveness of RFD. For example, RFD improves mAP by 2.8% and 2.6% for ResNet50-RetinaNet and ResNet50-FPN student networks on the MS COCO dataset, respectively. We also evaluate our method with the Faster R-CNN model on the Pascal VOC and KITTI benchmarks, obtaining mAP gains of 1.52% and 4.36% for the ResNet18-FPN student network, respectively. Furthermore, our method increases mAP by 5.70% for MobileNetv2-SSD compared to the original model. The proposed RFD technique performs strongly on detection tasks through regional filtering distillation. In the future, we plan to extend it to more challenging task scenarios, such as segmentation.
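The two modules described above can be illustrated with a minimal PyTorch sketch. This is not the authors' exact formulation: the box-to-feature-cell mapping, the margin parameter, and the softmax attention normalization are all assumptions standing in for the paper's anchor-based region selection and attention weighting; the general pattern (mask out invalid background, then re-weight the remaining feature-imitation loss by a teacher-derived attention map) is what the abstract describes.

```python
import torch
import torch.nn.functional as F

def region_mask(boxes, feat_h, feat_w, stride, margin=1):
    """Binary mask of knowledge-dense regions: feature cells on or near
    ground-truth boxes (hypothetical stand-in for the paper's
    anchor-location-based region selection)."""
    mask = torch.zeros(feat_h, feat_w)
    for x1, y1, x2, y2 in boxes:
        i1 = max(int(y1 / stride) - margin, 0)
        i2 = min(int(y2 / stride) + margin, feat_h - 1)
        j1 = max(int(x1 / stride) - margin, 0)
        j2 = min(int(x2 / stride) + margin, feat_w - 1)
        mask[i1:i2 + 1, j1:j2 + 1] = 1.0
    return mask

def rfd_loss(f_student, f_teacher, boxes, stride):
    """Masked, attention-weighted feature-imitation loss.

    f_student, f_teacher: (C, H, W) feature maps from one FPN level.
    The attention map is the teacher's channel-averaged absolute
    activation, softmax-normalized over spatial positions (an assumed
    weighting, not necessarily the paper's)."""
    c, h, w = f_teacher.shape
    mask = region_mask(boxes, h, w, stride)            # (H, W) region selection
    attn = f_teacher.abs().mean(dim=0)                 # (H, W) spatial attention
    attn = (h * w) * F.softmax(attn.flatten(), dim=0).view(h, w)
    weight = mask * attn                               # filter, then re-weight
    diff = (f_student - f_teacher).pow(2).mean(dim=0)  # (H, W) per-cell error
    return (weight * diff).sum() / weight.sum().clamp(min=1.0)
```

In training, a loss of this form would be added to the detector's task loss for each pyramid level; the mask keeps the gradient focused on cells near objects rather than letting uninformative background dominate.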
Journal description:
Machine Vision and Applications publishes high-quality technical contributions in machine vision research and development. Specifically, the editors encourage submissions in all applications and engineering aspects of image-related computing. In particular, original contributions dealing with scientific, commercial, industrial, military, and biomedical applications of machine vision are all within the scope of the journal.
Particular emphasis is placed on engineering and technology aspects of image processing and computer vision.
The following aspects of machine vision applications are of interest: algorithms, architectures, VLSI implementations, AI techniques and expert systems for machine vision, front-end sensing, multidimensional and multisensor machine vision, real-time techniques, image databases, virtual reality and visualization. Papers must include a significant experimental validation component.