Distilling Object Detectors With Global Knowledge

Sanli Tang, Zhongyu Zhang, Zhanzhan Cheng, Jing Lu, Yunlu Xu, Yi Niu, Fan He
{"title":"利用全局知识提取对象检测器","authors":"Sanli Tang, Zhongyu Zhang, Zhanzhan Cheng, Jing Lu, Yunlu Xu, Yi Niu, Fan He","doi":"10.48550/arXiv.2210.09022","DOIUrl":null,"url":null,"abstract":". Knowledge distillation learns a lightweight student model that mimics a cumbersome teacher. Existing methods regard the knowledge as the feature of each instance or their relations, which is the instance-level knowledge only from the teacher model, i.e., the local knowledge. However, the empirical studies show that the local knowledge is much noisy in object detection tasks, especially on the blurred, occluded, or small instances. Thus, a more intrinsic approach is to measure the representations of instances w.r.t. a group of common basis vectors in the two feature spaces of the teacher and the student detectors, i.e., global knowledge. Then, the distilling algorithm can be applied as space alignment. To this end, a novel prototype generation module (PGM) is proposed to find the common basis vectors, dubbed prototypes , in the two feature spaces. Then, a robust distilling module (RDM) is applied to construct the global knowledge based on the prototypes and filtrate noisy local knowledge by measuring the discrepancy of the representations in two feature spaces. Experiments with Faster-RCNN and RetinaNet on PASCAL and COCO datasets show that our method achieves the best performance for distilling object detectors with various backbones, which even surpasses the performance of the teacher model. We also show that the existing methods can be easily combined with global knowledge and obtain further improvement. Code is available: https://github.com/hikvision-research/DAVAR-Lab-ML . to (1) construct the global knowledge by projecting the instances w.r.t. the prototypes, and (2) robustly distill the global and local knowledge by measuring their discrepancy in the two spaces. Experiments show that the proposed method achieves state-of-the-art performance on two popular detection frameworks and benchmarks. The extensive experimental results show that the proposed method can be easily stretched with larger teachers and the existing knowledge distillation methods to obtain further improvement.","PeriodicalId":72676,"journal":{"name":"Computer vision - ECCV ... : ... European Conference on Computer Vision : proceedings. European Conference on Computer Vision","volume":"18 1","pages":"422-438"},"PeriodicalIF":0.0000,"publicationDate":"2022-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":"{\"title\":\"Distilling Object Detectors With Global Knowledge\",\"authors\":\"Sanli Tang, Zhongyu Zhang, Zhanzhan Cheng, Jing Lu, Yunlu Xu, Yi Niu, Fan He\",\"doi\":\"10.48550/arXiv.2210.09022\",\"DOIUrl\":null,\"url\":null,\"abstract\":\". Knowledge distillation learns a lightweight student model that mimics a cumbersome teacher. Existing methods regard the knowledge as the feature of each instance or their relations, which is the instance-level knowledge only from the teacher model, i.e., the local knowledge. However, the empirical studies show that the local knowledge is much noisy in object detection tasks, especially on the blurred, occluded, or small instances. Thus, a more intrinsic approach is to measure the representations of instances w.r.t. a group of common basis vectors in the two feature spaces of the teacher and the student detectors, i.e., global knowledge. Then, the distilling algorithm can be applied as space alignment. 
To this end, a novel prototype generation module (PGM) is proposed to find the common basis vectors, dubbed prototypes , in the two feature spaces. Then, a robust distilling module (RDM) is applied to construct the global knowledge based on the prototypes and filtrate noisy local knowledge by measuring the discrepancy of the representations in two feature spaces. Experiments with Faster-RCNN and RetinaNet on PASCAL and COCO datasets show that our method achieves the best performance for distilling object detectors with various backbones, which even surpasses the performance of the teacher model. We also show that the existing methods can be easily combined with global knowledge and obtain further improvement. Code is available: https://github.com/hikvision-research/DAVAR-Lab-ML . to (1) construct the global knowledge by projecting the instances w.r.t. the prototypes, and (2) robustly distill the global and local knowledge by measuring their discrepancy in the two spaces. Experiments show that the proposed method achieves state-of-the-art performance on two popular detection frameworks and benchmarks. The extensive experimental results show that the proposed method can be easily stretched with larger teachers and the existing knowledge distillation methods to obtain further improvement.\",\"PeriodicalId\":72676,\"journal\":{\"name\":\"Computer vision - ECCV ... : ... European Conference on Computer Vision : proceedings. European Conference on Computer Vision\",\"volume\":\"18 1\",\"pages\":\"422-438\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-10-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"4\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Computer vision - ECCV ... : ... European Conference on Computer Vision : proceedings. European Conference on Computer Vision\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.48550/arXiv.2210.09022\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computer vision - ECCV ... : ... European Conference on Computer Vision : proceedings. European Conference on Computer Vision","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.48550/arXiv.2210.09022","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 4

Abstract

Knowledge distillation learns a lightweight student model that mimics a cumbersome teacher. Existing methods regard knowledge as the features of individual instances or the relations between them, i.e., instance-level knowledge taken only from the teacher model, which we call local knowledge. Empirical studies show, however, that local knowledge is quite noisy in object detection tasks, especially for blurred, occluded, or small instances. A more intrinsic approach is therefore to measure the representations of instances with respect to a group of common basis vectors in the two feature spaces of the teacher and the student detectors, i.e., global knowledge, so that distillation can be performed as space alignment. To this end, a novel prototype generation module (PGM) is proposed to find the common basis vectors, dubbed prototypes, in the two feature spaces. A robust distilling module (RDM) is then applied to (1) construct the global knowledge by projecting the instances with respect to the prototypes, and (2) robustly distill the global and local knowledge by measuring the discrepancy of the representations in the two spaces, filtering out noisy local knowledge. Experiments with Faster R-CNN and RetinaNet on the PASCAL VOC and COCO datasets show that the method achieves the best performance for distilling object detectors with various backbones, even surpassing the teacher model. The method also scales to larger teachers and can be easily combined with existing knowledge distillation methods for further improvement. Code is available at https://github.com/hikvision-research/DAVAR-Lab-ML.
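
To make the pipeline concrete, below is a minimal PyTorch sketch of the two ideas the abstract describes: finding prototypes in each feature space, then using the prototype-relative representations both as the global knowledge to align and as a noise measure for down-weighting local knowledge. The function names, the k-means prototype search, the softmax projection, and the exp(-discrepancy) weighting are illustrative assumptions standing in for the paper's PGM and RDM, not the released implementation (see the linked repository for that).

```python
# Minimal sketch of prototype-based global-knowledge distillation.
# All design choices below are assumptions, not the authors' exact code.
import torch
import torch.nn.functional as F


def generate_prototypes(features: torch.Tensor, num_prototypes: int,
                        iters: int = 10) -> torch.Tensor:
    """Find common basis vectors ("prototypes") in one feature space.

    A plausible stand-in for the PGM: plain k-means over pooled
    instance features of shape (N, D); returns (K, D) prototypes.
    """
    features = features.detach()  # prototype search needs no gradients
    idx = torch.randperm(features.size(0))[:num_prototypes]
    protos = features[idx].clone()
    for _ in range(iters):
        assign = torch.cdist(features, protos).argmin(dim=1)  # nearest prototype
        for k in range(num_prototypes):
            mask = assign == k
            if mask.any():
                protos[k] = features[mask].mean(dim=0)
    return protos


def project_onto_prototypes(feats: torch.Tensor, protos: torch.Tensor,
                            tau: float = 1.0) -> torch.Tensor:
    """Represent each instance w.r.t. the prototypes of its own space:
    cosine similarity to every prototype, softened into a distribution."""
    sim = F.normalize(feats, dim=1) @ F.normalize(protos, dim=1).t()  # (N, K)
    return F.softmax(sim / tau, dim=1)


def global_and_local_kd_loss(student_feats, teacher_feats,
                             student_protos, teacher_protos,
                             alpha: float = 1.0) -> torch.Tensor:
    """Distill global knowledge as space alignment, and down-weight
    noisy local (instance-level) knowledge by the discrepancy of the
    two global representations, in the spirit of the RDM. Assumes the
    student features are already adapted to the teacher's dimension."""
    s_rep = project_onto_prototypes(student_feats, student_protos)
    t_rep = project_onto_prototypes(teacher_feats, teacher_protos)

    # Global term: align the student's prototype-relative representation
    # distribution with the teacher's.
    global_loss = F.kl_div(s_rep.log(), t_rep, reduction="batchmean")

    # Local term: per-instance feature mimicking, weighted so that
    # instances whose global representations disagree (likely noisy:
    # blurred, occluded, or small) contribute less.
    disc = (s_rep - t_rep).abs().sum(dim=1)            # (N,) discrepancy
    weight = torch.exp(-disc).detach()
    local = F.mse_loss(student_feats, teacher_feats,
                       reduction="none").mean(dim=1)   # (N,)
    return global_loss + alpha * (weight * local).mean()
```

Detaching the weight keeps the robustness term from back-propagating through the discrepancy itself, so it only rescales the local loss rather than becoming a second objective.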