{"title":"Class-wise knowledge distillation","authors":"Fei Li, Yifang Yang","doi":"10.1117/12.2667603","DOIUrl":null,"url":null,"abstract":"Knowledge distillation (KD) transfers knowledge of a teacher model to improve the performance of a student model which is usually equipped with a lower capacity. The standard KD framework, however, neglects that the DNNs exhibit a wide range of class-wise accuracy and the performance of some classes is even decreased after distillation. Observing the above phenomena, we propose a novel Class-Wise Knowledge Distillation method to find the hard classes with a simple yet effective technique and then make the students take more effort to learn these hard classes. In the experiments on image classification tasks using CIFAR-100 dataset, we demonstrate that the proposed method outperforms the other KD methods and achieves excellent performance enhancement on various networks.","PeriodicalId":128051,"journal":{"name":"Third International Seminar on Artificial Intelligence, Networking, and Information Technology","volume":"15 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Third International Seminar on Artificial Intelligence, Networking, and Information Technology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1117/12.2667603","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Knowledge distillation (KD) transfers knowledge from a teacher model to improve the performance of a student model, which typically has lower capacity. The standard KD framework, however, neglects the fact that DNNs exhibit a wide range of class-wise accuracies, and that the accuracy of some classes even decreases after distillation. Motivated by these observations, we propose a novel Class-Wise Knowledge Distillation method that identifies hard classes with a simple yet effective technique and then makes the student devote more effort to learning these hard classes. In experiments on image classification with the CIFAR-100 dataset, we demonstrate that the proposed method outperforms other KD methods and achieves excellent performance gains across various networks.
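
The abstract does not spell out the exact formulation, but the core idea (detect hard classes, then make the student work harder on them) can be sketched in PyTorch. In the sketch below, the hard-class detection (per-class validation accuracy) and the per-sample reweighting scheme, along with the names and hyperparameters (temperature T, mixing weight alpha), are illustrative assumptions, not the authors' published method.

import torch
import torch.nn.functional as F

def class_weights_from_accuracy(per_class_acc: torch.Tensor) -> torch.Tensor:
    # Assumption: classes with lower accuracy ("hard" classes) receive
    # proportionally larger weights; weights are rescaled to mean 1 so the
    # overall loss magnitude stays comparable to unweighted KD.
    w = 1.0 - per_class_acc
    return w * (w.numel() / w.sum())

def class_wise_kd_loss(student_logits, teacher_logits, targets,
                       class_weights, T=4.0, alpha=0.9):
    # Per-sample KL divergence between temperature-scaled teacher and
    # student distributions (standard Hinton-style KD term, scaled by T^2).
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="none",
    ).sum(dim=1) * (T * T)
    # Per-sample cross-entropy against the hard labels.
    ce = F.cross_entropy(student_logits, targets, reduction="none")
    # Reweight each sample by the weight of its ground-truth class,
    # so hard classes contribute more to the student's gradient.
    w = class_weights[targets]
    return (w * (alpha * kd + (1 - alpha) * ce)).mean()

In this sketch, per_class_acc could be measured on a held-out split with a baseline student, after which the weights steer a subsequent distillation run toward the classes where the student performs worst.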