Class-wise knowledge distillation

Third International Seminar on Artificial Intelligence, Networking, and Information Technology. Pub Date: 2023-02-22. DOI: 10.1117/12.2667603

Fei Li, Yifang Yang


Citations: 0

Abstract


Class-wise knowledge distillation

Knowledge distillation (KD) transfers knowledge from a teacher model to improve the performance of a student model, which usually has lower capacity. The standard KD framework, however, overlooks the fact that DNNs exhibit a wide range of class-wise accuracy, and that accuracy on some classes even decreases after distillation. Motivated by these observations, we propose a novel Class-Wise Knowledge Distillation method that identifies the hard classes with a simple yet effective technique and then makes the student put more effort into learning them. In image classification experiments on the CIFAR-100 dataset, we demonstrate that the proposed method outperforms other KD methods and delivers strong performance gains on various networks.
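The abstract does not spell out how hard classes are detected or how the student is pushed to learn them, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the usual temperature-softened KL distillation loss, and it introduces a hypothetical rule (`hard_class_weights`, with an assumed `boost` parameter) that treats any class whose accuracy falls below the mean as "hard" and up-weights its samples. All function names and parameter values here are illustrative assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def class_wise_accuracy(predictions, labels, num_classes):
    """Fraction of correctly classified samples for each class."""
    correct = [0] * num_classes
    count = [0] * num_classes
    for pred, label in zip(predictions, labels):
        count[label] += 1
        if pred == label:
            correct[label] += 1
    return [c / n if n else 0.0 for c, n in zip(correct, count)]


def hard_class_weights(accuracies, boost=2.0):
    """Hypothetical rule (an assumption, not from the paper):
    classes with below-mean accuracy receive weight `boost`, others 1.0."""
    mean_acc = sum(accuracies) / len(accuracies)
    return [boost if a < mean_acc else 1.0 for a in accuracies]


def kd_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


def weighted_kd_loss(student_logits, teacher_logits, label, weights,
                     temperature=4.0):
    """Scale the per-sample KD loss by the weight of the sample's true class."""
    return weights[label] * kd_loss(student_logits, teacher_logits, temperature)
```

In this sketch, a sample from a low-accuracy class contributes a proportionally larger distillation loss, which is one simple way to make the student "take more effort" on hard classes. The threshold and the weighting scheme would need to be replaced by whatever the paper actually specifies.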
