{"title":"Target-Embedding Autoencoder With Knowledge Distillation for Multi-Label Classification","authors":"Ying Ma;Xiaoyan Zou;Qizheng Pan;Ming Yan;Guoqi Li","doi":"10.1109/TETCI.2024.3372693","DOIUrl":null,"url":null,"abstract":"In the task of multi-label classification, it is a key challenge to determine the correlation between labels. One solution to this is the Target Embedding Autoencoder (TEA), but most TEA-based frameworks have numerous parameters, large models, and high complexity, which makes it difficult to deal with the problem of large-scale learning. To address this issue, we provide a Target Embedding Autoencoder framework based on Knowledge Distillation (KD-TEA) that compresses a Teacher model with large parameters into a small Student model through knowledge distillation. Specifically, KD-TEA transfers the dark knowledge learned from the Teacher model to the Student model. The dark knowledge can provide effective regularization to alleviate the over-fitting problem in the training process, thereby enhancing the generalization ability of the Student model, and better completing the multi-label task. In order to make the Student model learn the knowledge of the Teacher model directly, we improve the distillation loss: KD-TEA uses MSE loss instead of KL divergence loss to improve the performance of the model in multi-label tasks. Experiments on multiple datasets show that our KD-TEA framework is superior to the most advanced multi-label classification methods in both performance and efficiency.","PeriodicalId":13135,"journal":{"name":"IEEE Transactions on Emerging Topics in Computational Intelligence","volume":"8 3","pages":"2506-2517"},"PeriodicalIF":5.3000,"publicationDate":"2024-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Emerging Topics in Computational Intelligence","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10477613/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0
Abstract
In multi-label classification, a key challenge is modeling the correlations between labels. One solution is the Target Embedding Autoencoder (TEA), but most TEA-based frameworks have numerous parameters, large model sizes, and high complexity, making them difficult to apply to large-scale learning. To address this issue, we propose a Target Embedding Autoencoder framework based on Knowledge Distillation (KD-TEA), which compresses a Teacher model with many parameters into a small Student model through knowledge distillation. Specifically, KD-TEA transfers the dark knowledge learned by the Teacher model to the Student model. This dark knowledge provides effective regularization that alleviates over-fitting during training, thereby enhancing the generalization ability of the Student model and improving its performance on multi-label tasks. To let the Student model learn the Teacher's knowledge more directly, we modify the distillation loss: KD-TEA uses an MSE loss in place of the KL-divergence loss, which improves performance on multi-label tasks. Experiments on multiple datasets show that our KD-TEA framework outperforms state-of-the-art multi-label classification methods in both performance and efficiency.
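To make the loss modification concrete, below is a minimal PyTorch sketch of a distillation objective that replaces the usual temperature-scaled KL-divergence term with an MSE term between the Teacher's and Student's logits, combined with a binary cross-entropy term against the ground-truth labels (the standard supervised objective for multi-label classification). The function name, the weighting factor alpha, and the use of BCE as the supervised term are illustrative assumptions, not the authors' exact KD-TEA implementation.

```python
# Illustrative sketch only -- not the authors' exact KD-TEA code.
# Assumptions (hypothetical): Teacher and Student output raw logits of
# shape (batch, num_labels); `targets` is a float multi-hot tensor;
# `alpha` balances the supervised term against the distillation term.
import torch
import torch.nn.functional as F

def kd_mse_loss(student_logits: torch.Tensor,
                teacher_logits: torch.Tensor,
                targets: torch.Tensor,
                alpha: float = 0.5) -> torch.Tensor:
    """Distillation loss using MSE between logits instead of KL divergence."""
    # Supervised term: binary cross-entropy, the usual multi-label objective.
    bce = F.binary_cross_entropy_with_logits(student_logits, targets)
    # Distillation term: MSE drives the Student to match the Teacher's
    # logits directly, rather than a softened softmax distribution.
    mse = F.mse_loss(student_logits, teacher_logits.detach())
    return alpha * bce + (1.0 - alpha) * mse

# Example training step (teacher is pre-trained and frozen):
# student_logits = student(x)
# with torch.no_grad():
#     teacher_logits = teacher(x)
# loss = kd_mse_loss(student_logits, teacher_logits, y_multi_hot)
# loss.backward()
```

One plausible motivation for this swap: the classic KL-divergence distillation term operates on a softmax over classes, which assumes mutually exclusive labels, whereas in multi-label settings each label is an independent decision, so matching logits directly with MSE sidesteps that mismatch.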
Journal Introduction
The IEEE Transactions on Emerging Topics in Computational Intelligence (TETCI) publishes original articles on emerging aspects of computational intelligence, including theory, applications, and surveys.
TETCI is an electronic-only publication that publishes six issues per year.
Authors are encouraged to submit manuscripts on any emerging topic in computational intelligence, especially nature-inspired computing topics not covered by other IEEE Computational Intelligence Society journals. Illustrative examples include glial cell networks, computational neuroscience, brain-computer interfaces, ambient intelligence, non-fuzzy computing with words, artificial life, cultural learning, artificial endocrine networks, social reasoning, artificial hormone networks, and computational intelligence for the IoT and Smart-X technologies.