{"title":"Target-Embedding Autoencoder With Knowledge Distillation for Multi-Label Classification","authors":"Ying Ma;Xiaoyan Zou;Qizheng Pan;Ming Yan;Guoqi Li","doi":"10.1109/TETCI.2024.3372693","DOIUrl":null,"url":null,"abstract":"In the task of multi-label classification, it is a key challenge to determine the correlation between labels. One solution to this is the Target Embedding Autoencoder (TEA), but most TEA-based frameworks have numerous parameters, large models, and high complexity, which makes it difficult to deal with the problem of large-scale learning. To address this issue, we provide a Target Embedding Autoencoder framework based on Knowledge Distillation (KD-TEA) that compresses a Teacher model with large parameters into a small Student model through knowledge distillation. Specifically, KD-TEA transfers the dark knowledge learned from the Teacher model to the Student model. The dark knowledge can provide effective regularization to alleviate the over-fitting problem in the training process, thereby enhancing the generalization ability of the Student model, and better completing the multi-label task. In order to make the Student model learn the knowledge of the Teacher model directly, we improve the distillation loss: KD-TEA uses MSE loss instead of KL divergence loss to improve the performance of the model in multi-label tasks. Experiments on multiple datasets show that our KD-TEA framework is superior to the most advanced multi-label classification methods in both performance and efficiency.","PeriodicalId":13135,"journal":{"name":"IEEE Transactions on Emerging Topics in Computational Intelligence","volume":"8 3","pages":"2506-2517"},"PeriodicalIF":5.3000,"publicationDate":"2024-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Emerging Topics in Computational Intelligence","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10477613/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0
Abstract
In multi-label classification, a key challenge is modeling the correlations between labels. One solution is the Target Embedding Autoencoder (TEA), but most TEA-based frameworks have numerous parameters, large model sizes, and high complexity, making them difficult to apply to large-scale learning. To address this issue, we propose a Target Embedding Autoencoder framework based on Knowledge Distillation (KD-TEA), which compresses a Teacher model with many parameters into a small Student model through knowledge distillation. Specifically, KD-TEA transfers the dark knowledge learned by the Teacher model to the Student model. This dark knowledge provides effective regularization that alleviates over-fitting during training, thereby enhancing the generalization ability of the Student model and improving its performance on multi-label tasks. To let the Student model learn the Teacher's knowledge more directly, we modify the distillation loss: KD-TEA uses an MSE loss in place of the KL-divergence loss, which improves performance on multi-label tasks. Experiments on multiple datasets show that our KD-TEA framework outperforms state-of-the-art multi-label classification methods in both performance and efficiency.
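To make the loss modification concrete, below is a minimal PyTorch sketch of a distillation objective that replaces the usual temperature-scaled KL-divergence term with an MSE term between the Teacher's and Student's logits, combined with a binary cross-entropy term against the ground-truth labels (the standard supervised objective for multi-label classification). The function name, the weighting factor alpha, and the use of BCE as the supervised term are illustrative assumptions, not the authors' exact KD-TEA implementation.

```python
# Illustrative sketch only -- not the authors' exact KD-TEA code.
# Assumptions (hypothetical): Teacher and Student output raw logits of
# shape (batch, num_labels); `targets` is a float multi-hot tensor;
# `alpha` balances the supervised term against the distillation term.
import torch
import torch.nn.functional as F

def kd_mse_loss(student_logits: torch.Tensor,
                teacher_logits: torch.Tensor,
                targets: torch.Tensor,
                alpha: float = 0.5) -> torch.Tensor:
    """Distillation loss using MSE between logits instead of KL divergence."""
    # Supervised term: binary cross-entropy, the usual multi-label objective.
    bce = F.binary_cross_entropy_with_logits(student_logits, targets)
    # Distillation term: MSE drives the Student to match the Teacher's
    # logits directly, rather than a softened softmax distribution.
    mse = F.mse_loss(student_logits, teacher_logits.detach())
    return alpha * bce + (1.0 - alpha) * mse

# Example training step (teacher is pre-trained and frozen):
# student_logits = student(x)
# with torch.no_grad():
#     teacher_logits = teacher(x)
# loss = kd_mse_loss(student_logits, teacher_logits, y_multi_hot)
# loss.backward()
```

One plausible motivation for this swap: the classic KL-divergence distillation term operates on a softmax over classes, which assumes mutually exclusive labels, whereas in multi-label settings each label is an independent decision, so matching logits directly with MSE sidesteps that mismatch.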
Journal Introduction
The IEEE Transactions on Emerging Topics in Computational Intelligence (TETCI) publishes original articles on emerging aspects of computational intelligence, including theory, applications, and surveys.
TETCI is an electronic-only publication that publishes six issues per year.
Authors are encouraged to submit manuscripts on any emerging topic in computational intelligence, especially nature-inspired computing topics not covered by other IEEE Computational Intelligence Society journals. Illustrative examples include glial cell networks, computational neuroscience, brain-computer interfaces, ambient intelligence, non-fuzzy computing with words, artificial life, cultural learning, artificial endocrine networks, social reasoning, artificial hormone networks, and computational intelligence for the IoT and Smart-X technologies.