Target-Embedding Autoencoder With Knowledge Distillation for Multi-Label Classification

IF 5.3 · CAS Tier 3 (Computer Science) · JCR Q1 · COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Ying Ma;Xiaoyan Zou;Qizheng Pan;Ming Yan;Guoqi Li
{"title":"针对多标签分类的目标嵌入式自动编码器与知识蒸馏器","authors":"Ying Ma;Xiaoyan Zou;Qizheng Pan;Ming Yan;Guoqi Li","doi":"10.1109/TETCI.2024.3372693","DOIUrl":null,"url":null,"abstract":"In the task of multi-label classification, it is a key challenge to determine the correlation between labels. One solution to this is the Target Embedding Autoencoder (TEA), but most TEA-based frameworks have numerous parameters, large models, and high complexity, which makes it difficult to deal with the problem of large-scale learning. To address this issue, we provide a Target Embedding Autoencoder framework based on Knowledge Distillation (KD-TEA) that compresses a Teacher model with large parameters into a small Student model through knowledge distillation. Specifically, KD-TEA transfers the dark knowledge learned from the Teacher model to the Student model. The dark knowledge can provide effective regularization to alleviate the over-fitting problem in the training process, thereby enhancing the generalization ability of the Student model, and better completing the multi-label task. In order to make the Student model learn the knowledge of the Teacher model directly, we improve the distillation loss: KD-TEA uses MSE loss instead of KL divergence loss to improve the performance of the model in multi-label tasks. Experiments on multiple datasets show that our KD-TEA framework is superior to the most advanced multi-label classification methods in both performance and efficiency.","PeriodicalId":13135,"journal":{"name":"IEEE Transactions on Emerging Topics in Computational Intelligence","volume":"8 3","pages":"2506-2517"},"PeriodicalIF":5.3000,"publicationDate":"2024-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Target-Embedding Autoencoder With Knowledge Distillation for Multi-Label Classification\",\"authors\":\"Ying Ma;Xiaoyan Zou;Qizheng Pan;Ming Yan;Guoqi Li\",\"doi\":\"10.1109/TETCI.2024.3372693\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In the task of multi-label classification, it is a key challenge to determine the correlation between labels. One solution to this is the Target Embedding Autoencoder (TEA), but most TEA-based frameworks have numerous parameters, large models, and high complexity, which makes it difficult to deal with the problem of large-scale learning. To address this issue, we provide a Target Embedding Autoencoder framework based on Knowledge Distillation (KD-TEA) that compresses a Teacher model with large parameters into a small Student model through knowledge distillation. Specifically, KD-TEA transfers the dark knowledge learned from the Teacher model to the Student model. The dark knowledge can provide effective regularization to alleviate the over-fitting problem in the training process, thereby enhancing the generalization ability of the Student model, and better completing the multi-label task. In order to make the Student model learn the knowledge of the Teacher model directly, we improve the distillation loss: KD-TEA uses MSE loss instead of KL divergence loss to improve the performance of the model in multi-label tasks. 
Experiments on multiple datasets show that our KD-TEA framework is superior to the most advanced multi-label classification methods in both performance and efficiency.\",\"PeriodicalId\":13135,\"journal\":{\"name\":\"IEEE Transactions on Emerging Topics in Computational Intelligence\",\"volume\":\"8 3\",\"pages\":\"2506-2517\"},\"PeriodicalIF\":5.3000,\"publicationDate\":\"2024-03-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Emerging Topics in Computational Intelligence\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10477613/\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Emerging Topics in Computational Intelligence","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10477613/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Cited by: 0

Abstract

In multi-label classification, determining the correlations between labels is a key challenge. One solution is the Target Embedding Autoencoder (TEA), but most TEA-based frameworks have numerous parameters, large models, and high complexity, which makes them difficult to apply to large-scale learning. To address this issue, we propose a Target Embedding Autoencoder framework based on Knowledge Distillation (KD-TEA), which compresses a Teacher model with many parameters into a small Student model. Specifically, KD-TEA transfers the dark knowledge learned by the Teacher model to the Student model. This dark knowledge provides effective regularization that alleviates over-fitting during training, thereby enhancing the generalization ability of the Student model on multi-label tasks. To let the Student model learn the Teacher model's knowledge directly, we improve the distillation loss: KD-TEA uses an MSE loss instead of the KL-divergence loss, which improves performance on multi-label tasks. Experiments on multiple datasets show that the KD-TEA framework is superior to state-of-the-art multi-label classification methods in both performance and efficiency.
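
The abstract describes two components: a large target-embedding autoencoder Teacher (labels and features are both mapped into a shared latent space, and a decoder reconstructs the label vector from that latent code) and a compact Student distilled with an MSE term in place of the usual temperature-softened KL-divergence term. The PyTorch sketch below illustrates that structure under those assumptions; all class names, layer sizes, and the alpha weighting are hypothetical illustrations, not the authors' released implementation.

```python
# Minimal sketch of the KD-TEA idea from the abstract. Hypothetical
# names and sizes; not the paper's official code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TeacherTEA(nn.Module):
    """Target-embedding autoencoder Teacher: labels and features are
    encoded into a shared latent space; a decoder reconstructs the
    label vector from that latent code."""
    def __init__(self, n_features, n_labels, latent=64, hidden=512):
        super().__init__()
        # Label encoder is used only during Teacher pre-training (not shown).
        self.label_enc = nn.Sequential(nn.Linear(n_labels, hidden), nn.ReLU(),
                                       nn.Linear(hidden, latent))
        self.feat_enc = nn.Sequential(nn.Linear(n_features, hidden), nn.ReLU(),
                                      nn.Linear(hidden, latent))
        self.decoder = nn.Sequential(nn.Linear(latent, hidden), nn.ReLU(),
                                     nn.Linear(hidden, n_labels))

    def forward(self, x):
        # At inference, only the feature branch feeds the decoder.
        return self.decoder(self.feat_enc(x))

class Student(nn.Module):
    """Compact Student: a small MLP predicting label logits directly."""
    def __init__(self, n_features, n_labels, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n_features, hidden), nn.ReLU(),
                                 nn.Linear(hidden, n_labels))

    def forward(self, x):
        return self.net(x)

def kd_tea_loss(student_logits, teacher_logits, targets, alpha=0.5):
    """Hard multi-label BCE plus an MSE distillation term on raw logits,
    replacing the usual temperature-softened KL term."""
    hard = F.binary_cross_entropy_with_logits(student_logits, targets)
    # Dark-knowledge transfer: regress Student logits onto Teacher logits.
    soft = F.mse_loss(student_logits, teacher_logits.detach())
    return alpha * hard + (1.0 - alpha) * soft

# Toy usage: distil a (pretend pre-trained) frozen Teacher into the Student.
x = torch.randn(32, 100)                   # 32 samples, 100 features
y = torch.randint(0, 2, (32, 10)).float()  # 10 binary labels
teacher, student = TeacherTEA(100, 10), Student(100, 10)
with torch.no_grad():
    t_logits = teacher(x)
loss = kd_tea_loss(student(x), t_logits, y)
loss.backward()
```

One plausible reason the MSE term fits multi-label tasks: label scores come from independent per-label sigmoids rather than a single softmax distribution, so regressing each logit directly onto the Teacher's is more natural than a KL term over a softmax.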
Source journal: IEEE Transactions on Emerging Topics in Computational Intelligence
CiteScore: 10.30 · Self-citation rate: 7.50% · Articles published: 147
Journal description: The IEEE Transactions on Emerging Topics in Computational Intelligence (TETCI) publishes original articles on emerging aspects of computational intelligence, including theory, applications, and surveys. TETCI is an electronic-only publication and publishes six issues per year. Authors are encouraged to submit manuscripts on any emerging topic in computational intelligence, especially nature-inspired computing topics not covered by other IEEE Computational Intelligence Society journals. A few illustrative examples are glial cell networks, computational neuroscience, brain-computer interfaces, ambient intelligence, non-fuzzy computing with words, artificial life, cultural learning, artificial endocrine networks, social reasoning, artificial hormone networks, and computational intelligence for IoT and Smart-X technologies.