Enhancement of Knowledge Distillation via Non-Linear Feature Alignment

Impact Factor: 1.0 · JCR Quartile: Q4 (Optics)
Jiangxiao Zhang, Feng Gao, Lina Huo, Hongliang Wang, Ying Dang
{"title":"通过非线性特征对齐加强知识提炼","authors":"Jiangxiao Zhang,&nbsp;Feng Gao,&nbsp;Lina Huo,&nbsp;Hongliang Wang,&nbsp;Ying Dang","doi":"10.3103/S1060992X23040136","DOIUrl":null,"url":null,"abstract":"<p>Deploying AI models on resource-constrained devices is indeed a challenging task. It requires models to have a small parameter while maintaining high performance. Achieving a balance between model size and performance is essential to ensuring the efficient and effective deployment of AI models in such environments. Knowledge distillation (KD) is an important model compression technique that aims to have a small model learn from a larger model by leveraging the high-performance features of the larger model to enhance the performance of the smaller model, ultimately achieving or surpassing the performance of the larger models. This paper presents a pipeline-based knowledge distillation method that improves model performance through non-linear feature alignment (FA) after the feature extraction stage. We conducted experiments on both single-teacher distillation and multi-teacher distillation and through extensive experimentation, we demonstrated that our method can improve the accuracy of knowledge distillation on the existing KD loss function and further improve the performance of small models.</p>","PeriodicalId":721,"journal":{"name":"Optical Memory and Neural Networks","volume":"32 4","pages":"310 - 317"},"PeriodicalIF":1.0000,"publicationDate":"2023-12-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Enhancement of Knowledge Distillation via Non-Linear Feature Alignment\",\"authors\":\"Jiangxiao Zhang,&nbsp;Feng Gao,&nbsp;Lina Huo,&nbsp;Hongliang Wang,&nbsp;Ying Dang\",\"doi\":\"10.3103/S1060992X23040136\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Deploying AI models on resource-constrained devices is indeed a challenging task. It requires models to have a small parameter while maintaining high performance. Achieving a balance between model size and performance is essential to ensuring the efficient and effective deployment of AI models in such environments. Knowledge distillation (KD) is an important model compression technique that aims to have a small model learn from a larger model by leveraging the high-performance features of the larger model to enhance the performance of the smaller model, ultimately achieving or surpassing the performance of the larger models. This paper presents a pipeline-based knowledge distillation method that improves model performance through non-linear feature alignment (FA) after the feature extraction stage. 
We conducted experiments on both single-teacher distillation and multi-teacher distillation and through extensive experimentation, we demonstrated that our method can improve the accuracy of knowledge distillation on the existing KD loss function and further improve the performance of small models.</p>\",\"PeriodicalId\":721,\"journal\":{\"name\":\"Optical Memory and Neural Networks\",\"volume\":\"32 4\",\"pages\":\"310 - 317\"},\"PeriodicalIF\":1.0000,\"publicationDate\":\"2023-12-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Optical Memory and Neural Networks\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://link.springer.com/article/10.3103/S1060992X23040136\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"OPTICS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Optical Memory and Neural Networks","FirstCategoryId":"1085","ListUrlMain":"https://link.springer.com/article/10.3103/S1060992X23040136","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"OPTICS","Score":null,"Total":0}
Citations: 0

Abstract


Deploying AI models on resource-constrained devices is a challenging task: it requires models with a small number of parameters that still maintain high performance. Achieving a balance between model size and performance is essential for the efficient and effective deployment of AI models in such environments. Knowledge distillation (KD) is an important model compression technique in which a small model learns from a larger one, leveraging the high-performing features of the larger model to improve the smaller model so that it ultimately matches or surpasses the larger model's performance. This paper presents a pipeline-based knowledge distillation method that improves model performance through non-linear feature alignment (FA) applied after the feature extraction stage. We conducted extensive experiments on both single-teacher and multi-teacher distillation and demonstrated that our method improves the accuracy of knowledge distillation over existing KD loss functions and further improves the performance of small models.
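The abstract does not include code, so the following is only a minimal PyTorch-style sketch of the general idea it describes: a small non-linear projector maps student features into the teacher's feature space after feature extraction, and the resulting alignment loss is added to a standard Hinton-style KD loss on softened logits. The two-layer MLP projector, the loss weights, the temperature, and the assumption that the models return (features, logits) are all illustrative choices, not the authors' exact configuration.

```python
# Minimal sketch (not the authors' code): knowledge distillation with a
# non-linear feature-alignment (FA) head inserted after feature extraction.
import torch
import torch.nn as nn
import torch.nn.functional as F


class NonLinearFeatureAlign(nn.Module):
    """Projects student features into the teacher's feature space with a
    small non-linear MLP, then penalizes the mismatch (assumed MSE)."""

    def __init__(self, student_dim: int, teacher_dim: int, hidden_dim: int = 256):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(student_dim, hidden_dim),
            nn.ReLU(inplace=True),
            nn.Linear(hidden_dim, teacher_dim),
        )

    def forward(self, f_student: torch.Tensor, f_teacher: torch.Tensor) -> torch.Tensor:
        # MSE between projected student features and detached teacher features.
        return F.mse_loss(self.proj(f_student), f_teacher.detach())


def kd_loss(student_logits: torch.Tensor, teacher_logits: torch.Tensor,
            T: float = 4.0) -> torch.Tensor:
    """Standard Hinton-style KD loss: KL divergence between softened distributions."""
    log_p_s = F.log_softmax(student_logits / T, dim=1)
    p_t = F.softmax(teacher_logits / T, dim=1)
    return F.kl_div(log_p_s, p_t, reduction="batchmean") * (T * T)


def distillation_step(student, teacher, fa_head, images, labels,
                      alpha: float = 0.5, beta: float = 1.0) -> torch.Tensor:
    """One training step: cross-entropy + KD loss + feature-alignment loss.
    Assumes both networks return a (features, logits) pair."""
    with torch.no_grad():
        t_feat, t_logits = teacher(images)
    s_feat, s_logits = student(images)

    return (F.cross_entropy(s_logits, labels)
            + alpha * kd_loss(s_logits, t_logits)
            + beta * fa_head(s_feat, t_feat))
```

For the multi-teacher setting mentioned in the abstract, one common option (again an assumption about the details, not the paper's stated procedure) is to average the teachers' softened logits and features before computing the same losses.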

Source journal: Optical Memory and Neural Networks
CiteScore: 1.50
Self-citation rate: 11.10%
Articles per year: 25
Journal description: The journal covers a wide range of issues in information optics such as optical memory, mechanisms for optical data recording and processing, photosensitive materials, optical, optoelectronic and holographic nanostructures, and many other related topics. Papers on memory systems using holographic and biological structures and concepts of brain operation are also included. The journal pays particular attention to research in the field of neural net systems that may lead to a new generation of computational technologies by endowing them with intelligence.