{"title":"通过非线性特征对齐加强知识提炼","authors":"Jiangxiao Zhang, Feng Gao, Lina Huo, Hongliang Wang, Ying Dang","doi":"10.3103/S1060992X23040136","DOIUrl":null,"url":null,"abstract":"<p>Deploying AI models on resource-constrained devices is indeed a challenging task. It requires models to have a small parameter while maintaining high performance. Achieving a balance between model size and performance is essential to ensuring the efficient and effective deployment of AI models in such environments. Knowledge distillation (KD) is an important model compression technique that aims to have a small model learn from a larger model by leveraging the high-performance features of the larger model to enhance the performance of the smaller model, ultimately achieving or surpassing the performance of the larger models. This paper presents a pipeline-based knowledge distillation method that improves model performance through non-linear feature alignment (FA) after the feature extraction stage. We conducted experiments on both single-teacher distillation and multi-teacher distillation and through extensive experimentation, we demonstrated that our method can improve the accuracy of knowledge distillation on the existing KD loss function and further improve the performance of small models.</p>","PeriodicalId":721,"journal":{"name":"Optical Memory and Neural Networks","volume":"32 4","pages":"310 - 317"},"PeriodicalIF":1.0000,"publicationDate":"2023-12-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Enhancement of Knowledge Distillation via Non-Linear Feature Alignment\",\"authors\":\"Jiangxiao Zhang, Feng Gao, Lina Huo, Hongliang Wang, Ying Dang\",\"doi\":\"10.3103/S1060992X23040136\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Deploying AI models on resource-constrained devices is indeed a challenging task. It requires models to have a small parameter while maintaining high performance. Achieving a balance between model size and performance is essential to ensuring the efficient and effective deployment of AI models in such environments. Knowledge distillation (KD) is an important model compression technique that aims to have a small model learn from a larger model by leveraging the high-performance features of the larger model to enhance the performance of the smaller model, ultimately achieving or surpassing the performance of the larger models. This paper presents a pipeline-based knowledge distillation method that improves model performance through non-linear feature alignment (FA) after the feature extraction stage. 
We conducted experiments on both single-teacher distillation and multi-teacher distillation and through extensive experimentation, we demonstrated that our method can improve the accuracy of knowledge distillation on the existing KD loss function and further improve the performance of small models.</p>\",\"PeriodicalId\":721,\"journal\":{\"name\":\"Optical Memory and Neural Networks\",\"volume\":\"32 4\",\"pages\":\"310 - 317\"},\"PeriodicalIF\":1.0000,\"publicationDate\":\"2023-12-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Optical Memory and Neural Networks\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://link.springer.com/article/10.3103/S1060992X23040136\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"OPTICS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Optical Memory and Neural Networks","FirstCategoryId":"1085","ListUrlMain":"https://link.springer.com/article/10.3103/S1060992X23040136","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"OPTICS","Score":null,"Total":0}
Enhancement of Knowledge Distillation via Non-Linear Feature Alignment
Deploying AI models on resource-constrained devices is a challenging task: it requires models with a small number of parameters that still maintain high performance. Striking this balance between model size and performance is essential for the efficient and effective deployment of AI models in such environments. Knowledge distillation (KD) is an important model compression technique in which a small student model learns from a larger teacher model, leveraging the teacher's high-performance features to enhance the student and ultimately match or even surpass the teacher's performance. This paper presents a pipeline-based knowledge distillation method that improves model performance through non-linear feature alignment (FA) applied after the feature extraction stage. We conducted experiments on both single-teacher and multi-teacher distillation, and the extensive results demonstrate that our method improves the accuracy of knowledge distillation on top of existing KD loss functions and further boosts the performance of small models.
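To make the idea concrete, the sketch below shows one common way a non-linear feature-alignment term can be combined with a standard KD loss in PyTorch. It is a minimal illustration under stated assumptions, not the paper's exact pipeline: the projector architecture, the module and function names, and the hyperparameters (hidden_dim, T, alpha, beta) are hypothetical placeholders.

```python
# Hypothetical sketch of non-linear feature alignment (FA) for knowledge
# distillation, assuming a PyTorch setup. Names and hyperparameters are
# illustrative and are not taken from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class NonLinearFeatureAligner(nn.Module):
    """Maps student features into the teacher's feature space with a small
    non-linear projector so the two can be compared directly."""
    def __init__(self, student_dim: int, teacher_dim: int, hidden_dim: int = 256):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(student_dim, hidden_dim),
            nn.ReLU(inplace=True),  # the non-linearity distinguishes FA from a plain linear map
            nn.Linear(hidden_dim, teacher_dim),
        )

    def forward(self, student_feat: torch.Tensor) -> torch.Tensor:
        return self.proj(student_feat)

def distillation_loss(student_logits, teacher_logits, student_feat, teacher_feat,
                      aligner, labels, T: float = 4.0, alpha: float = 0.5, beta: float = 1.0):
    """Combines cross-entropy, a standard soft-label KD term, and a feature-alignment term."""
    ce = F.cross_entropy(student_logits, labels)
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    fa = F.mse_loss(aligner(student_feat), teacher_feat.detach())
    return ce + alpha * kd + beta * fa

# Minimal usage with random tensors standing in for real network outputs.
if __name__ == "__main__":
    aligner = NonLinearFeatureAligner(student_dim=128, teacher_dim=512)
    s_logits, t_logits = torch.randn(8, 10), torch.randn(8, 10)
    s_feat, t_feat = torch.randn(8, 128), torch.randn(8, 512)
    labels = torch.randint(0, 10, (8,))
    loss = distillation_loss(s_logits, t_logits, s_feat, t_feat, aligner, labels)
    loss.backward()
    print(loss.item())
```

For the multi-teacher setting mentioned in the abstract, one plausible extension (again an assumption, not the authors' formulation) is to instantiate one aligner per teacher and average or weight the resulting alignment and soft-label terms.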
Journal introduction:
The journal covers a wide range of issues in information optics, such as optical memory, mechanisms for optical data recording and processing, photosensitive materials, optical, optoelectronic and holographic nanostructures, and many other related topics. Papers on memory systems using holographic and biological structures and on concepts of brain operation are also included. The journal pays particular attention to research on neural network systems that may lead to a new generation of computational technologies by endowing them with intelligence.