Decoupled knowledge distillation method based on meta-learning

Wenqing Du, Liting Geng, Jianxiong Liu, Zhigang Zhao, Chunxiao Wang, Jidong Huo

High-Confidence Computing, Volume 4, Issue 1, Article 100164 (26 September 2023). DOI: 10.1016/j.hcc.2023.100164
With the advancement of deep learning techniques, the number of model parameters has been increasing, leading to significant memory consumption and limiting the deployment of such models in real-time applications. To reduce the number of model parameters and enhance the generalization capability of neural networks, we propose Decoupled MetaDistil, a decoupled meta-distillation method. This method uses meta-learning to guide the teacher model, dynamically adjusting the knowledge transfer strategy based on feedback from the student model and thereby improving generalization. Furthermore, we introduce a decoupled loss to explicitly transfer positive-sample knowledge and explore the potential of negative-sample knowledge. Extensive experiments demonstrate the effectiveness of our method.
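The abstract does not spell out the decoupled loss, but separating "positive sample" (target-class) knowledge from "negative sample" (non-target-class) knowledge closely mirrors the target-class / non-target-class split used in decoupled knowledge distillation (DKD). The PyTorch sketch below is a minimal illustration under that assumption: the function names, the temperature `T`, and the weights `alpha`/`beta` are placeholders rather than values from the paper, and the meta-learning teacher update driven by student feedback is only described in a comment, not implemented.

```python
import torch
import torch.nn.functional as F


def decoupled_kd_loss(student_logits, teacher_logits, target, T=4.0, alpha=1.0, beta=8.0):
    """DKD-style decoupled distillation loss (illustrative, not the paper's exact form).

    The classic KD loss is split into a target-class term ("positive sample"
    knowledge) and a non-target-class term ("negative sample" knowledge) so the
    two can be weighted independently via alpha and beta.
    """
    gt_mask = F.one_hot(target, num_classes=student_logits.size(1)).bool()

    s_prob = F.softmax(student_logits / T, dim=1)
    t_prob = F.softmax(teacher_logits / T, dim=1)

    # Target-class term: KL between the binary distributions
    # [p(target), p(all other classes)] of teacher and student.
    s_bin = torch.stack([(s_prob * gt_mask).sum(1), (s_prob * ~gt_mask).sum(1)], dim=1)
    t_bin = torch.stack([(t_prob * gt_mask).sum(1), (t_prob * ~gt_mask).sum(1)], dim=1)
    tckd = F.kl_div(torch.log(s_bin + 1e-8), t_bin, reduction="batchmean")

    # Non-target-class term: KL between distributions re-normalised over the
    # non-target classes (the target logit is masked out before the softmax).
    s_other = F.log_softmax(student_logits / T - 1000.0 * gt_mask, dim=1)
    t_other = F.softmax(teacher_logits / T - 1000.0 * gt_mask, dim=1)
    nckd = F.kl_div(s_other, t_other, reduction="batchmean")

    return (alpha * tckd + beta * nckd) * (T ** 2)


def student_step(student, teacher, optimizer, x, y):
    """One hypothetical student update combining cross-entropy and the decoupled loss.

    In the paper's meta-learning setup the teacher would additionally be updated
    from feedback on the student's performance (e.g. the loss of a pilot-updated
    student on a held-out quiz batch); that outer loop is not shown here.
    """
    with torch.no_grad():
        t_logits = teacher(x)
    s_logits = student(x)
    loss = F.cross_entropy(s_logits, y) + decoupled_kd_loss(s_logits, t_logits, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Keeping the target-class and non-target-class terms separate is what allows the weight on the non-target-class ("negative sample") term to be raised independently of the target-class term, which is the usual motivation for decoupling the distillation loss.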