VFT: A versatile fine-tuning scheme based on feature distribution-aware knowledge distillation for lightweight convolutional neural networks
Authors: Hyeonseok Hong, Hyun Kim
Journal: Engineering Applications of Artificial Intelligence, Volume 159, Article 111597
Publication date: 2025-07-04
DOI: 10.1016/j.engappai.2025.111597
URL: https://www.sciencedirect.com/science/article/pii/S0952197625015994
Journal metrics: impact factor 8.0; JCR Q1 (Automation & Control Systems)
Network compression techniques such as pruning and quantization are actively researched as a way to lighten convolutional neural networks (CNNs), whose structures have grown deeper and more complex in pursuit of higher accuracy. Because most of these techniques reduce accuracy, fine-tuning is essential for recovering the performance of lightweight models. Fine-tuning has nevertheless received far less research attention than compression itself, so performance recovery by fine-tuning has significant room for improvement. In this paper, we analyze the shortcomings of existing fine-tuning methods in terms of the loss landscape and introduce a knowledge distillation (KD)-based fine-tuning approach that addresses these problems. KD can be adversely affected by the capacity gap between the teacher and student models and by how the knowledge to be transferred is defined. To overcome this limitation, we propose a feature distribution-aware knowledge distillation (FDKD) method, which defines the supervision as a feature distribution in order to transfer semantic information from the teacher model. We also propose a layer-wise FDKD method that exploits a property unique to lightweight models: the baseline (i.e., teacher) and compressed (i.e., student) models share the same architecture. Experiments on classification tasks demonstrate the superiority of the proposed method over existing fine-tuning methods, with accuracy improvements of up to 1.99% for pruned models and 3.83% for quantized models. The source code is available at https://github.com/IDSL-SeoulTech/VFT.
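The abstract does not state the form of the FDKD loss, so the following is only an illustrative sketch of the general idea of layer-wise distribution matching between a teacher and a student that share one architecture. It is not the authors' method: matching per-channel mean and standard deviation is a simple stand-in chosen here, and every name in it (`channel_stats`, `distribution_loss`, `layerwise_loss`) is hypothetical. The repository linked in the abstract is the reference for the actual implementation.

```python
import math
from typing import List, Tuple

# One channel is a flat list of activations; a layer is a list of channels.
Channel = List[float]
Layer = List[Channel]


def channel_stats(values: Channel) -> Tuple[float, float]:
    """Return the mean and (population) standard deviation of one channel."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return mean, math.sqrt(variance)


def distribution_loss(teacher: Layer, student: Layer) -> float:
    """Average squared gap between teacher and student channel statistics.

    ASSUMPTION: moment matching stands in for the paper's feature
    distribution supervision, which the abstract does not specify.
    """
    total = 0.0
    for t_channel, s_channel in zip(teacher, student):
        t_mean, t_std = channel_stats(t_channel)
        s_mean, s_std = channel_stats(s_channel)
        total += (t_mean - s_mean) ** 2 + (t_std - s_std) ** 2
    return total / len(teacher)


def layerwise_loss(teacher_layers: List[Layer], student_layers: List[Layer]) -> float:
    """Average the per-layer loss over matching layers.

    Pairing layer i with layer i is possible only because the teacher and
    the compressed student have the same architecture.
    """
    losses = [distribution_loss(t, s) for t, s in zip(teacher_layers, student_layers)]
    return sum(losses) / len(losses)


# Toy example: one layer with one channel, student shifted by +1.
teacher_feats = [[[0.0, 2.0]]]
student_feats = [[[1.0, 3.0]]]
loss = layerwise_loss(teacher_feats, student_feats)
```

In this toy case the two channels have the same spread but means that differ by 1, so only the mean term contributes to the loss. In a real fine-tuning loop, a term like this would typically be added to the task loss and computed on features taken from intermediate layers of both networks.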
About the journal:
Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.