{"title":"BFP:基于知识蒸馏的平衡滤波器剪枝,用于cnn在边缘设备上的有效部署","authors":"Haikun Zhang , Yajun Liu","doi":"10.1016/j.neucom.2025.130946","DOIUrl":null,"url":null,"abstract":"<div><div>Model pruning can reduce the computational cost of convolutional neural networks (CNNs), which enables CNNs to be deployed on edge devices with limited computational resources. However, most existing CNN pruning methods rely on a single global indicator to evaluate the importance of filters, ignoring local feature redundancy, which can easily lead to the loss of key information and affect the performance recovery and generalization of the pruning model. In light of this circumstance, a novel balanced filter pruning (BFP) method is proposed in this paper, connecting global measurement with focused attention. First, the method utilizes the BN layer scaling coefficient to perform global channel evaluation and mines local information redundancy through feature map correlation to achieve dynamic balance between structure compression and information preservation. Next, the weighting of the above two indicators is used as a balanced indicator for assessing the importance of the filters, and pruning is performed according to the set pruning rate. Finally, knowledge distillation is used to compensate for the loss of performance caused by the pruning network, which makes the method show better application prospects in scenarios such as edge computing. The effectiveness of the proposed method is validated on two image classification datasets. For example, for the ResNet-50 on the ImageNet dataset, BFP achieves a 59.2 % reduction in float-point-operations (FLOPs) and a 47.1 % reduction in parameters, and the Top-1 accuracy and Top-5 accuracy of the model only lose 0.52 % and 0.35 %, respectively.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"650 ","pages":"Article 130946"},"PeriodicalIF":5.5000,"publicationDate":"2025-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"BFP: Balanced filter pruning via knowledge distillation for efficient deployment of CNNs on edge devices\",\"authors\":\"Haikun Zhang , Yajun Liu\",\"doi\":\"10.1016/j.neucom.2025.130946\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Model pruning can reduce the computational cost of convolutional neural networks (CNNs), which enables CNNs to be deployed on edge devices with limited computational resources. However, most existing CNN pruning methods rely on a single global indicator to evaluate the importance of filters, ignoring local feature redundancy, which can easily lead to the loss of key information and affect the performance recovery and generalization of the pruning model. In light of this circumstance, a novel balanced filter pruning (BFP) method is proposed in this paper, connecting global measurement with focused attention. First, the method utilizes the BN layer scaling coefficient to perform global channel evaluation and mines local information redundancy through feature map correlation to achieve dynamic balance between structure compression and information preservation. Next, the weighting of the above two indicators is used as a balanced indicator for assessing the importance of the filters, and pruning is performed according to the set pruning rate. 
Finally, knowledge distillation is used to compensate for the loss of performance caused by the pruning network, which makes the method show better application prospects in scenarios such as edge computing. The effectiveness of the proposed method is validated on two image classification datasets. For example, for the ResNet-50 on the ImageNet dataset, BFP achieves a 59.2 % reduction in float-point-operations (FLOPs) and a 47.1 % reduction in parameters, and the Top-1 accuracy and Top-5 accuracy of the model only lose 0.52 % and 0.35 %, respectively.</div></div>\",\"PeriodicalId\":19268,\"journal\":{\"name\":\"Neurocomputing\",\"volume\":\"650 \",\"pages\":\"Article 130946\"},\"PeriodicalIF\":5.5000,\"publicationDate\":\"2025-07-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Neurocomputing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0925231225016182\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neurocomputing","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0925231225016182","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
BFP: Balanced filter pruning via knowledge distillation for efficient deployment of CNNs on edge devices
Model pruning can reduce the computational cost of convolutional neural networks (CNNs), enabling their deployment on edge devices with limited computational resources. However, most existing CNN pruning methods rely on a single global indicator to evaluate filter importance and ignore local feature redundancy, which can easily discard key information and hurt the pruned model's performance recovery and generalization. In light of this, a novel balanced filter pruning (BFP) method is proposed in this paper, combining a global importance measure with attention to local redundancy. First, the method uses the batch-normalization (BN) layer scaling coefficients for global channel evaluation and mines local information redundancy through feature-map correlation, achieving a dynamic balance between structural compression and information preservation. Next, a weighted combination of these two indicators serves as the balanced criterion for assessing filter importance, and pruning is performed according to a preset pruning rate. Finally, knowledge distillation compensates for the performance loss caused by pruning, giving the method favorable application prospects in scenarios such as edge computing. The effectiveness of the proposed method is validated on two image classification datasets. For example, for ResNet-50 on the ImageNet dataset, BFP achieves a 59.2% reduction in floating-point operations (FLOPs) and a 47.1% reduction in parameters, while the model's Top-1 and Top-5 accuracies drop by only 0.52% and 0.35%, respectively.
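The abstract names the ingredients of the balanced criterion (BN scaling coefficients, feature-map correlation, a weighted combination, and knowledge distillation) but not their exact formulas. The following is a minimal PyTorch sketch of how such a score might be computed; the normalization, the correlation measure, the weighting coefficient `alpha`, the distillation scheme, and all function names are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: the paper's exact formulas are not given in the
# abstract, so normalization choices, `alpha`, and the correlation measure
# below are assumptions.
import torch
import torch.nn.functional as F

def global_scores(bn: torch.nn.BatchNorm2d) -> torch.Tensor:
    """Global channel importance: magnitude of the BN scaling coefficient
    (gamma), normalized to [0, 1] within the layer."""
    g = bn.weight.detach().abs()
    return g / (g.max() + 1e-12)

def local_redundancy(feats: torch.Tensor) -> torch.Tensor:
    """Local redundancy per channel: mean absolute correlation of a channel's
    feature map with every other channel's, averaged over the batch.
    `feats` has shape (N, C, H, W); higher redundancy -> less important."""
    n, c = feats.shape[:2]
    flat = feats.reshape(n, c, -1)                       # (N, C, H*W)
    flat = flat - flat.mean(dim=2, keepdim=True)
    flat = flat / (flat.norm(dim=2, keepdim=True) + 1e-12)
    corr = torch.einsum('ncx,ndx->ncd', flat, flat).abs()  # (N, C, C)
    off_diag = (corr.sum(dim=2) - 1.0) / (c - 1)         # drop self-correlation
    return off_diag.mean(dim=0)                          # (C,)

def balanced_scores(bn, feats, alpha=0.5) -> torch.Tensor:
    """Weighted combination of the two indicators: a high BN scale and low
    redundancy both argue for keeping the filter."""
    return alpha * global_scores(bn) + (1 - alpha) * (1 - local_redundancy(feats))

def channels_to_prune(scores: torch.Tensor, prune_rate: float) -> torch.Tensor:
    """Indices of the lowest-scoring channels under a preset pruning rate."""
    k = int(prune_rate * scores.numel())
    return torch.argsort(scores)[:k]

def kd_loss(student_logits, teacher_logits, T=4.0):
    """Standard softened-logit distillation loss (Hinton et al.); the abstract
    does not specify the distillation scheme, so this is a common default."""
    p_t = F.softmax(teacher_logits / T, dim=1)
    log_p_s = F.log_softmax(student_logits / T, dim=1)
    return F.kl_div(log_p_s, p_t, reduction='batchmean') * (T * T)
```

Under this reading, pruning keeps the channels whose filters are both strongly scaled by BN and weakly correlated with their peers, and the pruned student is then fine-tuned against the unpruned teacher with the distillation loss.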
Journal Introduction:
Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice, and applications are the essential topics covered.