{"title":"BFP:基于知识蒸馏的平衡滤波器剪枝,用于cnn在边缘设备上的有效部署","authors":"Haikun Zhang , Yajun Liu","doi":"10.1016/j.neucom.2025.130946","DOIUrl":null,"url":null,"abstract":"<div><div>Model pruning can reduce the computational cost of convolutional neural networks (CNNs), which enables CNNs to be deployed on edge devices with limited computational resources. However, most existing CNN pruning methods rely on a single global indicator to evaluate the importance of filters, ignoring local feature redundancy, which can easily lead to the loss of key information and affect the performance recovery and generalization of the pruning model. In light of this circumstance, a novel balanced filter pruning (BFP) method is proposed in this paper, connecting global measurement with focused attention. First, the method utilizes the BN layer scaling coefficient to perform global channel evaluation and mines local information redundancy through feature map correlation to achieve dynamic balance between structure compression and information preservation. Next, the weighting of the above two indicators is used as a balanced indicator for assessing the importance of the filters, and pruning is performed according to the set pruning rate. Finally, knowledge distillation is used to compensate for the loss of performance caused by the pruning network, which makes the method show better application prospects in scenarios such as edge computing. The effectiveness of the proposed method is validated on two image classification datasets. For example, for the ResNet-50 on the ImageNet dataset, BFP achieves a 59.2 % reduction in float-point-operations (FLOPs) and a 47.1 % reduction in parameters, and the Top-1 accuracy and Top-5 accuracy of the model only lose 0.52 % and 0.35 %, respectively.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"650 ","pages":"Article 130946"},"PeriodicalIF":5.5000,"publicationDate":"2025-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"BFP: Balanced filter pruning via knowledge distillation for efficient deployment of CNNs on edge devices\",\"authors\":\"Haikun Zhang , Yajun Liu\",\"doi\":\"10.1016/j.neucom.2025.130946\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Model pruning can reduce the computational cost of convolutional neural networks (CNNs), which enables CNNs to be deployed on edge devices with limited computational resources. However, most existing CNN pruning methods rely on a single global indicator to evaluate the importance of filters, ignoring local feature redundancy, which can easily lead to the loss of key information and affect the performance recovery and generalization of the pruning model. In light of this circumstance, a novel balanced filter pruning (BFP) method is proposed in this paper, connecting global measurement with focused attention. First, the method utilizes the BN layer scaling coefficient to perform global channel evaluation and mines local information redundancy through feature map correlation to achieve dynamic balance between structure compression and information preservation. Next, the weighting of the above two indicators is used as a balanced indicator for assessing the importance of the filters, and pruning is performed according to the set pruning rate. 
Finally, knowledge distillation is used to compensate for the loss of performance caused by the pruning network, which makes the method show better application prospects in scenarios such as edge computing. The effectiveness of the proposed method is validated on two image classification datasets. For example, for the ResNet-50 on the ImageNet dataset, BFP achieves a 59.2 % reduction in float-point-operations (FLOPs) and a 47.1 % reduction in parameters, and the Top-1 accuracy and Top-5 accuracy of the model only lose 0.52 % and 0.35 %, respectively.</div></div>\",\"PeriodicalId\":19268,\"journal\":{\"name\":\"Neurocomputing\",\"volume\":\"650 \",\"pages\":\"Article 130946\"},\"PeriodicalIF\":5.5000,\"publicationDate\":\"2025-07-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Neurocomputing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0925231225016182\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neurocomputing","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0925231225016182","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
BFP: Balanced filter pruning via knowledge distillation for efficient deployment of CNNs on edge devices
Model pruning can reduce the computational cost of convolutional neural networks (CNNs), enabling their deployment on edge devices with limited computational resources. However, most existing CNN pruning methods rely on a single global indicator to evaluate filter importance and ignore local feature redundancy, which can easily discard key information and hurt the pruned model's performance recovery and generalization. In light of this, a novel balanced filter pruning (BFP) method is proposed in this paper, combining a global importance measure with attention to local redundancy. First, the method uses the batch-normalization (BN) layer scaling coefficients for global channel evaluation and mines local information redundancy through feature-map correlation, achieving a dynamic balance between structural compression and information preservation. Next, a weighted combination of these two indicators serves as the balanced criterion for assessing filter importance, and pruning is performed according to a preset pruning rate. Finally, knowledge distillation compensates for the performance loss caused by pruning, giving the method favorable application prospects in scenarios such as edge computing. The effectiveness of the proposed method is validated on two image classification datasets. For example, for ResNet-50 on the ImageNet dataset, BFP achieves a 59.2% reduction in floating-point operations (FLOPs) and a 47.1% reduction in parameters, while the model's Top-1 and Top-5 accuracies drop by only 0.52% and 0.35%, respectively.
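The abstract names the ingredients of the balanced criterion (BN scaling coefficients, feature-map correlation, a weighted combination, and knowledge distillation) but not their exact formulas. The following is a minimal PyTorch sketch of how such a score might be computed; the normalization, the correlation measure, the weighting coefficient `alpha`, the distillation scheme, and all function names are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: the paper's exact formulas are not given in the
# abstract, so normalization choices, `alpha`, and the correlation measure
# below are assumptions.
import torch
import torch.nn.functional as F

def global_scores(bn: torch.nn.BatchNorm2d) -> torch.Tensor:
    """Global channel importance: magnitude of the BN scaling coefficient
    (gamma), normalized to [0, 1] within the layer."""
    g = bn.weight.detach().abs()
    return g / (g.max() + 1e-12)

def local_redundancy(feats: torch.Tensor) -> torch.Tensor:
    """Local redundancy per channel: mean absolute correlation of a channel's
    feature map with every other channel's, averaged over the batch.
    `feats` has shape (N, C, H, W); higher redundancy -> less important."""
    n, c = feats.shape[:2]
    flat = feats.reshape(n, c, -1)                       # (N, C, H*W)
    flat = flat - flat.mean(dim=2, keepdim=True)
    flat = flat / (flat.norm(dim=2, keepdim=True) + 1e-12)
    corr = torch.einsum('ncx,ndx->ncd', flat, flat).abs()  # (N, C, C)
    off_diag = (corr.sum(dim=2) - 1.0) / (c - 1)         # drop self-correlation
    return off_diag.mean(dim=0)                          # (C,)

def balanced_scores(bn, feats, alpha=0.5) -> torch.Tensor:
    """Weighted combination of the two indicators: a high BN scale and low
    redundancy both argue for keeping the filter."""
    return alpha * global_scores(bn) + (1 - alpha) * (1 - local_redundancy(feats))

def channels_to_prune(scores: torch.Tensor, prune_rate: float) -> torch.Tensor:
    """Indices of the lowest-scoring channels under a preset pruning rate."""
    k = int(prune_rate * scores.numel())
    return torch.argsort(scores)[:k]

def kd_loss(student_logits, teacher_logits, T=4.0):
    """Standard softened-logit distillation loss (Hinton et al.); the abstract
    does not specify the distillation scheme, so this is a common default."""
    p_t = F.softmax(teacher_logits / T, dim=1)
    log_p_s = F.log_softmax(student_logits / T, dim=1)
    return F.kl_div(log_p_s, p_t, reduction='batchmean') * (T * T)
```

Under this reading, pruning keeps the channels whose filters are both strongly scaled by BN and weakly correlated with their peers, and the pruned student is then fine-tuned against the unpruned teacher with the distillation loss.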
Journal Introduction:
Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice, and applications are the essential topics covered.