Pruning Networks Only Using Few-Shot Pretraining Based on Gradient Similarity Frequency

Haigen Hu, Huihuang Zhang, Qianwei Zhou, Tieming Chen
{"title":"Pruning Networks Only Using Few-Shot Pretraining Based on Gradient Similarity Frequency","authors":"Haigen Hu;Huihuang Zhang;Qianwei Zhou;Tieming Chen","doi":"10.1109/TAI.2025.3544582","DOIUrl":null,"url":null,"abstract":"Neural network pruning is a popular and promising approach aiming at reducing heavy networks to lightweight ones by removing redundancies. Most existing methods adopt a three-stage pipeline, including pretraining, pruning, and fine-tuning. However, it is time-consuming to train a large and redundant network in the pretraining process. In this work, we propose a new minimal pretraining pruning method, gradient similarity frequency-based pruning (GSFP), which prunes a given network only using few-shot pretraining before training. Instead of pretraining a fully trained over-parameterized model, our method only uses one epoch to obtain the ranked list of convolution filters to be pruned according to their gradient similarity frequency and determines the redundant convolution filters that should be removed. Then, the obtained sparse network is trained in the standard way without the need to fine-tune the inherited weights from the full model. Finally, a series of experiments are conducted to verify the effectiveness of CIFAR10/100 and ImageNet. The results show that our method can achieve remarkable results on some popular networks, such as VGG, ResNet, and DenseNet. Importantly, the proposed pruning approach never requires pretraining the over-parameterized model, thus offering a promising prospect of application and spreading for limited computational resources.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"6 8","pages":"2253-2265"},"PeriodicalIF":0.0000,"publicationDate":"2025-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on artificial intelligence","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10902134/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Neural network pruning is a popular and promising approach that aims to reduce heavy networks to lightweight ones by removing redundancies. Most existing methods adopt a three-stage pipeline of pretraining, pruning, and fine-tuning. However, training a large, redundant network during the pretraining stage is time-consuming. In this work, we propose a new minimal-pretraining pruning method, gradient similarity frequency-based pruning (GSFP), which prunes a given network using only few-shot pretraining before training. Instead of fully pretraining an over-parameterized model, our method uses only one epoch to rank the convolution filters by their gradient similarity frequency and to determine which redundant filters should be removed. The resulting sparse network is then trained in the standard way, without fine-tuning weights inherited from the full model. Finally, a series of experiments on CIFAR10/100 and ImageNet verifies the effectiveness of the method. The results show that our method achieves remarkable results on popular networks such as VGG, ResNet, and DenseNet. Importantly, the proposed pruning approach never requires pretraining the over-parameterized model, making it attractive for deployment under limited computational resources.
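The abstract does not detail how the gradient similarity frequency is computed. The sketch below illustrates one plausible reading in PyTorch, assuming that, during a single pretraining epoch, each Conv2d filter's gradient is compared by cosine similarity with every other filter in the same layer, and that a filter's frequency counts how often that similarity exceeds a threshold (filters whose gradient direction is frequently duplicated being treated as redundant). The threshold, the per-layer pruning ratio, and the helper names (gradient_similarity_frequency, rank_filters_one_epoch, select_filters_to_prune) are illustrative assumptions, not taken from the paper.

```python
# Illustrative sketch (not the authors' code): rank convolution filters by an
# assumed "gradient similarity frequency" accumulated over one pretraining epoch.
import torch
import torch.nn as nn
import torch.nn.functional as F


def gradient_similarity_frequency(grad: torch.Tensor, threshold: float = 0.7) -> torch.Tensor:
    """For each filter, count how many other filters in the same layer have a
    highly similar gradient direction in the current batch.

    grad: gradient of a Conv2d weight, shape [out_channels, in_channels, k, k].
    Returns a tensor of shape [out_channels], one count per filter.
    """
    flat = grad.flatten(start_dim=1)             # [out_channels, in_channels*k*k]
    flat = F.normalize(flat, dim=1)              # unit-length gradient per filter
    sim = flat @ flat.t()                        # pairwise cosine similarities
    sim.fill_diagonal_(0.0)                      # ignore self-similarity
    return (sim > threshold).sum(dim=1).float()  # frequency of high similarity


def rank_filters_one_epoch(model: nn.Module, loader, criterion, device: str = "cpu"):
    """Accumulate similarity frequencies over one epoch of few-shot pretraining
    and return, per Conv2d layer, filter indices sorted most-redundant first."""
    model.to(device).train()
    freq = {name: torch.zeros(m.out_channels, device=device)
            for name, m in model.named_modules() if isinstance(m, nn.Conv2d)}

    for inputs, targets in loader:
        inputs, targets = inputs.to(device), targets.to(device)
        model.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        for name, m in model.named_modules():
            if isinstance(m, nn.Conv2d) and m.weight.grad is not None:
                freq[name] += gradient_similarity_frequency(m.weight.grad)

    # Higher accumulated frequency -> gradient direction frequently duplicated by
    # other filters -> treated here as redundant and ranked first for pruning.
    return {name: torch.argsort(f, descending=True) for name, f in freq.items()}


def select_filters_to_prune(ranking: dict, prune_ratio: float = 0.5) -> dict:
    """Take the top `prune_ratio` fraction of each layer's ranking as prune candidates."""
    return {name: idx[: int(prune_ratio * len(idx))].tolist() for name, idx in ranking.items()}
```

Consistent with the abstract, the filters selected in this way would then be removed, and the resulting sparse network trained from scratch in the standard way, without inheriting or fine-tuning weights from the full model.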