Growing-before-pruning: A progressive neural architecture search strategy via group sparsity and deterministic annealing

Impact Factor: 7.5 · CAS Region 1 (Computer Science) · JCR Q1, Computer Science, Artificial Intelligence
Xiaotong Lu, Weisheng Dong, Zhenxuan Fang, Jie Lin, Xin Li, Guangming Shi
{"title":"Growing-before-pruning: A progressive neural architecture search strategy via group sparsity and deterministic annealing","authors":"Xiaotong Lu ,&nbsp;Weisheng Dong ,&nbsp;Zhenxuan Fang ,&nbsp;Jie Lin ,&nbsp;Xin Li ,&nbsp;Guangming Shi","doi":"10.1016/j.patcog.2025.111697","DOIUrl":null,"url":null,"abstract":"<div><div>Network pruning is a widely studied technique of obtaining compact representations from over-parameterized deep convolutional neural networks. Existing pruning methods are based on finding an optimal combination of pruned filters in the fixed search space. However, the optimality of those methods is often questionable due to limited search space and pruning choices - e.g., the difficulty with removing the entire layer and the risk of unexpected performance degradation. Inspired by the exploration vs. exploitation trade-off in reinforcement learning, we propose to reconstruct the filter space without increasing the model capacity and prune them by exploiting group sparsity. Our approach challenges the conventional wisdom by advocating the strategy of Growing-before-Pruning (GbP), which allows us to explore more space before exploiting the power of architecture search. Meanwhile, to achieve more efficient pruning, we propose to measure the importance of filters by global group sparsity, which extends the existing Gaussian scale mixture model. Such global characterization of sparsity in the filter space leads to a novel deterministic annealing strategy for progressively pruning the filters. We have evaluated our method on several popular datasets and network architectures. Our extensive experiment results have shown that the proposed method advances the current state-of-the-art.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"166 ","pages":"Article 111697"},"PeriodicalIF":7.5000,"publicationDate":"2025-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Pattern Recognition","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0031320325003577","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

Network pruning is a widely studied technique for obtaining compact representations from over-parameterized deep convolutional neural networks. Existing pruning methods are based on finding an optimal combination of pruned filters in a fixed search space. However, the optimality of those methods is often questionable due to the limited search space and pruning choices, e.g., the difficulty of removing an entire layer and the risk of unexpected performance degradation. Inspired by the exploration vs. exploitation trade-off in reinforcement learning, we propose to reconstruct the filter space without increasing the model capacity and to prune filters by exploiting group sparsity. Our approach challenges the conventional wisdom by advocating a strategy of Growing-before-Pruning (GbP), which allows us to explore more space before exploiting the power of architecture search. Meanwhile, to achieve more efficient pruning, we propose to measure the importance of filters by global group sparsity, which extends the existing Gaussian scale mixture model. Such global characterization of sparsity in the filter space leads to a novel deterministic annealing strategy for progressively pruning the filters. We have evaluated our method on several popular datasets and network architectures. Our extensive experimental results show that the proposed method advances the current state of the art.
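To make the pruning criterion described in the abstract more concrete, below is a minimal PyTorch sketch of filter importance measured by group sparsity combined with a deterministic annealing schedule. It is an illustrative reconstruction under assumptions rather than the authors' implementation: the helper names (filter_group_norms, annealed_threshold, prune_conv_filters) and the schedule constants are hypothetical, and the actual paper uses a Gaussian scale mixture formulation not reproduced here.

```python
import torch
import torch.nn as nn


def filter_group_norms(conv: nn.Conv2d) -> torch.Tensor:
    """Group-sparsity score per output filter: the L2 norm of each filter's weights."""
    # conv.weight has shape (out_channels, in_channels, kH, kW)
    return conv.weight.detach().flatten(1).norm(p=2, dim=1)


def annealed_threshold(step: int, t0: float = 0.5, alpha: float = 0.9) -> float:
    """Deterministic annealing schedule (assumed form): the effective pruning
    threshold rises as the temperature T = t0 * alpha**step decays, so pruning
    becomes progressively more aggressive."""
    temperature = t0 * alpha ** step
    return 1.0 - temperature  # in (0, 1); scaled by a per-layer statistic below


def prune_conv_filters(conv: nn.Conv2d, threshold: float) -> torch.Tensor:
    """Zero out filters whose group norm falls below the threshold; return the keep mask."""
    scores = filter_group_norms(conv)
    keep = scores >= threshold
    with torch.no_grad():
        conv.weight[~keep] = 0.0
        if conv.bias is not None:
            conv.bias[~keep] = 0.0
    return keep


# Usage sketch: progressively prune one layer over several annealing steps,
# with fine-tuning between steps (fine-tuning loop omitted).
conv = nn.Conv2d(64, 128, kernel_size=3, padding=1)
for step in range(10):
    thr = annealed_threshold(step) * filter_group_norms(conv).median().item()
    kept = prune_conv_filters(conv, thr)
```

In this sketch the annealed threshold is scaled by the median filter norm of the layer so that early steps prune only clearly weak filters, mirroring the progressive nature of the deterministic annealing strategy described above.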
Source journal
Pattern Recognition
Category: Engineering & Technology (Electrical & Electronic Engineering)
CiteScore: 14.40
Self-citation rate: 16.20%
Articles published: 683
Average review time: 5.6 months
Journal description: The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.