Growing-before-pruning: A progressive neural architecture search strategy via group sparsity and deterministic annealing

Xiaotong Lu, Weisheng Dong, Zhenxuan Fang, Jie Lin, Xin Li, Guangming Shi

Pattern Recognition, Volume 166, Article 111697 (2025). DOI: 10.1016/j.patcog.2025.111697
Abstract
Network pruning is a widely studied technique for obtaining compact representations from over-parameterized deep convolutional neural networks. Existing pruning methods search for an optimal combination of pruned filters within a fixed search space. However, the optimality of those methods is often questionable due to the limited search space and pruning choices, e.g., the difficulty of removing an entire layer and the risk of unexpected performance degradation. Inspired by the exploration vs. exploitation trade-off in reinforcement learning, we propose to reconstruct the filter space without increasing the model capacity and to prune the filters by exploiting group sparsity. Our approach challenges the conventional wisdom by advocating a Growing-before-Pruning (GbP) strategy, which allows us to explore more of the space before exploiting the power of architecture search. Meanwhile, to achieve more efficient pruning, we propose to measure the importance of filters by global group sparsity, which extends the existing Gaussian scale mixture model. This global characterization of sparsity in the filter space leads to a novel deterministic annealing strategy for progressively pruning the filters. We have evaluated our method on several popular datasets and network architectures, and extensive experimental results show that the proposed method advances the current state of the art.
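To make the described pipeline concrete, below is a minimal PyTorch sketch of the Growing-before-Pruning loop as the abstract presents it: filters in a grown (over-complete) layer are scored by a per-filter group norm, and the fraction of filters kept shrinks along a deterministic annealing schedule. The L2 group-sparsity score, the linear schedule, and the helper names (`group_sparsity_scores`, `annealed_keep_ratio`, `prune_step`) are illustrative assumptions for this example, not the paper's exact formulation, which builds on a Gaussian scale mixture model.

```python
# Illustrative sketch only: the scoring rule, schedule, and helper names are
# assumptions made for this example, not the paper's exact method.
import torch
import torch.nn as nn


def group_sparsity_scores(conv: nn.Conv2d) -> torch.Tensor:
    """Score each output filter by the L2 norm of its weight group."""
    w = conv.weight.detach()              # shape: (out, in, kH, kW)
    return w.flatten(1).norm(p=2, dim=1)  # one score per filter


def annealed_keep_ratio(step: int, total_steps: int,
                        start: float = 1.0, end: float = 0.5) -> float:
    """Deterministic annealing: smoothly shrink the fraction of filters kept."""
    t = min(step / total_steps, 1.0)
    return start + (end - start) * t


def prune_step(conv: nn.Conv2d, num_keep: int) -> nn.Conv2d:
    """Keep the num_keep highest-scoring filters; return a smaller Conv2d."""
    keep = group_sparsity_scores(conv).topk(num_keep).indices.sort().values
    pruned = nn.Conv2d(conv.in_channels, num_keep, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    pruned.weight.data = conv.weight.data[keep].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[keep].clone()
    return pruned


# "Grow" a layer into an over-complete filter space, then prune it back
# progressively as the annealing schedule tightens.
conv = nn.Conv2d(16, 64, kernel_size=3, padding=1)  # grown layer
grown_filters = conv.out_channels
for step in range(1, 6):
    ratio = annealed_keep_ratio(step, total_steps=5)
    conv = prune_step(conv, max(1, int(ratio * grown_filters)))
    print(f"step {step}: keep_ratio={ratio:.2f}, filters={conv.out_channels}")
```

In this sketch the target filter count is always computed from the grown layer's original width, so the schedule, not the compounding of successive prunes, controls how aggressively capacity is removed at each step.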
About the Journal:
The field of pattern recognition is both mature and rapidly evolving, playing a crucial role in related areas such as computer vision, image processing, text analysis, and neural networks. It intersects closely with machine learning and is being applied in emerging areas such as biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.