Fine Granularity Is Critical for Intelligent Neural Network Pruning
Authors: Alex Heyman; Joel Zylberberg
Journal: Neural Computation, vol. 36, no. 12, pp. 2677-2709
Published: 2024-11-19
DOI: 10.1162/neco_a_01717
URL: https://ieeexplore.ieee.org/document/10810342/
Abstract
Neural network pruning is a popular approach to reducing the computational costs of training and/or deploying a network and aims to do so while minimizing accuracy loss. Pruning methods that remove individual weights (fine granularity) can remove more total network parameters before reaching a given degree of accuracy loss, while methods that preserve some or all of a network’s structure (coarser granularity, such as pruning channels from a CNN) take better advantage of hardware and software optimized for dense matrix computations. We compare intelligent iterative pruning using several different criteria sampled from the literature against random pruning at initialization across multiple granularities on two different architectures and three image classification tasks. Our work is the first direct and comprehensive investigation of the relationship between granularity and the efficacy of intelligent pruning relative to a random-pruning baseline. We find that the accuracy advantage of intelligent over random pruning decreases dramatically as granularity becomes coarser, with minimal advantage for intelligent pruning at granularity coarse enough to fully preserve network structure. For instance, at pruning rates where random pruning leaves ResNet-20 at 85.0% test accuracy on CIFAR-10 after 30,000 training iterations, intelligent weight pruning with the best-in-context criterion leaves it at about 90.0% accuracy (on par with the unpruned network), kernel pruning leaves it at about 86.5%, and channel pruning leaves it at about 85.5%. Our results suggest that compared to coarse pruning, fine pruning combined with efficient implementation of the resulting networks is a more promising direction for easing the trade-off between high accuracy and low computational cost.
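The three granularities named in the abstract (weight, kernel, channel) can be illustrated with a minimal sketch. This is not the authors' method: it assumes a single convolutional layer stored as a flat, row-major list of shape (out_channels, in_channels, kH, kW), treats "channel" as a whole output channel, uses summed absolute magnitude as a stand-in for the "intelligent" criteria, and prunes in one shot rather than iteratively. The names `group_ids`, `prune_mask`, and `kept_magnitude` are hypothetical helpers introduced for this illustration.

```python
import random


def group_ids(shape, granularity):
    """Map each flat weight index to the id of the group it is pruned with."""
    out_c, in_c, kh, kw = shape
    kernel_size = kh * kw              # weights in one 2D kernel
    filter_size = in_c * kernel_size   # weights in one output channel
    if granularity == "weight":
        block = 1
    elif granularity == "kernel":
        block = kernel_size
    elif granularity == "channel":
        block = filter_size
    else:
        raise ValueError(f"unknown granularity: {granularity}")
    return [i // block for i in range(out_c * filter_size)]


def prune_mask(weights, shape, granularity, fraction, criterion="magnitude", seed=0):
    """Return a 0/1 mask that removes `fraction` of the groups.

    criterion="magnitude" drops the groups with the smallest summed |w|
    (an assumed stand-in for an intelligent criterion);
    criterion="random" drops groups uniformly at random.
    """
    ids = group_ids(shape, granularity)
    n_groups = ids[-1] + 1
    if criterion == "magnitude":
        scores = [0.0] * n_groups
        for w, g in zip(weights, ids):
            scores[g] += abs(w)
    elif criterion == "random":
        rng = random.Random(seed)
        scores = [rng.random() for _ in range(n_groups)]
    else:
        raise ValueError(f"unknown criterion: {criterion}")
    n_prune = int(fraction * n_groups)
    lowest_first = sorted(range(n_groups), key=scores.__getitem__)
    pruned = set(lowest_first[:n_prune])
    return [0 if g in pruned else 1 for g in ids]


def kept_magnitude(weights, mask):
    """Total |w| surviving the mask (a crude proxy, not test accuracy)."""
    return sum(abs(w) for w, m in zip(weights, mask) if m)


# Toy layer: 4 output channels, 3 input channels, 3x3 kernels (108 weights).
shape = (4, 3, 3, 3)
rng = random.Random(0)
weights = [rng.gauss(0.0, 1.0) for _ in range(4 * 3 * 3 * 3)]

for granularity in ("weight", "kernel", "channel"):
    smart = prune_mask(weights, shape, granularity, 0.5, "magnitude")
    rand = prune_mask(weights, shape, granularity, 0.5, "random")
    print(granularity, sum(smart),
          round(kept_magnitude(weights, smart), 2),
          round(kept_magnitude(weights, rand), 2))
```

At a pruning fraction of 0.5, every granularity keeps 54 of the 108 weights, but the magnitude criterion retains the most total magnitude at weight granularity and the least at channel granularity, because the coarser groupings restrict which sets of weights can be kept. Retained magnitude is only a proxy here; the accuracy figures quoted in the abstract come from the authors' trained networks, not from a calculation like this one.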
About the journal:
Neural Computation is uniquely positioned at the crossroads between neuroscience and TMCS and welcomes the submission of original papers from all areas of TMCS, including: Advanced experimental design; Analysis of chemical sensor data; Connectomic reconstructions; Analysis of multielectrode and optical recordings; Genetic data for cell identity; Analysis of behavioral data; Multiscale models; Analysis of molecular mechanisms; Neuroinformatics; Analysis of brain imaging data; Neuromorphic engineering; Principles of neural coding, computation, circuit dynamics, and plasticity; Theories of brain function.