{"title":"卷积神经网络的广义熵稀疏化研究。","authors":"Tin Barisin;Illia Horenko","doi":"10.1162/neco.a.21","DOIUrl":null,"url":null,"abstract":"Convolutional neural networks (CNNs) are reported to be overparametrized. The search for optimal (minimal) and sufficient architecture is an NP-hard problem: if the network has N neurons, then there are 2N possibilities to connect them—and therefore 2N possible architectures and 2N Boolean hyperparameters to encode them. Selecting the best possible hyperparameter out of them becomes an Np-hard problem since 2N grows in N faster then any polynomial Np. Here, we introduce a layer-by-layer data-driven pruning method based on the mathematical idea aiming at a computationally scalable entropic relaxation of the pruning problem. The sparse subnetwork is found from the pretrained (full) CNN using the network entropy minimization as a sparsity constraint. This allows deploying a numerically scalable algorithm with a sublinear scaling cost. The method is validated on several benchmarks (architectures): on MNIST (LeNet), resulting in sparsity of 55% to 84% and loss in accuracy of just 0.1% to 0.5%, and on CIFAR-10 (VGG-16, ResNet18), resulting in sparsity of 73% to 89% and loss in accuracy of 0.1% to 0.5%.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"37 9","pages":"1648-1676"},"PeriodicalIF":2.1000,"publicationDate":"2025-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Toward Generalized Entropic Sparsification for Convolutional Neural Networks\",\"authors\":\"Tin Barisin;Illia Horenko\",\"doi\":\"10.1162/neco.a.21\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Convolutional neural networks (CNNs) are reported to be overparametrized. The search for optimal (minimal) and sufficient architecture is an NP-hard problem: if the network has N neurons, then there are 2N possibilities to connect them—and therefore 2N possible architectures and 2N Boolean hyperparameters to encode them. Selecting the best possible hyperparameter out of them becomes an Np-hard problem since 2N grows in N faster then any polynomial Np. Here, we introduce a layer-by-layer data-driven pruning method based on the mathematical idea aiming at a computationally scalable entropic relaxation of the pruning problem. The sparse subnetwork is found from the pretrained (full) CNN using the network entropy minimization as a sparsity constraint. This allows deploying a numerically scalable algorithm with a sublinear scaling cost. 
The method is validated on several benchmarks (architectures): on MNIST (LeNet), resulting in sparsity of 55% to 84% and loss in accuracy of just 0.1% to 0.5%, and on CIFAR-10 (VGG-16, ResNet18), resulting in sparsity of 73% to 89% and loss in accuracy of 0.1% to 0.5%.\",\"PeriodicalId\":54731,\"journal\":{\"name\":\"Neural Computation\",\"volume\":\"37 9\",\"pages\":\"1648-1676\"},\"PeriodicalIF\":2.1000,\"publicationDate\":\"2025-08-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Neural Computation\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/11180102/\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neural Computation","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/11180102/","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Toward Generalized Entropic Sparsification for Convolutional Neural Networks
Abstract: Convolutional neural networks (CNNs) are reported to be overparametrized. The search for an optimal (minimal) and sufficient architecture is an NP-hard problem: if the network has N neurons, then there are 2^N possibilities to connect them, and therefore 2^N possible architectures and 2^N Boolean hyperparameters to encode them. Selecting the best hyperparameter configuration among them is NP-hard, since 2^N grows in N faster than any polynomial N^p. Here, we introduce a layer-by-layer, data-driven pruning method based on a mathematical idea aiming at a computationally scalable entropic relaxation of the pruning problem. The sparse subnetwork is found from the pretrained (full) CNN using network entropy minimization as a sparsity constraint. This allows deploying a numerically scalable algorithm with a sublinear scaling cost. The method is validated on several benchmarks (architectures): on MNIST (LeNet), resulting in sparsity of 55% to 84% with an accuracy loss of just 0.1% to 0.5%, and on CIFAR-10 (VGG-16, ResNet18), resulting in sparsity of 73% to 89% with an accuracy loss of 0.1% to 0.5%.
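The abstract only sketches the approach at a high level; the exact entropic relaxation and the sublinear-cost algorithm are developed in the paper itself. Purely to illustrate the general idea of layer-by-layer, data-driven, entropy-guided pruning of a pretrained CNN, the following is a minimal PyTorch sketch: it scores each convolutional layer's output channels by the entropy of their activations on a small calibration batch and keeps only the highest-entropy fraction. All names here (entropy_channel_scores, build_channel_mask, keep_ratio, n_bins) are hypothetical and this is not the authors' implementation.

```python
# Illustrative sketch only: entropy-guided, layer-by-layer channel selection.
# This is NOT the paper's entropic relaxation; it only mimics the overall workflow
# (pretrained CNN -> per-layer entropy scores -> sparse subnetwork mask).
import torch
import torch.nn as nn


def entropy_channel_scores(activations: torch.Tensor, n_bins: int = 32) -> torch.Tensor:
    """Estimate a discrete (histogram) entropy per output channel.

    activations: tensor of shape (batch, channels, H, W) from one conv layer.
    Returns a (channels,) tensor; low entropy suggests a channel carries little information.
    """
    activations = activations.detach().float().cpu()
    b, c, h, w = activations.shape
    flat = activations.permute(1, 0, 2, 3).reshape(c, -1)   # (channels, batch*H*W)
    scores = torch.empty(c)
    for i in range(c):
        hist = torch.histc(flat[i], bins=n_bins)
        p = hist / hist.sum().clamp(min=1.0)
        p = p[p > 0]
        scores[i] = -(p * p.log()).sum()                     # Shannon entropy of the histogram
    return scores


@torch.no_grad()
def build_channel_mask(model: nn.Module, calib_batch: torch.Tensor, keep_ratio: float = 0.3):
    """Layer by layer: score each Conv2d's output channels on a calibration batch
    and keep only the highest-entropy fraction (keep_ratio)."""
    masks = {}
    activations = {}

    def hook(name):
        def _hook(_module, _inp, out):
            activations[name] = out.detach()
        return _hook

    handles = [m.register_forward_hook(hook(n))
               for n, m in model.named_modules() if isinstance(m, nn.Conv2d)]
    model.eval()
    model(calib_batch)                                       # one forward pass to collect activations
    for h in handles:
        h.remove()

    for name, act in activations.items():
        scores = entropy_channel_scores(act)
        k = max(1, int(keep_ratio * scores.numel()))
        keep = torch.topk(scores, k).indices
        mask = torch.zeros(scores.numel(), dtype=torch.bool)
        mask[keep] = True
        masks[name] = mask                                   # True = keep this output channel
    return masks
```

A hypothetical usage would be something like masks = build_channel_mask(pretrained_lenet, mnist_images[:256], keep_ratio=0.3), after which the masked-out channels (and the corresponding batch-norm parameters) are zeroed or removed and the sparse subnetwork is briefly fine-tuned. The paper's method differs in that the sparsity pattern comes from an entropy-minimization formulation rather than a fixed keep_ratio heuristic.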
Journal introduction:
Neural Computation is uniquely positioned at the crossroads between neuroscience and TMCS and welcomes the submission of original papers from all areas of TMCS, including: Advanced experimental design; Analysis of chemical sensor data; Connectomic reconstructions; Analysis of multielectrode and optical recordings; Genetic data for cell identity; Analysis of behavioral data; Multiscale models; Analysis of molecular mechanisms; Neuroinformatics; Analysis of brain imaging data; Neuromorphic engineering; Principles of neural coding, computation, circuit dynamics, and plasticity; Theories of brain function.