{"title":"卷积神经网络的广义熵稀疏化研究。","authors":"Tin Barisin, Illia Horenko","doi":"10.1162/neco.a.21","DOIUrl":null,"url":null,"abstract":"<p><p>Convolutional neural networks (CNNs) are reported to be overparametrized. The search for optimal (minimal) and sufficient architecture is an NP-hard problem: if the network has $N$ neurons, then there are 2$^{N}$ possibilities to connect them-and therefore 2$^{N}$ possible architectures and 2$^{N}$ Boolean hyperparameters to encode them. Selecting the best possible hyperparameter out of them becomes an $N^{p}$ -hard problem since 2$^{N}$ grows in $N$ faster then any polynomial $N^{p}$. Here, we introduce a layer-by-layer data-driven pruning method based on the mathematical idea aiming at a computationally scalable entropic relaxation of the pruning problem. The sparse subnetwork is found from the pretrained (full) CNN using the network entropy minimization as a sparsity constraint. This allows deploying a numerically scalable algorithm with a sublinear scaling cost. The method is validated on several benchmarks (architectures): on MNIST (LeNet), resulting in sparsity of 55% to 84% and loss in accuracy of just 0.1% to 0.5%, and on CIFAR-10 (VGG-16, ResNet18), resulting in sparsity of 73% to 89% and loss in accuracy of 0.1% to 0.5%.</p>","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":" ","pages":"1-29"},"PeriodicalIF":2.1000,"publicationDate":"2025-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Toward Generalized Entropic Sparsification for Convolutional Neural Networks.\",\"authors\":\"Tin Barisin, Illia Horenko\",\"doi\":\"10.1162/neco.a.21\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Convolutional neural networks (CNNs) are reported to be overparametrized. The search for optimal (minimal) and sufficient architecture is an NP-hard problem: if the network has $N$ neurons, then there are 2$^{N}$ possibilities to connect them-and therefore 2$^{N}$ possible architectures and 2$^{N}$ Boolean hyperparameters to encode them. Selecting the best possible hyperparameter out of them becomes an $N^{p}$ -hard problem since 2$^{N}$ grows in $N$ faster then any polynomial $N^{p}$. Here, we introduce a layer-by-layer data-driven pruning method based on the mathematical idea aiming at a computationally scalable entropic relaxation of the pruning problem. The sparse subnetwork is found from the pretrained (full) CNN using the network entropy minimization as a sparsity constraint. This allows deploying a numerically scalable algorithm with a sublinear scaling cost. 
The method is validated on several benchmarks (architectures): on MNIST (LeNet), resulting in sparsity of 55% to 84% and loss in accuracy of just 0.1% to 0.5%, and on CIFAR-10 (VGG-16, ResNet18), resulting in sparsity of 73% to 89% and loss in accuracy of 0.1% to 0.5%.</p>\",\"PeriodicalId\":54731,\"journal\":{\"name\":\"Neural Computation\",\"volume\":\" \",\"pages\":\"1-29\"},\"PeriodicalIF\":2.1000,\"publicationDate\":\"2025-07-24\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Neural Computation\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1162/neco.a.21\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neural Computation","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1162/neco.a.21","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Toward Generalized Entropic Sparsification for Convolutional Neural Networks.
Convolutional neural networks (CNNs) are reported to be overparametrized. The search for an optimal (minimal) and sufficient architecture is an NP-hard problem: if the network has $N$ neurons, then there are $2^N$ possibilities to connect them, and therefore $2^N$ possible architectures and $2^N$ Boolean hyperparameters to encode them. Selecting the best hyperparameter configuration among them is NP-hard, since $2^N$ grows in $N$ faster than any polynomial $N^p$. Here, we introduce a layer-by-layer, data-driven pruning method based on a mathematical idea aiming at a computationally scalable entropic relaxation of the pruning problem. The sparse subnetwork is found from the pretrained (full) CNN using network entropy minimization as a sparsity constraint. This allows deploying a numerically scalable algorithm with a sublinear scaling cost. The method is validated on several benchmarks (architectures): on MNIST (LeNet), resulting in sparsity of 55% to 84% with a loss in accuracy of just 0.1% to 0.5%, and on CIFAR-10 (VGG-16, ResNet18), resulting in sparsity of 73% to 89% with a loss in accuracy of 0.1% to 0.5%.
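Because the abstract describes the method only at a high level, the sketch below is a minimal, hypothetical illustration of layer-by-layer channel pruning driven by an entropy-style importance criterion; it is not the authors' algorithm. The `channel_importance`, `entropy`, and `prune_layer` helpers, the L1-norm importance scores, and the `keep_mass` threshold are all assumptions introduced purely for illustration.

```python
# Hypothetical sketch of entropy-guided, layer-wise channel pruning.
# NOT the authors' method: it only illustrates ranking channels by a
# probability-like importance distribution and keeping the smallest
# subset that retains most of that distribution's mass.
import numpy as np


def channel_importance(weights: np.ndarray) -> np.ndarray:
    """Normalize per-channel L1 norms into a probability-like importance vector.

    weights: assumed conv-kernel layout (out_channels, in_channels, k, k).
    """
    scores = np.abs(weights).sum(axis=(1, 2, 3))
    return scores / scores.sum()


def entropy(p: np.ndarray) -> float:
    """Shannon entropy of an importance distribution (zeros ignored)."""
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())


def prune_layer(weights: np.ndarray, keep_mass: float = 0.95) -> np.ndarray:
    """Return a boolean mask of output channels to keep for one layer.

    Channels are sorted by importance; the smallest set whose cumulative
    importance reaches `keep_mass` is retained. A low-entropy importance
    distribution (mass concentrated on few channels) prunes aggressively.
    """
    p = channel_importance(weights)
    order = np.argsort(p)[::-1]          # most important channels first
    cum = np.cumsum(p[order])
    n_keep = int(np.searchsorted(cum, keep_mass) + 1)
    mask = np.zeros(p.shape[0], dtype=bool)
    mask[order[:n_keep]] = True
    return mask


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Fake pretrained layer: 64 output channels with uneven magnitudes.
    w = rng.normal(size=(64, 32, 3, 3)) * rng.exponential(size=(64, 1, 1, 1))
    mask = prune_layer(w, keep_mass=0.9)
    print(f"entropy of importance: {entropy(channel_importance(w)):.3f}")
    print(f"kept {mask.sum()} / {mask.size} channels")
```

Applied independently to each layer of a pretrained network, such a mask would yield a sparse subnetwork in one pass, which is in the spirit of the layer-by-layer, post-training approach the abstract describes; the paper itself should be consulted for the actual entropic formulation.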
Journal Introduction:
Neural Computation is uniquely positioned at the crossroads between neuroscience and TMCS and welcomes the submission of original papers from all areas of TMCS, including: Advanced experimental design; Analysis of chemical sensor data; Connectomic reconstructions; Analysis of multielectrode and optical recordings; Genetic data for cell identity; Analysis of behavioral data; Multiscale models; Analysis of molecular mechanisms; Neuroinformatics; Analysis of brain imaging data; Neuromorphic engineering; Principles of neural coding, computation, circuit dynamics, and plasticity; Theories of brain function.