{"title":"Toward Generalized Entropic Sparsification for Convolutional Neural Networks.","authors":"Tin Barisin, Illia Horenko","doi":"10.1162/neco.a.21","DOIUrl":null,"url":null,"abstract":"<p><p>Convolutional neural networks (CNNs) are reported to be overparametrized. The search for optimal (minimal) and sufficient architecture is an NP-hard problem: if the network has $N$ neurons, then there are 2$^{N}$ possibilities to connect them-and therefore 2$^{N}$ possible architectures and 2$^{N}$ Boolean hyperparameters to encode them. Selecting the best possible hyperparameter out of them becomes an $N^{p}$ -hard problem since 2$^{N}$ grows in $N$ faster then any polynomial $N^{p}$. Here, we introduce a layer-by-layer data-driven pruning method based on the mathematical idea aiming at a computationally scalable entropic relaxation of the pruning problem. The sparse subnetwork is found from the pretrained (full) CNN using the network entropy minimization as a sparsity constraint. This allows deploying a numerically scalable algorithm with a sublinear scaling cost. The method is validated on several benchmarks (architectures): on MNIST (LeNet), resulting in sparsity of 55% to 84% and loss in accuracy of just 0.1% to 0.5%, and on CIFAR-10 (VGG-16, ResNet18), resulting in sparsity of 73% to 89% and loss in accuracy of 0.1% to 0.5%.</p>","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":" ","pages":"1-29"},"PeriodicalIF":2.7000,"publicationDate":"2025-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neural Computation","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1162/neco.a.21","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0
Abstract
Convolutional neural networks (CNNs) are reported to be overparametrized. The search for an optimal (minimal) and sufficient architecture is an NP-hard problem: if the network has $N$ neurons, there are $2^N$ ways to connect them, and therefore $2^N$ possible architectures and $2^N$ Boolean hyperparameters to encode them. Selecting the best hyperparameter among them is NP-hard, since $2^N$ grows in $N$ faster than any polynomial $N^p$. Here, we introduce a layer-by-layer, data-driven pruning method based on a mathematical idea aiming at a computationally scalable entropic relaxation of the pruning problem. The sparse subnetwork is found from the pretrained (full) CNN using network entropy minimization as a sparsity constraint. This allows deploying a numerically scalable algorithm with a sublinear scaling cost. The method is validated on several benchmarks (architectures): on MNIST (LeNet), resulting in sparsity of 55% to 84% with a loss in accuracy of just 0.1% to 0.5%, and on CIFAR-10 (VGG-16, ResNet18), resulting in sparsity of 73% to 89% with a loss in accuracy of 0.1% to 0.5%.
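To make the idea of layer-wise, entropy-driven pruning concrete, here is a rough illustrative sketch only, not the authors' algorithm: it scores each unit of one layer by the Shannon entropy of its activation histogram on a calibration set and keeps the highest-scoring units. The function names (entropy_score, prune_mask), the histogram binning, and the keep-ratio heuristic are all assumptions made for this example.

# Illustrative sketch of layer-wise, entropy-scored unit pruning.
# Assumptions: activations of shape (n_samples, n_units) have already been
# collected by running the pretrained network on a calibration set; the
# entropy criterion and keep-ratio rule below are stand-ins, not the
# paper's entropic relaxation.
import numpy as np

def entropy_score(activations: np.ndarray, n_bins: int = 32) -> np.ndarray:
    """Shannon entropy of each unit's activation distribution."""
    n_samples, n_units = activations.shape
    scores = np.empty(n_units)
    for j in range(n_units):
        hist, _ = np.histogram(activations[:, j], bins=n_bins)
        p = hist / max(hist.sum(), 1)   # normalize counts to probabilities
        p = p[p > 0]                    # drop empty bins before the log
        scores[j] = -np.sum(p * np.log(p))
    return scores

def prune_mask(activations: np.ndarray, keep_ratio: float) -> np.ndarray:
    """Boolean mask keeping the most 'informative' units of one layer."""
    scores = entropy_score(activations)
    k = max(1, int(round(keep_ratio * scores.size)))
    keep = np.argsort(scores)[-k:]      # highest-entropy units survive
    mask = np.zeros(scores.size, dtype=bool)
    mask[keep] = True
    return mask

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Fake calibration activations for a layer with 64 units.
    acts = rng.standard_normal((1000, 64)) * rng.uniform(0.01, 2.0, size=64)
    mask = prune_mask(acts, keep_ratio=0.25)
    print(f"kept {mask.sum()} of {mask.size} units")

In a layer-by-layer scheme of this kind, the surviving mask of one layer would be applied to its weights before scoring the next layer, so the pruning decisions propagate through the pretrained network.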
Journal Introduction:
Neural Computation is uniquely positioned at the crossroads between neuroscience and TMCS and welcomes the submission of original papers from all areas of TMCS, including: Advanced experimental design; Analysis of chemical sensor data; Connectomic reconstructions; Analysis of multielectrode and optical recordings; Genetic data for cell identity; Analysis of behavioral data; Multiscale models; Analysis of molecular mechanisms; Neuroinformatics; Analysis of brain imaging data; Neuromorphic engineering; Principles of neural coding, computation, circuit dynamics, and plasticity; Theories of brain function.