Compressing Convolutional Neural Networks by L0 Regularization

2019 International Conference on Control, Artificial Intelligence, Robotics & Optimization (ICCAIRO) Pub Date: 2019-05-01 DOI: 10.1109/ICCAIRO47923.2019.00032

András Formanek, D. Hadhazi

Citations: 1

Abstract

Convolutional Neural Networks have recently come to dominate the field of image processing because they can handle complex, non-algorithmic problems with state-of-the-art results in terms of both precision and inference time. However, there are many environments (e.g., cell phones, IoT devices, embedded systems) and use cases (e.g., pedestrian detection in autonomous driving assistance systems) where hard real-time requirements can only be satisfied by using computational resources efficiently. The general trend is to train larger and more complex networks in order to achieve better accuracy, and to force these networks to be redundant in order to improve their generalization ability. However, this produces networks that cannot be used in such scenarios. Pruning methods try to solve this problem by reducing the size of trained neural networks. These methods eliminate redundant computations after training, which usually causes a large drop in accuracy. In this paper, we propose new regularization techniques that induce sparsity in the parameters during training, so that the network can be pruned efficiently. From this viewpoint, we analyse and compare the effect of minimizing different norms of the weights (L1, L0), applied both to individual weights and to groups of weights (kernels and channels). L1 regularization can be optimized by Gradient Descent, but this is not true for L0. The paper proposes a combination of Proximal Gradient Descent optimization and the RMSProp method to solve the resulting optimization problem. Our results demonstrate that the proposed L0 minimization-based regularization methods outperform the L1-based ones, both in the sparsity of the resulting weight matrices and in the accuracy of the pruned network. Additionally, we demonstrate that the accuracy of deep neural networks can also be increased by using the proposed sparsifying regularizations.
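The abstract names the ingredients (L1 and L0 penalties, group penalties over kernels and channels, Proximal Gradient Descent, RMSProp) but not how they are combined. The sketch below is one plausible reading, not the authors' implementation: it shows the standard proximal operators (soft thresholding for L1, hard thresholding for L0, and a group version of the latter) and a single update in which the RMSProp per-weight step size is also used as the step size of the L0 proximal step. All function names, parameter names, and default values are hypothetical, and plain Python lists are used only to keep the example dependency-free.

```python
import math


def prox_l1(w, threshold):
    """Soft thresholding: proximal operator of threshold * |w|."""
    return math.copysign(max(abs(w) - threshold, 0.0), w)


def prox_l0(w, lam, step):
    """Hard thresholding: proximal operator of lam * [w != 0].

    Minimizes (x - w)**2 / (2 * step) + lam * [x != 0] over x.
    Keeping w costs lam, zeroing it costs w**2 / (2 * step),
    so w survives only if w**2 > 2 * step * lam.
    """
    return w if w * w > 2.0 * step * lam else 0.0


def prox_group_l0(group, lam, step):
    """Group hard thresholding for one kernel or channel.

    The whole group is zeroed unless its squared L2 norm
    exceeds 2 * step * lam.
    """
    squared_norm = sum(x * x for x in group)
    if squared_norm > 2.0 * step * lam:
        return list(group)
    return [0.0] * len(group)


def rmsprop_prox_step(weights, grads, sq_avg,
                      lr=0.01, decay=0.9, eps=1e-8, lam=1e-3):
    """One RMSProp-scaled gradient step followed by an L0 proximal step.

    Assumption (not taken from the paper): the per-weight effective
    step size lr / (sqrt(v) + eps) is reused inside the hard threshold,
    which corresponds to a proximal step in a diagonal metric.
    Returns the updated weights and the updated running averages.
    """
    new_weights, new_sq_avg = [], []
    for w, g, v in zip(weights, grads, sq_avg):
        v = decay * v + (1.0 - decay) * g * g   # running mean of g**2
        step = lr / (math.sqrt(v) + eps)        # per-weight step size
        new_weights.append(prox_l0(w - step * g, lam, step))
        new_sq_avg.append(v)
    return new_weights, new_sq_avg
```

Calling rmsprop_prox_step once per mini-batch in place of a plain RMSProp update would drive small weights to exactly zero during training, which is what makes later pruning cheap. Swapping prox_l0 for prox_l1 (with threshold step * lam) gives the L1 counterpart for comparison.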

Source journal

2019 International Conference on Control, Artificial Intelligence, Robotics & Optimization (ICCAIRO)

Self-citation rate

0.00%

Number of publications