Building efficient CNNs using Depthwise Convolutional Eigen-Filters (DeCEF)

IF 5.5; Zone 2, Computer Science; Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Neurocomputing Pub Date : 2024-09-03 DOI:10.1016/j.neucom.2024.128461

Citations: 0

Abstract

Building efficient CNNs using Depthwise Convolutional Eigen-Filters (DeCEF)

Deep Convolutional Neural Networks (CNNs) are widely used across many domains because of their strong performance. These models are typically composed of a large number of 2D convolutional (Conv2D) layers with numerous trainable parameters. To manage the complexity of such networks, compression techniques can be applied, which typically rely on the analysis of trained deep learning models. However, in certain situations, training a new CNN from scratch may be infeasible due to resource limitations. In this paper, we propose an alternative parameterization of Conv2D filters with significantly fewer parameters that does not rely on compressing a pre-trained CNN. Our analysis reveals that the effective rank of the vectorized Conv2D filters decreases with increasing depth in the network. This leads to the development of the Depthwise Convolutional Eigen-Filter (DeCEF) layer, a low-rank version of the Conv2D layer with significantly fewer trainable parameters and floating-point operations (FLOPs). Our definition of the effective rank differs from previous work and is easy to implement and interpret. Applying the technique is straightforward: one can simply replace any standard convolutional layer in a CNN with a DeCEF layer. To evaluate the effectiveness of DeCEF layers, experiments are conducted on the benchmark datasets CIFAR-10 and ImageNet for various network architectures. The results show similar or higher accuracy while using about 2/3 of the original parameters and reducing the number of FLOPs to 2/3 of the base network. Additionally, analyzing the patterns in the effective rank provides insights into the inner workings of CNNs and highlights opportunities for future research.
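
The parameter saving described in the abstract can be illustrated with simple arithmetic. The sketch below is not the paper's DeCEF definition, which the abstract does not spell out. It assumes a generic low-rank layout in which, for each input channel, the `c_out` vectorized `k x k` filters are written as combinations of `rank` shared basis filters. The function names `conv2d_params` and `low_rank_params`, the chosen rank, and the channel sizes are hypothetical, and bias terms are ignored.

```python
def conv2d_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a dense Conv2D layer with k x k kernels (no bias)."""
    return c_in * c_out * k * k


def low_rank_params(c_in: int, c_out: int, k: int, rank: int) -> int:
    """Weight count under an ASSUMED rank-`rank` factorization.

    For every input channel, the (c_out x k*k) matrix of vectorized
    filters is replaced by two factors:
      - `rank` basis filters of size k*k
      - a (c_out x rank) matrix of mixing coefficients
    This is a generic low-rank layout, not necessarily the DeCEF layer.
    """
    if not 1 <= rank <= k * k:
        raise ValueError("rank must lie in [1, k*k]")
    return c_in * rank * (k * k + c_out)


if __name__ == "__main__":
    # Hypothetical layer sizes, chosen only for illustration.
    for c_in, c_out in [(64, 64), (256, 256), (512, 512)]:
        dense = conv2d_params(c_in, c_out, k=3)
        reduced = low_rank_params(c_in, c_out, k=3, rank=6)
        print(f"{c_in}->{c_out}: dense={dense}, low-rank={reduced}, "
              f"ratio={reduced / dense:.3f}")
```

Under these assumptions the ratio equals `rank / k**2 + rank / c_out`, so a rank of 6 for 3 x 3 kernels approaches two thirds of the dense weight count as the number of output channels grows. This shows the arithmetic only and does not reproduce the reported results.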

Source journal

Neurocomputing (Engineering & Technology, Computer Science: Artificial Intelligence)

CiteScore: 13.10
Self-citation rate: 10.00%
Articles published: 1382
Review time: 70 days

About the journal: Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice and applications are the essential topics being covered.