Efficient implementation of a generalized convolutional neural network based on weighted Euclidean distance

Keivan Nalaie, Kamaledin Ghiasi-Shirazi, Mohammad-R. Akbarzadeh-T.
2017 7th International Conference on Computer and Knowledge Engineering (ICCKE), October 2017. DOI: 10.1109/ICCKE.2017.8167877. Cited 14 times.

Abstract

Convolutional Neural Networks (CNNs) are multi-layer deep structures that have been very successful in visual recognition tasks. These networks basically consist of convolution, pooling, and nonlinearity layers, each of which operates on the representation produced by the preceding layer and generates a new representation. A convolution layer naturally computes an inner product between a plane represented by the weight parameters and each input patch. Recently, Generalized Convolutional Neural Networks (GCNNs) have been introduced, which justify replacing the inner-product operator inside the convolution layers with certain kernels or distance functions. Although GCNNs obtained interesting results on the MNIST dataset, their application to more challenging datasets has been hindered by the lack of an efficient implementation. In this paper, we focus on a specific generalized convolution operator based on the weighted L2-norm distance (WL2Dist). By replacing the nonlinear part with three convolution operators and using efficient matrix-matrix multiplications, we are able to compute the WL2Dist convolution layer efficiently on both CPU and GPU. Our experiments show that, on CPU (GPU), the proposed implementation of the WL2Dist layer achieves a 5.5x (21x) speed-up over the initial BLAS-based (CUDA-based) implementation.
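The key algebraic idea behind the efficient WL2Dist layer is that the weighted squared distance expands into three terms, each computable as an ordinary (cross-)correlation that can be routed through standard im2col/GEMM machinery. The sketch below illustrates this decomposition in a minimal single-channel 2D setting; the function names, loop-based reference correlation, and test shapes are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def corr2d_valid(x, k):
    """Plain 'valid' cross-correlation: out[p] = sum_i k[i] * x[p + i]."""
    kh, kw = k.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def wl2dist_naive(x, w, a):
    """Direct weighted-L2 layer: out[p] = sum_i a[i] * (x[p + i] - w[i])**2."""
    kh, kw = w.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = x[i:i + kh, j:j + kw]
            out[i, j] = np.sum(a * (patch - w) ** 2)
    return out

def wl2dist_fast(x, w, a):
    """Expand the square:
        sum_i a_i (x_i - w_i)^2
          = sum_i a_i x_i^2  -  2 sum_i (a_i w_i) x_i  +  sum_i a_i w_i^2,
    i.e. a correlation of x**2 with a, a correlation of x with a*w, and a
    per-filter constant -- all expressible as dense matrix multiplications."""
    return (corr2d_valid(x ** 2, a)
            - 2.0 * corr2d_valid(x, a * w)
            + np.sum(a * w ** 2))

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))
w = rng.standard_normal((3, 3))   # filter "center"
a = rng.random((3, 3))            # non-negative per-position weights
assert np.allclose(wl2dist_naive(x, w, a), wl2dist_fast(x, w, a))
```

Because the two correlations in `wl2dist_fast` are ordinary linear convolutions, an existing BLAS- or cuDNN-backed convolution routine can evaluate them directly, which is what makes the reported CPU/GPU speed-ups possible.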