Signal processing algorithm for neural networks with integrodifferential splines as an activation function and its particular case of image classification

T.K. Biryukova
{"title":"以积分微分样条为激活函数的神经网络信号处理算法及其图像分类实例","authors":"T.K. Biryukova","doi":"10.18127/j20729472-202102-02","DOIUrl":null,"url":null,"abstract":"Classic neural networks suppose trainable parameters to include just weights of neurons. This paper proposes parabolic integrodifferential splines (ID-splines), developed by author, as a new kind of activation function (AF) for neural networks, where ID-splines coefficients are also trainable parameters. Parameters of ID-spline AF together with weights of neurons are vary during the training in order to minimize the loss function thus reducing the training time and increasing the operation speed of the neural network. The newly developed algorithm enables software implementation of the ID-spline AF as a tool for neural networks construction, training and operation. It is proposed to use the same ID-spline AF for neurons in the same layer, but different for different layers. In this case, the parameters of the ID-spline AF for a particular layer change during the training process independently of the activation functions (AFs) of other network layers. In order to comply with the continuity condition for the derivative of the parabolic ID-spline on the interval (x x0, n) , its parameters fi (i= 0,...,n) should be calculated using the tridiagonal system of linear algebraic equations: To solve the system it is necessary to use two more equations arising from the boundary conditions for specific problems. For exam- ple the values of the grid function (if they are known) in the points (x x0, n) may be used for solving the system above: f f x0 = ( 0) , f f xn = ( n) . The parameters Iii+1 (i= 0,...,n−1 ) are used as trainable parameters of neural networks. The grid boundaries and spacing of the nodes of ID-spline AF are best chosen experimentally. The optimal selection of grid nodes allows improving the quality of results produced by the neural network. The formula for a parabolic ID-spline is such that the complexity of the calculations does not depend on whether the grid of nodes is uniform or non-uniform. An experimental comparison of the results of image classification from the popular FashionMNIST dataset by convolutional neural 0, x< 0 networks with the ID-spline AFs and the well-known ReLUx( ) =AF was carried out. The results reveal that the usage x x, ≥ 0 of the ID-spline AFs provides better accuracy of neural network operation than the ReLU AF. The training time for two convolutional layers network with two ID-spline AFs is just about 2 times longer than with two instances of ReLU AF. Doubling of the training time due to complexity of the ID-spline formula is the acceptable price for significantly better accuracy of the network. Wherein the difference of an operation speed of the networks with ID-spline and ReLU AFs will be negligible. The use of trainable ID-spline AFs makes it possible to simplify the architecture of neural networks without losing their efficiency. The modification of the well-known neural networks (ResNet etc.) by replacing traditional AFs with ID-spline AFs is a promising approach to increase the neural network operation accuracy. 
In a majority of cases, such a substitution does not require to train the network from scratch because it allows to use pre-trained on large datasets neuron weights supplied by standard software libraries for neural network construction thus substantially shortening training time.","PeriodicalId":156447,"journal":{"name":"Highly available systems","volume":"89 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Signal processing algorithm for neural networks with integrodifferential splines as an activation function and its particular case of image classification\",\"authors\":\"T.K. Biryukova\",\"doi\":\"10.18127/j20729472-202102-02\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Classic neural networks suppose trainable parameters to include just weights of neurons. This paper proposes parabolic integrodifferential splines (ID-splines), developed by author, as a new kind of activation function (AF) for neural networks, where ID-splines coefficients are also trainable parameters. Parameters of ID-spline AF together with weights of neurons are vary during the training in order to minimize the loss function thus reducing the training time and increasing the operation speed of the neural network. The newly developed algorithm enables software implementation of the ID-spline AF as a tool for neural networks construction, training and operation. It is proposed to use the same ID-spline AF for neurons in the same layer, but different for different layers. In this case, the parameters of the ID-spline AF for a particular layer change during the training process independently of the activation functions (AFs) of other network layers. In order to comply with the continuity condition for the derivative of the parabolic ID-spline on the interval (x x0, n) , its parameters fi (i= 0,...,n) should be calculated using the tridiagonal system of linear algebraic equations: To solve the system it is necessary to use two more equations arising from the boundary conditions for specific problems. For exam- ple the values of the grid function (if they are known) in the points (x x0, n) may be used for solving the system above: f f x0 = ( 0) , f f xn = ( n) . The parameters Iii+1 (i= 0,...,n−1 ) are used as trainable parameters of neural networks. The grid boundaries and spacing of the nodes of ID-spline AF are best chosen experimentally. The optimal selection of grid nodes allows improving the quality of results produced by the neural network. The formula for a parabolic ID-spline is such that the complexity of the calculations does not depend on whether the grid of nodes is uniform or non-uniform. An experimental comparison of the results of image classification from the popular FashionMNIST dataset by convolutional neural 0, x< 0 networks with the ID-spline AFs and the well-known ReLUx( ) =AF was carried out. The results reveal that the usage x x, ≥ 0 of the ID-spline AFs provides better accuracy of neural network operation than the ReLU AF. The training time for two convolutional layers network with two ID-spline AFs is just about 2 times longer than with two instances of ReLU AF. Doubling of the training time due to complexity of the ID-spline formula is the acceptable price for significantly better accuracy of the network. Wherein the difference of an operation speed of the networks with ID-spline and ReLU AFs will be negligible. 
The use of trainable ID-spline AFs makes it possible to simplify the architecture of neural networks without losing their efficiency. The modification of the well-known neural networks (ResNet etc.) by replacing traditional AFs with ID-spline AFs is a promising approach to increase the neural network operation accuracy. In a majority of cases, such a substitution does not require to train the network from scratch because it allows to use pre-trained on large datasets neuron weights supplied by standard software libraries for neural network construction thus substantially shortening training time.\",\"PeriodicalId\":156447,\"journal\":{\"name\":\"Highly available systems\",\"volume\":\"89 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1900-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Highly available systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.18127/j20729472-202102-02\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Highly available systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.18127/j20729472-202102-02","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Classic neural networks treat only the weights of neurons as trainable parameters. This paper proposes parabolic integrodifferential splines (ID-splines), developed by the author, as a new kind of activation function (AF) for neural networks, in which the ID-spline coefficients are also trainable parameters. The parameters of the ID-spline AF vary during training together with the weights of the neurons so as to minimize the loss function, thus reducing the training time and increasing the operation speed of the neural network. The newly developed algorithm enables a software implementation of the ID-spline AF as a tool for constructing, training and operating neural networks. It is proposed to use the same ID-spline AF for all neurons within a layer, but different AFs for different layers; in this case the parameters of the ID-spline AF of a particular layer change during training independently of the activation functions (AFs) of the other layers. To satisfy the continuity condition for the derivative of the parabolic ID-spline on the interval $(x_0, x_n)$, its parameters $f_i$ ($i = 0, \ldots, n$) must be computed from a tridiagonal system of linear algebraic equations. Solving the system requires two additional equations arising from the boundary conditions of the specific problem; for example, the values of the grid function at the endpoints (if they are known) may be used: $f_0 = f(x_0)$, $f_n = f(x_n)$. The parameters $I_i^{\,i+1}$ ($i = 0, \ldots, n-1$) are used as trainable parameters of the neural network. The grid boundaries and the spacing of the nodes of the ID-spline AF are best chosen experimentally: an optimal selection of grid nodes improves the quality of the results produced by the network. The formula of the parabolic ID-spline is such that the computational complexity does not depend on whether the grid of nodes is uniform or non-uniform. An experimental comparison was carried out between convolutional neural networks using ID-spline AFs and networks using the well-known ReLU AF, $\operatorname{ReLU}(x) = \begin{cases} 0, & x < 0, \\ x, & x \ge 0, \end{cases}$ on image classification with the popular FashionMNIST dataset. The results reveal that the ID-spline AFs provide better accuracy of neural network operation than the ReLU AF. The training time of a network with two convolutional layers and two ID-spline AFs is only about 2 times longer than with two instances of the ReLU AF; this doubling of the training time, caused by the complexity of the ID-spline formula, is an acceptable price for the significantly better accuracy of the network, while the difference in operation speed between networks with ID-spline and ReLU AFs is negligible. The use of trainable ID-spline AFs makes it possible to simplify the architecture of neural networks without losing efficiency. Modifying well-known neural networks (ResNet etc.) by replacing their traditional AFs with ID-spline AFs is a promising approach to increasing the accuracy of neural network operation. In the majority of cases such a substitution does not require training the network from scratch, because it allows the use of neuron weights pre-trained on large datasets and supplied by standard software libraries for neural network construction, thus substantially shortening the training time.
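
The paper itself does not include source code. As a rough illustration of the central idea of a per-layer trainable spline activation, the following PyTorch sketch stores the node values $f_i$ of a fixed grid as trainable parameters that the optimizer updates together with the neuron weights. The module name `SplineActivation`, the grid settings, the ReLU-shaped initialization, and the use of simple linear interpolation between nodes (instead of the author's parabolic ID-spline formula, which appears only in the full paper) are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class SplineActivation(nn.Module):
    """Per-layer trainable spline activation (illustrative sketch).

    The node values f_i on a fixed grid are trainable parameters that
    the optimizer updates together with the neuron weights. NOTE: for
    simplicity this sketch interpolates linearly between nodes; the
    paper uses parabolic ID-splines instead.
    """

    def __init__(self, x_min=-3.0, x_max=3.0, n_nodes=16):
        super().__init__()
        # Fixed (non-trainable) grid of nodes; the paper recommends
        # choosing the boundaries and spacing experimentally.
        self.register_buffer("grid", torch.linspace(x_min, x_max, n_nodes))
        # Initialize node values to ReLU(grid) so training starts from
        # a familiar activation shape (an assumption, not from the paper).
        self.f = nn.Parameter(torch.relu(self.grid.clone()))

    def forward(self, x):
        g = self.grid
        h = g[1] - g[0]                                  # uniform spacing
        x_clamped = x.clamp(float(g[0]), float(g[-1]))   # stay on the grid
        idx = ((x_clamped - g[0]) / h).floor().long().clamp(0, len(g) - 2)
        t = (x_clamped - g[idx]) / h                     # local coordinate in [0, 1]
        # Blend the two neighbouring trainable node values.
        return (1 - t) * self.f[idx] + t * self.f[idx + 1]
```

One instance of such a module would be created per layer, matching the paper's proposal of a single shared AF within a layer and independent AFs across layers.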
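The abstract mentions the tridiagonal system for the parameters $f_i$ but not its coefficients, which are given only in the full paper. A system of this shape is normally solved with the standard Thomas algorithm in $O(n)$ operations; the sketch below is that generic solver, not the paper's specific system. The boundary conditions $f_0 = f(x_0)$ and $f_n = f(x_n)$ amount to replacing the first and last rows with identity rows (main-diagonal entry 1, off-diagonal entries 0, right-hand side equal to the known endpoint values).

```python
import numpy as np

def solve_tridiagonal(a, b, c, d):
    """Thomas algorithm for a tridiagonal system (O(n)).

    a: sub-diagonal   (length n, a[0] unused)
    b: main diagonal  (length n)
    c: super-diagonal (length n, c[-1] unused)
    d: right-hand side (length n)
    Returns the solution vector x of length n.
    """
    n = len(d)
    cp = np.empty(n)  # modified super-diagonal
    dp = np.empty(n)  # modified right-hand side
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    # Forward sweep: eliminate the sub-diagonal.
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    # Back substitution.
    x = np.empty(n)
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x
```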
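The reported experiment (two convolutional layers, FashionMNIST, ID-spline vs. ReLU) could be reproduced along the following lines. The architecture, optimizer, batch size and epoch count below are assumptions for illustration, since the abstract does not give the exact configuration; `SplineActivation` refers to the sketch above.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

def make_net(act_factory):
    # Two convolutional layers, each followed by its own activation,
    # mirroring the "two AFs" setup described in the abstract.
    return nn.Sequential(
        nn.Conv2d(1, 16, 3, padding=1), act_factory(), nn.MaxPool2d(2),
        nn.Conv2d(16, 32, 3, padding=1), act_factory(), nn.MaxPool2d(2),
        nn.Flatten(), nn.Linear(32 * 7 * 7, 10),
    )

def accuracy(net, loader):
    net.eval()
    correct = total = 0
    with torch.no_grad():
        for x, y in loader:
            correct += (net(x).argmax(1) == y).sum().item()
            total += y.numel()
    return correct / total

if __name__ == "__main__":
    tfm = transforms.ToTensor()
    train = datasets.FashionMNIST("data", train=True, download=True, transform=tfm)
    test = datasets.FashionMNIST("data", train=False, download=True, transform=tfm)
    train_dl = DataLoader(train, batch_size=128, shuffle=True)
    test_dl = DataLoader(test, batch_size=256)

    for name, factory in [("ReLU", nn.ReLU), ("ID-spline (sketch)", SplineActivation)]:
        net = make_net(factory)
        opt = torch.optim.Adam(net.parameters(), lr=1e-3)
        for epoch in range(5):  # epoch count is an assumption
            net.train()
            for x, y in train_dl:
                opt.zero_grad()
                nn.functional.cross_entropy(net(x), y).backward()
                opt.step()
        print(name, "test accuracy:", accuracy(net, test_dl))
```

Because the spline node values are ordinary `nn.Parameter`s, the same optimizer trains them alongside the convolution weights, which is the mechanism the abstract describes for reducing training time.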