Models and methods of learning neural networks with differentiated activation functions

Dmytro Zelentsov, Shaptala Taras
{"title":"具有微分激活函数的神经网络学习模型和方法","authors":"Dmytro Zelentsov, Shaptala Taras","doi":"10.34185/1562-9945-6-143-2022-05","DOIUrl":null,"url":null,"abstract":"Analysis of the literature made it clear that the problem associated with improving the performance and acceleration of ANN learning is quite actual, as ANNs are used every day in more and more industries. The concepts of finding more profitable activation functions have been outlined a lot, but changing their behavior as a result of learning is a fresh look at the problem. The aim of the study is to find new models of optimization tasks for the formulated prob-lem and effective methods for their implementation, which would improve the quality of ANN training, in particular by overcoming the problem of local minima. A studied of models and methods for training neural networks using an extended vector of varying parameters is conducted. The training problem is formulated as a continuous mul-tidimensional unconditional optimization problem. The extended vector of varying parameters implies that it includes some parameters of activation functions in addition to weight coeffi-cients. The introduction of additional varying parameters does not change the architecture of a neural network, but makes it impossible to use the back propagation method. A number of gradient methods have been used to solve optimization problems. Different formulations of optimization problems and methods for their solution have been investigated according to ac-curacy and efficiency criteria. The analysis of the results of numerical experiments allowed us to conclude that it is expedient to expand the vector of varying parameters in the tasks of training ANNs with con-tinuous and differentiated activation functions. Despite the increase in the dimensionality of the optimization problem, the efficiency of the new formulation is higher than the generalized one. According to the authors, this is due to the fact that a significant share of computational costs in the generalized formulation falls on attempts to leave the neighborhood of local min-ima, while increasing the dimensionality of the solution space allows this to be done with much lower costs.","PeriodicalId":493145,"journal":{"name":"Sistemnì tehnologìï","volume":"124 7","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Models and methods of learning neural networks with differentiated activation functions\",\"authors\":\"Dmytro Zelentsov, Shaptala Taras\",\"doi\":\"10.34185/1562-9945-6-143-2022-05\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Analysis of the literature made it clear that the problem associated with improving the performance and acceleration of ANN learning is quite actual, as ANNs are used every day in more and more industries. The concepts of finding more profitable activation functions have been outlined a lot, but changing their behavior as a result of learning is a fresh look at the problem. The aim of the study is to find new models of optimization tasks for the formulated prob-lem and effective methods for their implementation, which would improve the quality of ANN training, in particular by overcoming the problem of local minima. A studied of models and methods for training neural networks using an extended vector of varying parameters is conducted. 
The training problem is formulated as a continuous mul-tidimensional unconditional optimization problem. The extended vector of varying parameters implies that it includes some parameters of activation functions in addition to weight coeffi-cients. The introduction of additional varying parameters does not change the architecture of a neural network, but makes it impossible to use the back propagation method. A number of gradient methods have been used to solve optimization problems. Different formulations of optimization problems and methods for their solution have been investigated according to ac-curacy and efficiency criteria. The analysis of the results of numerical experiments allowed us to conclude that it is expedient to expand the vector of varying parameters in the tasks of training ANNs with con-tinuous and differentiated activation functions. Despite the increase in the dimensionality of the optimization problem, the efficiency of the new formulation is higher than the generalized one. According to the authors, this is due to the fact that a significant share of computational costs in the generalized formulation falls on attempts to leave the neighborhood of local min-ima, while increasing the dimensionality of the solution space allows this to be done with much lower costs.\",\"PeriodicalId\":493145,\"journal\":{\"name\":\"Sistemnì tehnologìï\",\"volume\":\"124 7\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-11-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Sistemnì tehnologìï\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.34185/1562-9945-6-143-2022-05\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Sistemnì tehnologìï","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.34185/1562-9945-6-143-2022-05","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Analysis of the literature makes it clear that the problem of improving the performance and accelerating the learning of ANNs is highly topical, as ANNs are used in more and more industries every day. Many concepts for finding more advantageous activation functions have been outlined, but changing their behavior as a result of learning is a fresh look at the problem. The aim of the study is to find new models of optimization tasks for the formulated problem and effective methods for their implementation, which would improve the quality of ANN training, in particular by overcoming the problem of local minima. A study of models and methods for training neural networks using an extended vector of varying parameters is conducted. The training problem is formulated as a continuous multidimensional unconstrained optimization problem. The extended vector of varying parameters means that it includes some parameters of the activation functions in addition to the weight coefficients. Introducing these additional varying parameters does not change the architecture of a neural network, but it makes it impossible to use the back-propagation method. A number of gradient methods have been used to solve the optimization problems. Different formulations of the optimization problems, and methods for their solution, have been investigated against accuracy and efficiency criteria. Analysis of the results of numerical experiments leads to the conclusion that it is expedient to expand the vector of varying parameters in tasks of training ANNs with continuous and differentiable activation functions. Despite the increase in the dimensionality of the optimization problem, the efficiency of the new formulation is higher than that of the generalized one. According to the authors, this is because a significant share of the computational costs in the generalized formulation falls on attempts to leave the neighborhood of local minima, whereas increasing the dimensionality of the solution space allows this to be done at much lower cost.
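
For clarity, the extended formulation described above can be written compactly; the notation below is illustrative and is not taken from the paper itself:

$$
\min_{\theta \in \mathbb{R}^{n+m}} E(\theta), \qquad \theta = (w_1, \dots, w_n, \alpha_1, \dots, \alpha_m),
$$

where the $w_i$ are the connection weights, the $\alpha_j$ are trainable parameters of the activation functions (for example, a slope in $\sigma(x;\alpha) = 1/(1 + e^{-\alpha x})$), and $E$ is a continuous, differentiable training error such as the mean squared error over the training set. Since no constraints are imposed on $\theta$, any unconstrained gradient method can be applied directly.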
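
The following minimal Python sketch illustrates the idea under stated assumptions: it is not the authors' code, and the parametric-sigmoid slope, the toy data, and all names are hypothetical. It trains a one-hidden-layer network by plain gradient descent over the extended vector, using a central-difference gradient in place of back-propagation (which, per the abstract, no longer applies once activation parameters vary):

```python
import numpy as np

# Minimal sketch (not the authors' code): a one-hidden-layer network whose
# hidden activation is a parametric sigmoid sigma(x; a) = 1 / (1 + exp(-a*x)).
# The per-unit slopes "a" are appended to the weights, so the optimizer
# searches the extended parameter vector described in the abstract.

rng = np.random.default_rng(0)
N_IN, N_HID = 2, 4            # toy sizes, for illustration only
N_W = N_HID * N_IN + N_HID    # hidden weights plus output weights

def unpack(theta):
    """Split the extended vector into weights and activation parameters."""
    W1 = theta[:N_HID * N_IN].reshape(N_HID, N_IN)
    w2 = theta[N_HID * N_IN:N_W]
    a = theta[N_W:]           # one trainable slope per hidden unit
    return W1, w2, a

def forward(theta, X):
    W1, w2, a = unpack(theta)
    h = 1.0 / (1.0 + np.exp(-a * (X @ W1.T)))   # parametric sigmoid
    return h @ w2

def loss(theta, X, y):
    """Continuous, unconstrained objective: mean squared error."""
    return np.mean((forward(theta, X) - y) ** 2)

def num_grad(f, theta, eps=1e-6):
    """Central-difference gradient; stands in for back-propagation, which
    the abstract says no longer applies to the extended vector."""
    g = np.zeros_like(theta)
    for i in range(theta.size):
        d = np.zeros_like(theta)
        d[i] = eps
        g[i] = (f(theta + d) - f(theta - d)) / (2.0 * eps)
    return g

# Toy XOR-style regression data, purely for demonstration.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0.0, 1.0, 1.0, 0.0])

theta = np.concatenate([rng.normal(0.0, 0.5, N_W), np.ones(N_HID)])
print("initial loss:", loss(theta, X, y))
for _ in range(5000):                  # plain fixed-step gradient descent
    theta -= 0.2 * num_grad(lambda t: loss(t, X, y), theta)
print("final loss:", loss(theta, X, y))  # typically far below the initial value
```

The paper reports using several gradient methods; the fixed-step descent above is only the simplest stand-in, and the slopes could equally be initialized differently or shared across units.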