A theory of synaptic neural balance: From local to global order

Impact Factor 4.6 · CAS Zone 2 (Computer Science) · JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Pierre Baldi, Antonios Alexos, Ian Domingo, Alireza Rahmansetayesh
{"title":"A theory of synaptic neural balance: From local to global order","authors":"Pierre Baldi,&nbsp;Antonios Alexos,&nbsp;Ian Domingo,&nbsp;Alireza Rahmansetayesh","doi":"10.1016/j.artint.2025.104360","DOIUrl":null,"url":null,"abstract":"<div><div>We develop a general theory of synaptic neural balance and how it can emerge or be enforced in neural networks. For a given additive cost function <em>R</em> (regularizer), a neuron is said to be in balance if the total cost of its input weights is equal to the total cost of its output weights. The basic example is provided by feedforward networks of ReLU units trained with <span><math><msub><mrow><mi>L</mi></mrow><mrow><mn>2</mn></mrow></msub></math></span> regularizers, which exhibit balance after proper training. The theory explains this phenomenon and extends it in several directions. The first direction is the extension to bilinear and other activation functions. The second direction is the extension to more general regularizers, including all <span><math><msub><mrow><mi>L</mi></mrow><mrow><mi>p</mi></mrow></msub></math></span> (<span><math><mi>p</mi><mo>&gt;</mo><mn>0</mn></math></span>) regularizers. The third direction is the extension to non-layered architectures, recurrent architectures, convolutional architectures, as well as architectures with mixed activation functions and to different balancing algorithms. Gradient descent on the error function alone does not converge in general to a balanced state, where every neuron is in balance, even when starting from a balanced state. However, gradient descent on the regularized error function ought to converge to a balanced state, and thus network balance can be used to assess learning progress. The theory is based on two local neuronal operations: scaling which is commutative, and balancing which is not commutative. Finally, and most importantly, given any set of weights, when local balancing operations are applied to each neuron in a stochastic manner, global order always emerges through the convergence of the stochastic balancing algorithm to the same unique set of balanced weights. The reason for this convergence is the existence of an underlying strictly convex optimization problem where the relevant variables are constrained to a linear, only architecture-dependent, manifold. Simulations show that balancing neurons prior to learning, or during learning in alternation with gradient descent steps, can improve learning speed and performance thereby expanding the arsenal of available training tools. Scaling and balancing operations are entirely local and thus physically plausible in biological and neuromorphic neural networks.</div></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"346 ","pages":"Article 104360"},"PeriodicalIF":4.6000,"publicationDate":"2025-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Artificial Intelligence","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0004370225000797","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

We develop a general theory of synaptic neural balance and how it can emerge or be enforced in neural networks. For a given additive cost function R (regularizer), a neuron is said to be in balance if the total cost of its input weights is equal to the total cost of its output weights. The basic example is provided by feedforward networks of ReLU units trained with L2 regularizers, which exhibit balance after proper training. The theory explains this phenomenon and extends it in several directions. The first direction is the extension to bilinear and other activation functions. The second direction is the extension to more general regularizers, including all Lp (p>0) regularizers. The third direction is the extension to non-layered, recurrent, and convolutional architectures, as well as to architectures with mixed activation functions and to different balancing algorithms. Gradient descent on the error function alone does not in general converge to a balanced state, in which every neuron is in balance, even when starting from a balanced state. However, gradient descent on the regularized error function ought to converge to a balanced state, so network balance can be used to assess learning progress. The theory is based on two local neuronal operations: scaling, which is commutative, and balancing, which is not. Finally, and most importantly, given any set of weights, when local balancing operations are applied to each neuron in a stochastic manner, global order always emerges through the convergence of the stochastic balancing algorithm to the same unique set of balanced weights. The reason for this convergence is the existence of an underlying strictly convex optimization problem in which the relevant variables are constrained to a linear manifold that depends only on the architecture. Simulations show that balancing neurons prior to learning, or during learning in alternation with gradient descent steps, can improve learning speed and performance, thereby expanding the arsenal of available training tools. Scaling and balancing operations are entirely local and thus physically plausible in biological and neuromorphic neural networks.
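To make the two local operations concrete, the sketch below implements scaling and L2 balancing for a small feedforward ReLU network, together with a stochastic balancing loop of the kind described in the abstract. It is an illustrative reconstruction, not the authors' code: the network shape, the omission of biases, and the helper names scale_neuron and balance_neuron are assumptions. For a ReLU unit, multiplying its input weights by a factor lam > 0 and its output weights by 1/lam leaves the network function unchanged (since ReLU(lam*x) = lam*ReLU(x) for lam > 0); balancing picks the particular lam that equates the L2 cost of the input and output weights, namely lam = (cost_out/cost_in)**0.25.

```python
# Minimal sketch (assumed, not the paper's reference implementation) of the
# scaling and balancing operations for a feedforward ReLU network with L2 costs.
# Biases are omitted for simplicity; layer sizes and helper names are illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Three-layer ReLU network: x -> W[0] -> ReLU -> W[1] -> ReLU -> W[2] -> y
sizes = [4, 8, 8, 2]
W = [rng.standard_normal((sizes[i + 1], sizes[i])) for i in range(3)]

def forward(W, x):
    h = x
    for Wl in W[:-1]:
        h = np.maximum(0.0, Wl @ h)   # ReLU hidden layers
    return W[-1] @ h                  # linear output layer

def scale_neuron(W, layer, j, lam):
    """Scaling: multiply neuron j's input weights by lam and its output weights
    by 1/lam. Because ReLU is positively homogeneous, the network function is
    unchanged for any lam > 0."""
    W[layer][j, :] *= lam             # input weights (row of the incoming matrix)
    W[layer + 1][:, j] /= lam         # output weights (column of the outgoing matrix)

def balance_neuron(W, layer, j):
    """Balancing: choose lam so that the L2 cost of the input weights equals the
    L2 cost of the output weights, i.e. lam**2 * c_in = c_out / lam**2."""
    c_in = np.sum(W[layer][j, :] ** 2)
    c_out = np.sum(W[layer + 1][:, j] ** 2)
    lam = (c_out / c_in) ** 0.25
    scale_neuron(W, layer, j, lam)

# Stochastic balancing: repeatedly balance a randomly chosen hidden neuron.
x = rng.standard_normal(sizes[0])
y_before = forward(W, x)
for _ in range(2000):
    layer = rng.integers(0, 2)                 # hidden layers only
    j = rng.integers(0, sizes[layer + 1])      # random neuron in that layer
    balance_neuron(W, layer, j)
y_after = forward(W, x)

print("output unchanged:", np.allclose(y_before, y_after))
print("total L2 cost:", sum(np.sum(Wl ** 2) for Wl in W))
```

In this sketch, repeatedly balancing randomly chosen hidden neurons never changes the network's input-output function, while, per the abstract, the stochastic balancing iterates converge to a single, unique set of balanced weights.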
Source journal
Artificial Intelligence (Engineering & Technology – Computer Science: Artificial Intelligence)
CiteScore: 11.20
Self-citation rate: 1.40%
Articles per year: 118
Review time: 8 months
Journal description: The Journal of Artificial Intelligence (AIJ) welcomes papers covering a broad spectrum of AI topics, including cognition, automated reasoning, computer vision, machine learning, and more. Papers should demonstrate advancements in AI and propose innovative approaches to AI problems. Additionally, the journal accepts papers describing AI applications, focusing on how new methods enhance performance rather than reiterating conventional approaches. In addition to regular papers, AIJ also accepts Research Notes, Research Field Reviews, Position Papers, Book Reviews, and summary papers on AI challenges and competitions.