Jaynes machine: The universal microstructure of deep neural networks

Impact factor: 3.9; Zone 2, Engineering & Technology; JCR Q2, Computer Science, Interdisciplinary Applications
Venkat Venkatasubramanian, N. Sanjeevrajan, Manasi Khandekar, Abhishek Sivaram, Collin Szczepanski
DOI: 10.1016/j.compchemeng.2024.108908
Journal: Computers & Chemical Engineering, volume 192, Article 108908
Publication date: 2024-11-04
Publication type: Journal Article
Open access: no
URL: https://www.sciencedirect.com/science/article/pii/S0098135424003260
Citation count: 0

Abstract

Despite the recent stunning progress in large-scale deep neural network applications, our understanding of their microstructure, ‘energy’ functions, and optimal design remains incomplete. Here, we present a new game-theoretic framework, called statistical teleodynamics, that reveals important insights into these key properties. The optimally robust design of such networks inherently involves computational benefit–cost trade-offs that physics-inspired models do not adequately capture. These trade-offs occur as neurons and connections compete to increase their effective utilities under resource constraints during training. In a fully trained network, this results in a state of arbitrage equilibrium, where all neurons in a given layer have the same effective utility, and all connections to a given layer have the same effective utility. The equilibrium is characterized by the emergence of two lognormal distributions of connection weights and neuronal output as the universal microstructure of large deep neural networks. We call such a network the Jaynes Machine. Our theoretical predictions are shown to be supported by empirical data from seven large-scale deep neural networks. We also show that the Hopfield network and the Boltzmann Machine are the same special case of the Jaynes Machine.
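
The lognormal prediction stated in the abstract can be checked on any set of trained weights with a simple fit. The following is a minimal sketch, not the authors' procedure: it assumes the fit is applied to weight magnitudes, it uses synthetic data in place of a real layer, and the names `fit_lognormal` and `lognormal_pdf` as well as the parameters -3.0 and 0.8 are hypothetical choices for illustration.

```python
import math
import random
import statistics


def fit_lognormal(values):
    """Maximum-likelihood lognormal fit: returns (mu, sigma) of log(values)."""
    logs = [math.log(v) for v in values if v > 0]
    mu = statistics.fmean(logs)
    # The MLE of sigma is the population standard deviation of the logs.
    sigma = statistics.pstdev(logs, mu)
    return mu, sigma


def lognormal_pdf(x, mu, sigma):
    """Density of a lognormal distribution with log-mean mu and log-std sigma."""
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


# Synthetic stand-in for one layer's connection weights (an assumption):
# random signs with lognormally distributed magnitudes.
rng = random.Random(0)
weights = [rng.choice((-1.0, 1.0)) * rng.lognormvariate(-3.0, 0.8)
           for _ in range(20000)]

# Fit the magnitudes, since a lognormal is defined only for positive values.
magnitudes = [abs(w) for w in weights]
mu_hat, sigma_hat = fit_lognormal(magnitudes)
print(f"fitted mu = {mu_hat:.3f}, fitted sigma = {sigma_hat:.3f}")
```

With real networks, `weights` would be replaced by the flattened weights of one layer, and the fitted density from `lognormal_pdf` would be compared against a histogram of the magnitudes.
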

Source journal
Computers & Chemical Engineering (Engineering & Technology: Chemical Engineering)
CiteScore: 8.70
Self-citation rate: 14.00%
Articles published: 374
Review time: 70 days
Journal description: Computers & Chemical Engineering is primarily a journal of record for new developments in the application of computing and systems technology to chemical engineering problems.