Jaynes machine: The universal microstructure of deep neural networks

IF 3.9; Zone 2 (Engineering & Technology); JCR Q2 (COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS)

Computers & Chemical Engineering Pub Date : 2024-11-04 DOI:10.1016/j.compchemeng.2024.108908

Venkat Venkatasubramanian, N. Sanjeevrajan, Manasi Khandekar, Abhishek Sivaram, Collin Szczepanski

Computers & Chemical Engineering, Volume 192, Article 108908. Full text: https://www.sciencedirect.com/science/article/pii/S0098135424003260

Citations: 0

Abstract

Despite the recent stunning progress in large-scale deep neural network applications, our understanding of their microstructure, ‘energy’ functions, and optimal design remains incomplete. Here, we present a new game-theoretic framework, called statistical teleodynamics, that reveals important insights into these key properties. The optimally robust design of such networks inherently involves computational benefit–cost trade-offs that physics-inspired models do not adequately capture. These trade-offs occur as neurons and connections compete to increase their effective utilities under resource constraints during training. In a fully trained network, this results in a state of arbitrage equilibrium, where all neurons in a given layer have the same effective utility, and all connections to a given layer have the same effective utility. The equilibrium is characterized by the emergence of two lognormal distributions of connection weights and neuronal output as the universal microstructure of large deep neural networks. We call such a network the Jaynes Machine. Our theoretical predictions are shown to be supported by empirical data from seven large-scale deep neural networks. We also show that the Hopfield network and the Boltzmann Machine are the same special case of the Jaynes Machine.
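As a rough illustration of what the lognormal prediction means in practice, the sketch below fits a lognormal to a set of positive weight magnitudes and checks that their logarithms look roughly Gaussian. It is not the authors' procedure: the data are synthetic, the use of absolute weight values is an assumption, and the helper name `fit_lognormal` and the parameters `true_mu` and `true_sigma` are hypothetical.

```python
import numpy as np


def fit_lognormal(magnitudes):
    """Estimate (mu, sigma) of a lognormal from the mean and std of the log-values."""
    logs = np.log(np.asarray(magnitudes, dtype=float))
    return logs.mean(), logs.std(ddof=1)


# Synthetic stand-in for the absolute connection weights of one layer.
# Assumption: real weights would be loaded from a trained model instead.
rng = np.random.default_rng(0)
true_mu, true_sigma = -3.0, 0.8
weights = rng.lognormal(mean=true_mu, sigma=true_sigma, size=50_000)

mu_hat, sigma_hat = fit_lognormal(weights)

# If the magnitudes are lognormal, their logs are normal, so the
# sample skewness of the log-values should be close to zero.
logs = np.log(weights)
log_skewness = np.mean(((logs - mu_hat) / sigma_hat) ** 3)

print(f"mu_hat={mu_hat:.3f}, sigma_hat={sigma_hat:.3f}, log-skewness={log_skewness:.3f}")
```

For weights taken from an actual trained network, a quantile plot of the log-magnitudes against a normal distribution would be a natural next check; the skewness test here is only a quick sanity indicator.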


Source journal

Computers & Chemical Engineering (Engineering & Technology: Chemical Engineering)

CiteScore: 8.70

Self-citation rate: 14.00%

Articles published: 374

Review time: 70 days

About the journal: Computers & Chemical Engineering is primarily a journal of record for new developments in the application of computing and systems technology to chemical engineering problems.