Learning dynamics of deep linear networks with multiple pathways.

Jianghong Shi, Eric Shea-Brown, Michael A Buice
{"title":"Learning dynamics of deep linear networks with multiple pathways.","authors":"Jianghong Shi, Eric Shea-Brown, Michael A Buice","doi":"","DOIUrl":null,"url":null,"abstract":"<p><p>Not only have deep networks become standard in machine learning, they are increasingly of interest in neuroscience as models of cortical computation that capture relationships between structural and functional properties. In addition they are a useful target of theoretical research into the properties of network computation. Deep networks typically have a serial or approximately serial organization across layers, and this is often mirrored in models that purport to represent computation in mammalian brains. There are, however, multiple examples of parallel pathways in mammalian brains. In some cases, such as the mouse, the entire visual system appears arranged in a largely parallel, rather than serial fashion. While these pathways may be formed by differing cost functions that drive different computations, here we present a new mathematical analysis of learning dynamics in networks that have parallel computational pathways driven by the same cost function. We use the approximation of deep linear networks with large hidden layer sizes to show that, as the depth of the parallel pathways increases, different features of the training set (defined by the singular values of the input-output correlation) will typically concentrate in one of the pathways. This result is derived analytically and demonstrated with numerical simulation with both linear and non-linear networks. Thus, rather than sharing stimulus and task features across multiple pathways, parallel network architectures learn to produce sharply diversified representations with specialized and specific pathways, a mechanism which may hold important consequences for codes in both biological and artificial systems.</p>","PeriodicalId":72099,"journal":{"name":"Advances in neural information processing systems","volume":"35 ","pages":"34064-34076"},"PeriodicalIF":0.0000,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10824491/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Advances in neural information processing systems","FirstCategoryId":"1085","ListUrlMain":"","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Not only have deep networks become standard in machine learning, but they are also increasingly of interest in neuroscience as models of cortical computation that capture relationships between structural and functional properties. In addition, they are a useful target of theoretical research into the properties of network computation. Deep networks typically have a serial or approximately serial organization across layers, and this is often mirrored in models that purport to represent computation in mammalian brains. There are, however, multiple examples of parallel pathways in mammalian brains. In some cases, such as the mouse, the entire visual system appears to be arranged in a largely parallel, rather than serial, fashion. While these pathways may be formed by differing cost functions that drive different computations, here we present a new mathematical analysis of learning dynamics in networks that have parallel computational pathways driven by the same cost function. We use the approximation of deep linear networks with large hidden layer sizes to show that, as the depth of the parallel pathways increases, different features of the training set (defined by the singular values of the input-output correlation) will typically concentrate in one of the pathways. This result is derived analytically and demonstrated with numerical simulations of both linear and non-linear networks. Thus, rather than sharing stimulus and task features across multiple pathways, parallel network architectures learn to produce sharply diversified representations with specialized and specific pathways, a mechanism which may hold important consequences for codes in both biological and artificial systems.
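The setup described in the abstract can be illustrated with a minimal numpy sketch. This is not the authors' code: the two-pathway architecture with summed outputs, the helper functions `chain` and `init_pathway`, and all layer sizes and hyperparameters below are illustrative assumptions chosen to match the abstract's description (parallel deep linear pathways trained on one squared-error cost, compared against the singular modes of the input-output correlation).

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes and hyperparameters (assumptions, not taken from the paper).
n_in, n_out, n_hidden, depth = 8, 8, 64, 3
n_samples, lr, n_steps = 500, 0.01, 3000

# Random inputs and a random linear teacher define the input-output correlation.
X = rng.standard_normal((n_in, n_samples))
Y = rng.standard_normal((n_out, n_in)) @ X

def init_pathway(scale=0.2):
    # Small random initialization, roughly matching the small-weight regime
    # usually assumed in deep linear network analyses.
    dims = [n_in] + [n_hidden] * (depth - 1) + [n_out]
    return [scale * rng.standard_normal((dims[i + 1], dims[i])) / np.sqrt(dims[i])
            for i in range(depth)]

def chain(mats, size):
    """Composite map mats[-1] @ ... @ mats[0]; identity of the given size if empty."""
    M = np.eye(size)
    for A in mats:
        M = A @ M
    return M

pathways = [init_pathway(), init_pathway()]  # two parallel pathways, shared cost

for step in range(n_steps):
    Y_hat = sum(chain(W, n_in) for W in pathways) @ X  # pathway outputs are summed
    E = (Y_hat - Y) / n_samples                        # gradient of (1/2N)||Y_hat - Y||^2 w.r.t. Y_hat
    for W in pathways:
        grads = []
        for l in range(depth):
            above = chain(W[l + 1:], W[l].shape[0])    # layers applied after layer l
            below = chain(W[:l], n_in)                 # layers applied before layer l
            grads.append(above.T @ E @ X.T @ below.T)  # full-batch gradient for layer l
        for l in range(depth):
            W[l] -= lr * grads[l]

# Compare each pathway's end-to-end map against the singular modes of the
# input-output correlation matrix.
Sigma_yx = Y @ X.T / n_samples
U, S, Vt = np.linalg.svd(Sigma_yx)
for p, W in enumerate(pathways):
    M = chain(W, n_in)
    contrib = np.array([U[:, i] @ M @ Vt[i, :] for i in range(len(S))])
    print(f"pathway {p}: per-mode contribution {np.round(contrib, 2)}")
```

In the regime the abstract describes (large hidden layers, increasing pathway depth), the per-mode contributions printed at the end would be expected to concentrate in one pathway per singular mode rather than being shared across both.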
