On learning Gaussian multi-index models with gradient flow part I: General properties and two-timescale learning

IF 2.7 · Region 1 (Mathematics) · JCR Q1 (MATHEMATICS)
Alberto Bietti, Joan Bruna, Loucas Pillaud-Vivien
Journal: Communications on Pure and Applied Mathematics, volume 78, issue 12, pages 2354-2435
Publication date: 2025-07-15
DOI: 10.1002/cpa.70006
URL: https://onlinelibrary.wiley.com/doi/10.1002/cpa.70006
Citations: 0

Abstract


On learning Gaussian multi-index models with gradient flow part I: General properties and two-timescale learning

We study gradient flow on the multi-index regression problem for high-dimensional Gaussian data. Multi-index functions consist of a composition of an unknown low-rank linear projection and an arbitrary unknown, low-dimensional link function. As such, they constitute a natural template for feature learning in neural networks. We consider a two-timescale algorithm, whereby the low-dimensional link function is learnt with a non-parametric model infinitely faster than the subspace parametrizing the low-rank projection. By appropriately exploiting the matrix semigroup structure arising over the subspace correlation matrices, we establish global convergence of the resulting Grassmannian gradient flow dynamics, and provide a quantitative description of its associated “saddle-to-saddle” dynamics. Notably, the timescales associated with each saddle can be explicitly characterized in terms of an appropriate Hermite decomposition of the target link function.
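
As a rough illustration of the objects named in the abstract, the following sketch builds a multi-index target (a low-rank linear projection followed by a low-dimensional link function), evaluates it on standard Gaussian inputs, and computes a correlation matrix between two orthonormal frames. It assumes NumPy. The function names `random_orthonormal`, `multi_index_target`, and `subspace_correlation`, the dimensions, and the particular link function are all hypothetical choices made for this example and are not taken from the paper. It does not implement the two-timescale gradient flow itself.

```python
import numpy as np


def random_orthonormal(d, r, rng):
    """Return a d x r matrix with orthonormal columns (hypothetical helper)."""
    q, _ = np.linalg.qr(rng.standard_normal((d, r)))
    return q


def multi_index_target(x, w_star, link):
    """Evaluate link(W*^T x) for each row of x, where x has shape (n, d)."""
    return link(x @ w_star)


def subspace_correlation(w, w_star):
    """r x r matrix W^T W* comparing a candidate frame with the target frame."""
    return w.T @ w_star


rng = np.random.default_rng(0)
d, r, n = 50, 2, 1000  # ambient dimension, subspace rank, sample count (all assumed)

w_star = random_orthonormal(d, r, rng)  # unknown target projection
w = random_orthonormal(d, r, rng)       # a candidate projection
x = rng.standard_normal((n, d))         # high-dimensional Gaussian inputs


def link(z):
    """Hypothetical low-dimensional link function, chosen only for illustration."""
    return z[:, 0] * z[:, 1] + z[:, 0] ** 2 - 1.0


y = multi_index_target(x, w_star, link)
m = subspace_correlation(w, w_star)

print("targets shape:", y.shape)
print("singular values of W^T W*:", np.linalg.svd(m, compute_uv=False))
```

Because both frames have orthonormal columns, the singular values of `m` lie in [0, 1]; values near 1 indicate that the candidate subspace is aligned with the target subspace.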

Source journal: Communications on Pure and Applied Mathematics
CiteScore: 6.70
Self-citation rate: 3.30%
Articles published: 59
Review time: >12 weeks
About the journal: Communications on Pure and Applied Mathematics (ISSN 0010-3640) is published monthly, one volume per year, by John Wiley & Sons, Inc. © 2019. The journal primarily publishes papers originating at or solicited by the Courant Institute of Mathematical Sciences. It features recent developments in applied mathematics, mathematical physics, and mathematical analysis. The topics include partial differential equations, computer science, and applied mathematics. CPAM is devoted to mathematical contributions to the sciences; both theoretical and applied papers, of original or expository type, are included.