Random feedback alignment algorithms to train neural networks: why do they align?

IF 6.3 | Zone 2 (Physics and Astrophysics) | Q1 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)

Machine Learning Science and Technology | Pub Date: 2024-05-01 | DOI: 10.1088/2632-2153/ad3ee5

Dominique Chu and Florian Bacho


Citations: 0

Abstract


Random feedback alignment algorithms to train neural networks: why do they align?

Feedback alignment algorithms are an alternative to backpropagation to train neural networks, whereby some of the partial derivatives that are required to compute the gradient are replaced by random terms. This essentially transforms the update rule into a random walk in weight space. Surprisingly, learning still works with those algorithms, including training of deep neural networks. The performance of FA is generally attributed to an alignment of the update of the random walker with the true gradient—the eponymous gradient alignment—which drives an approximate gradient descent. The mechanism that leads to this alignment remains unclear, however. In this paper, we use mathematical reasoning and simulations to investigate gradient alignment. We observe that the feedback alignment update rule has fixed points, which correspond to extrema of the loss function. We show that gradient alignment is a stability criterion for those fixed points. It is only a necessary criterion for algorithm performance. Experimentally, we demonstrate that high levels of gradient alignment can lead to poor algorithm performance and that the alignment is not always driving the gradient descent.
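The idea described in the abstract, replacing part of the backward pass with random terms and then checking how well the resulting update aligns with the true gradient, can be illustrated with a small NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two-layer tanh network, the squared-error loss, the layer sizes, the learning rate, the linear "teacher" task, and the function names `hidden_updates`, `cosine` and `train_fa`. The only difference between the two hidden-layer updates is that backpropagation propagates the output error through `W2.T`, while feedback alignment uses a fixed random matrix `B`.

```python
import numpy as np

def hidden_updates(W1, W2, B, x, y):
    """Hidden-layer updates for a two-layer tanh network with loss
    0.5 * ||W2 @ tanh(W1 @ x) - y||^2 (an illustrative setup)."""
    h = np.tanh(W1 @ x)                      # hidden activations
    e = W2 @ h - y                           # output error, dL/d(output)
    dh = 1.0 - h ** 2                        # derivative of tanh
    grad_bp = np.outer((W2.T @ e) * dh, x)   # true gradient: error sent back through W2^T
    grad_fa = np.outer((B @ e) * dh, x)      # feedback alignment: W2^T replaced by random B
    return grad_fa, grad_bp, e, h

def cosine(a, b):
    """Cosine of the angle between two arrays, flattened to vectors."""
    denom = np.linalg.norm(a) * np.linalg.norm(b) + 1e-12
    return float(a.ravel() @ b.ravel() / denom)

def train_fa(steps=500, lr=0.01, seed=0):
    """Train with feedback alignment and record the gradient alignment per step."""
    rng = np.random.default_rng(seed)
    n_in, n_hid, n_out = 5, 20, 3            # hypothetical layer sizes
    W1 = rng.normal(0.0, 0.1, (n_hid, n_in))
    W2 = rng.normal(0.0, 0.1, (n_out, n_hid))
    B = rng.normal(0.0, 0.1, (n_hid, n_out))  # fixed random feedback, never trained
    T = rng.normal(0.0, 1.0, (n_out, n_in))   # hypothetical linear teacher
    alignments = np.empty(steps)
    for t in range(steps):
        x = rng.normal(size=n_in)
        y = T @ x
        grad_fa, grad_bp, e, h = hidden_updates(W1, W2, B, x, y)
        alignments[t] = cosine(grad_fa, grad_bp)
        W2 -= lr * np.outer(e, h)            # output layer: exact gradient
        W1 -= lr * grad_fa                   # hidden layer: feedback alignment update
    return alignments
```

Calling `train_fa()` returns one cosine value per training step, which can be plotted against the loss to see whether alignment and performance move together on this toy task. As the abstract cautions, a high alignment value on its own should not be read as evidence that the algorithm is performing well.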


Source journal

Machine Learning Science and Technology (Computer Science, Artificial Intelligence)

CiteScore: 9.10

Self-citation rate: 4.40%

Articles published: (no value given)

Review time: 5 weeks

About the journal: Machine Learning Science and Technology is a multidisciplinary open access journal that bridges the application of machine learning across the sciences with advances in machine learning methods and theory as motivated by physical insights. Specifically, articles must fall into one of two categories: they either advance the state of machine learning-driven applications in the sciences, or make conceptual, methodological or theoretical advances in machine learning that are applied to, inspired by, or motivated by scientific problems.