矩阵反向传播的统一框架。

IF 8.9 1区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

IEEE transactions on neural networks and learning systems Pub Date : 2025-09-16 DOI:10.1109/tnnls.2025.3607405

Gatien Darley,Stephane Bonnet

{"title":"矩阵反向传播的统一框架。","authors":"Gatien Darley,Stephane Bonnet","doi":"10.1109/tnnls.2025.3607405","DOIUrl":null,"url":null,"abstract":"Computing matrix gradient has become a key aspect in modern signal processing/machine learning, with the recent use of matrix neural networks requiring matrix backpropagation. In this field, two main methods exist to calculate the gradient of matrix functions for symmetric positive definite (SPD) matrices, namely, the Daleckiǐ-Kreǐn/Bhatia formula and the Ionescu method. However, there appear to be a few errors. This brief aims to demonstrate each of these formulas in a self-contained and unified framework, to prove theoretically their equivalence, and to clarify inaccurate results of the literature. A numerical comparison of both methods is also provided in terms of computational speed and numerical stability to show the superiority of the Daleckiǐ-Kreǐn/Bhatia approach. We also extend the matrix gradient to the general case of diagonalizable matrices. Convincing results with the two backpropagation methods are shown on the EEG-based BCI competition dataset with the implementation of an SPDNet, yielding around 80% accuracy for one subject. Daleckiǐ-Kreǐn/Bhatia formula achieves an 8% time gain during training and handles degenerate cases.","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"84 1","pages":""},"PeriodicalIF":8.9000,"publicationDate":"2025-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A Unified Framework for Matrix Backpropagation.\",\"authors\":\"Gatien Darley,Stephane Bonnet\",\"doi\":\"10.1109/tnnls.2025.3607405\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Computing matrix gradient has become a key aspect in modern signal processing/machine learning, with the recent use of matrix neural networks requiring matrix backpropagation. In this field, two main methods exist to calculate the gradient of matrix functions for symmetric positive definite (SPD) matrices, namely, the Daleckiǐ-Kreǐn/Bhatia formula and the Ionescu method. However, there appear to be a few errors. This brief aims to demonstrate each of these formulas in a self-contained and unified framework, to prove theoretically their equivalence, and to clarify inaccurate results of the literature. A numerical comparison of both methods is also provided in terms of computational speed and numerical stability to show the superiority of the Daleckiǐ-Kreǐn/Bhatia approach. We also extend the matrix gradient to the general case of diagonalizable matrices. Convincing results with the two backpropagation methods are shown on the EEG-based BCI competition dataset with the implementation of an SPDNet, yielding around 80% accuracy for one subject. Daleckiǐ-Kreǐn/Bhatia formula achieves an 8% time gain during training and handles degenerate cases.\",\"PeriodicalId\":13303,\"journal\":{\"name\":\"IEEE transactions on neural networks and learning systems\",\"volume\":\"84 1\",\"pages\":\"\"},\"PeriodicalIF\":8.9000,\"publicationDate\":\"2025-09-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE transactions on neural networks and learning systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1109/tnnls.2025.3607405\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on neural networks and learning systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1109/tnnls.2025.3607405","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

摘要

计算矩阵梯度已经成为现代信号处理/机器学习的一个关键方面，最近使用的矩阵神经网络需要矩阵反向传播。在该领域中，计算对称正定矩阵（SPD）的矩阵函数梯度主要有两种方法，即Daleckiǐ-Kreǐn/Bhatia公式和Ionescu方法。然而，似乎有一些错误。本文的目的是在一个独立和统一的框架中展示这些公式中的每一个，从理论上证明它们的等价性，并澄清文献中不准确的结果。两种方法在计算速度和数值稳定性方面也进行了数值比较，以显示Daleckiǐ-Kreǐn/Bhatia方法的优越性。我们也将矩阵梯度推广到可对角矩阵的一般情况。两种反向传播方法在基于脑电图的BCI竞争数据集上显示了令人信服的结果，并实现了SPDNet，对一个主题的准确率约为80%。Daleckiǐ-Kreǐn/Bhatia公式在训练和处理退化情况时获得8%的时间增益。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

A Unified Framework for Matrix Backpropagation.

Computing matrix gradient has become a key aspect in modern signal processing/machine learning, with the recent use of matrix neural networks requiring matrix backpropagation. In this field, two main methods exist to calculate the gradient of matrix functions for symmetric positive definite (SPD) matrices, namely, the Daleckiǐ-Kreǐn/Bhatia formula and the Ionescu method. However, there appear to be a few errors. This brief aims to demonstrate each of these formulas in a self-contained and unified framework, to prove theoretically their equivalence, and to clarify inaccurate results of the literature. A numerical comparison of both methods is also provided in terms of computational speed and numerical stability to show the superiority of the Daleckiǐ-Kreǐn/Bhatia approach. We also extend the matrix gradient to the general case of diagonalizable matrices. Convincing results with the two backpropagation methods are shown on the EEG-based BCI competition dataset with the implementation of an SPDNet, yielding around 80% accuracy for one subject. Daleckiǐ-Kreǐn/Bhatia formula achieves an 8% time gain during training and handles degenerate cases.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

IEEE transactions on neural networks and learning systems COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE-COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

CiteScore

23.80

自引率

9.60%

发文量

2102

审稿时长

3-8 weeks

期刊介绍： The focus of IEEE Transactions on Neural Networks and Learning Systems is to present scholarly articles discussing the theory, design, and applications of neural networks as well as other learning systems. The journal primarily highlights technical and scientific research in this domain.