On the limits of neural network explainability via descrambling

IF 2.6; Zone 2 (Mathematics); JCR Q1 (MATHEMATICS, APPLIED)
Shashank Sule, Richard G. Spencer, Wojciech Czaja
Journal: Applied and Computational Harmonic Analysis, vol. 79, Article 101793
Published: 2025-07-07
DOI: 10.1016/j.acha.2025.101793
URL: https://www.sciencedirect.com/science/article/pii/S1063520325000478
Citations: 0

Abstract

We characterize the exact solutions to neural network descrambling, a mathematical model for explaining the fully connected layers of trained neural networks (NNs). By reformulating the problem as the minimization of the Brockett function, which arises in graph matching and complexity theory, we show that the principal components of the hidden layer preactivations can be characterized as the optimal “explainers”, or descramblers, for the layer weights, leading to descrambled weight matrices. We show that in typical deep learning contexts these descramblers take diverse and interesting forms, including (1) matching the largest principal components with the lowest frequency modes of the Fourier basis for isotropic hidden data, (2) discovering the semantic development in two-layer linear NNs for signal recovery problems, and (3) explaining CNNs by optimally permuting the neurons. Our numerical experiments indicate that the eigendecompositions of the hidden layer data, now understood as the descramblers, can also reveal the layer's underlying transformation. These results illustrate that the SVD is more directly related to the explainability of NNs than previously thought and offers a promising avenue for discovering interpretable motifs for the hidden action of NNs, especially in contexts of operator learning or physics-informed NNs, where the input/output data has limited human readability.
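The link between the Brockett function and principal components can be illustrated numerically. The following is a minimal sketch, not the authors' implementation. It assumes the descrambling objective has the form tr(PᵀAPB) over orthogonal P, where B = ZZᵀ is the Gram matrix of the hidden preactivations Z = WX and A = DᵀD is a smoothness penalty built from a circulant first-difference operator D. The abstract does not specify A, D, or the data, so every name and size below is hypothetical. For symmetric A and B, a minimizer pairs the eigenvectors of B with the largest eigenvalues (the leading principal components) with the eigenvectors of A with the smallest eigenvalues (the lowest frequency modes of a circulant D).

```python
import numpy as np


def brockett(P, A, B):
    """Brockett function tr(P^T A P B)."""
    return float(np.trace(P.T @ A @ P @ B))


def brockett_minimizer(A, B):
    """Return an orthogonal P minimizing tr(P^T A P B) for symmetric A, B.

    With A = U diag(lam) U^T (lam ascending) and B = V diag(sig) V^T
    (sig descending), P = U V^T attains sum_i lam_i * sig_i, which is
    the minimum by the rearrangement inequality.
    """
    _, U = np.linalg.eigh(A)   # eigh returns eigenvalues in ascending order
    _, V = np.linalg.eigh(B)
    V = V[:, ::-1]             # reorder eigenvectors of B to descending order
    return U @ V.T


rng = np.random.default_rng(0)
n, m, d = 8, 5, 200                   # hypothetical layer width, input size, samples
W = rng.standard_normal((n, m))       # stand-in for trained layer weights
X = rng.standard_normal((m, d))       # stand-in for input data
Z = W @ X                             # hidden layer preactivations
B = Z @ Z.T                           # Gram matrix of the preactivations

# Assumed smoothness penalty: circulant first difference (D z)_i = z_i - z_{i+1}
D = np.eye(n) - np.roll(np.eye(n), 1, axis=1)
A = D.T @ D

P = brockett_minimizer(A, B)          # candidate descrambler
W_descrambled = P @ W                 # descrambled weight matrix

# Compare against a random orthogonal matrix
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
print("optimal:", brockett(P, A, B))
print("random :", brockett(Q, A, B))
```

Under these assumptions, `P` maps the k-th principal direction of `Z` onto the k-th smoothest eigenvector of `A`, which is the matching described in item (1) of the abstract. The minimizer is unique only up to sign flips and rotations within repeated eigenvalues.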
Source journal
Applied and Computational Harmonic Analysis (Physics; Physics, Mathematical)
CiteScore: 5.40
Self-citation rate: 4.00%
Articles published: 67
Review time: 22.9 weeks
About the journal: Applied and Computational Harmonic Analysis (ACHA) is an interdisciplinary journal that publishes high-quality papers in all areas of mathematical sciences related to the applied and computational aspects of harmonic analysis, with special emphasis on innovative theoretical development, methods, and algorithms for information processing, manipulation, understanding, and so forth. The objectives of the journal are to chronicle the important publications in the rapidly growing field of data representation and analysis, to stimulate research in relevant interdisciplinary areas, and to provide a common link among mathematical, physical, and life scientists, as well as engineers.