Information Bottleneck in Deep Learning - A Semiotic Approach

Bogdan Musat, Razvan Andonie
{"title":"Information Bottleneck in Deep Learning - A Semiotic Approach","authors":"Bogdan Musat, Razvan Andonie","doi":"10.15837/ijccc.2022.1.4650","DOIUrl":null,"url":null,"abstract":"\n \n \nThe information bottleneck principle was recently proposed as a theory meant to explain some of the training dynamics of deep neural architectures. Via information plane analysis, patterns start to emerge in this framework, where two phases can be distinguished: fitting and compression. We take a step further and study the behaviour of the spatial entropy characterizing the layers of convolutional neural networks (CNNs), in relation to the information bottleneck theory. We observe pattern formations which resemble the information bottleneck fitting and compression phases. From the perspective of semiotics, also known as the study of signs and sign-using behavior, the saliency maps of CNN’s layers exhibit aggregations: signs are aggregated into supersigns and this process is called semiotic superization. Superization can be characterized by a decrease of entropy and interpreted as information concentration. We discuss the information bottleneck principle from the perspective of semiotic superization and discover very interesting analogies related to the informational adaptation of the model. In a practical application, we introduce a modification of the CNN training process: we progressively freeze the layers with small entropy variation of their saliency map representation. Such layers can be stopped earlier from training without a significant impact on the performance (the accuracy) of the network, connecting the entropy evolution through time with the training dynamics of a network. \n \n \n","PeriodicalId":179619,"journal":{"name":"Int. J. Comput. Commun. Control","volume":"82 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Int. J. Comput. Commun. Control","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.15837/ijccc.2022.1.4650","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4

Abstract

The information bottleneck principle was recently proposed as a theory meant to explain some of the training dynamics of deep neural architectures. Via information plane analysis, patterns emerge in this framework, where two phases can be distinguished: fitting and compression. We take a step further and study the behaviour of the spatial entropy characterizing the layers of convolutional neural networks (CNNs) in relation to the information bottleneck theory. We observe pattern formations which resemble the information bottleneck fitting and compression phases. From the perspective of semiotics, the study of signs and sign-using behavior, the saliency maps of CNN layers exhibit aggregations: signs are aggregated into supersigns, a process called semiotic superization. Superization can be characterized by a decrease of entropy and interpreted as information concentration. We discuss the information bottleneck principle from the perspective of semiotic superization and discover interesting analogies related to the informational adaptation of the model. In a practical application, we introduce a modification of the CNN training process: we progressively freeze the layers whose saliency map representation shows small entropy variation. Such layers can be stopped from training earlier without a significant impact on the performance (accuracy) of the network, connecting the entropy evolution over time with the training dynamics of the network.
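
The abstract does not give implementation details, but the two ingredients it describes can be sketched in a few lines of PyTorch. The sketch below is not the authors' code: the saliency definition (channel-averaged absolute activations), the histogram-based entropy estimator, and the freezing tolerance `tol` are illustrative assumptions used only to make the idea concrete.

import torch
import torch.nn as nn

def saliency_map(features: torch.Tensor) -> torch.Tensor:
    """Collapse an (N, C, H, W) activation tensor into an (N, H, W) saliency map
    by averaging absolute activations over channels (an assumed definition)."""
    return features.abs().mean(dim=1)

def spatial_entropy(saliency: torch.Tensor, bins: int = 64) -> float:
    """Shannon entropy (in bits) of the histogram of saliency values,
    used here as a stand-in for the layer's spatial entropy."""
    hist = torch.histc(saliency.float(), bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]                      # drop empty bins to avoid log(0)
    return float(-(p * p.log2()).sum())

class ProgressiveFreezer:
    """Freeze a layer once the change in its saliency-map entropy between
    consecutive epochs stays below `tol` (a hypothetical stopping criterion)."""
    def __init__(self, tol: float = 1e-2):
        self.tol = tol
        self.prev = {}                # layer name -> entropy at previous epoch

    def step(self, named_layers, entropies):
        """`named_layers` yields (name, module) pairs; `entropies` maps each
        layer name to its current spatial entropy."""
        for name, layer in named_layers:
            h = entropies[name]
            if name in self.prev and abs(h - self.prev[name]) < self.tol:
                for param in layer.parameters():
                    param.requires_grad = False   # stop training this layer early
            self.prev[name] = h

In a training loop, one would compute the entropies from forward-hook activations once per epoch and call ProgressiveFreezer.step, so that layers whose saliency-map entropy has stabilized no longer receive gradient updates.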