Understanding encoder–decoder structures in machine learning using information measures

IF 3.4 · CAS Zone 2 (Engineering & Technology) · JCR Q2, ENGINEERING, ELECTRICAL & ELECTRONIC
Jorge F. Silva, Victor Faraggi, Camilo Ramirez, Alvaro Egaña, Eduardo Pavez
{"title":"Understanding encoder–decoder structures in machine learning using information measures","authors":"Jorge F. Silva ,&nbsp;Victor Faraggi ,&nbsp;Camilo Ramirez ,&nbsp;Alvaro Egaña ,&nbsp;Eduardo Pavez","doi":"10.1016/j.sigpro.2025.109983","DOIUrl":null,"url":null,"abstract":"<div><div>We present a theory of representation learning to model and understand the role of encoder–decoder design in machine learning (ML) from an information-theoretic angle. We use two main information concepts, information sufficiency (IS) and mutual information loss to represent predictive structures in machine learning. Our first main result provides a functional expression that characterizes the class of probabilistic models consistent with an IS encoder–decoder latent predictive structure. This result formally justifies the encoder–decoder forward stages many modern ML architectures adopt to learn latent (compressed) representations for classification. To illustrate IS as a realistic and relevant model assumption, we revisit some known ML concepts and present some interesting new examples: invariant, robust, sparse, and digital models. Furthermore, our IS characterization allows us to tackle the fundamental question of how much performance could be lost, using the cross entropy risk, when a given encoder–decoder architecture is adopted in a learning setting. Here, our second main result shows that a mutual information loss quantifies the lack of expressiveness attributed to the choice of a (biased) encoder–decoder ML design. Finally, we address the problem of universal cross-entropy learning with an encoder–decoder design where necessary and sufficiency conditions are established to meet this requirement. In all these results, Shannon’s information measures offer new interpretations and explanations for representation learning.</div></div>","PeriodicalId":49523,"journal":{"name":"Signal Processing","volume":"234 ","pages":"Article 109983"},"PeriodicalIF":3.4000,"publicationDate":"2025-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Signal Processing","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0165168425000970","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Citations: 0

Abstract

We present a theory of representation learning to model and understand the role of encoder–decoder design in machine learning (ML) from an information-theoretic angle. We use two main information concepts, information sufficiency (IS) and mutual information loss, to represent predictive structures in machine learning. Our first main result provides a functional expression that characterizes the class of probabilistic models consistent with an IS encoder–decoder latent predictive structure. This result formally justifies the encoder–decoder forward stages many modern ML architectures adopt to learn latent (compressed) representations for classification. To illustrate IS as a realistic and relevant model assumption, we revisit some known ML concepts and present some interesting new examples: invariant, robust, sparse, and digital models. Furthermore, our IS characterization allows us to tackle the fundamental question of how much performance could be lost, measured by the cross-entropy risk, when a given encoder–decoder architecture is adopted in a learning setting. Here, our second main result shows that a mutual information loss quantifies the lack of expressiveness attributed to the choice of a (biased) encoder–decoder ML design. Finally, we address the problem of universal cross-entropy learning with an encoder–decoder design, where necessary and sufficient conditions are established to meet this requirement. In all these results, Shannon’s information measures offer new interpretations and explanations for representation learning.
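To make the abstract's two central quantities concrete, below is a minimal numerical sketch (not from the paper itself): for a discrete joint distribution p(x, y) and a deterministic encoder u = η(x), it computes the mutual information loss I(X;Y) − I(U;Y). Under the standard definition the abstract invokes, the encoder is information sufficient exactly when this loss is zero, and the paper's second main result ties a nonzero loss to the cross-entropy performance lost by committing to that encoder. The joint pmf and the encoder η here are illustrative assumptions, not taken from the article.

```python
import numpy as np

def mutual_information(pxy):
    """I(X;Y) in nats for a joint pmf given as a 2-D array p[x, y]."""
    px = pxy.sum(axis=1, keepdims=True)   # marginal p(x), shape (|X|, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal p(y), shape (1, |Y|)
    mask = pxy > 0                        # skip zero-probability cells
    return float((pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])).sum())

def encoder_joint(pxy, eta):
    """Push p(x, y) through a deterministic encoder u = eta[x] to get p(u, y)."""
    puy = np.zeros((max(eta) + 1, pxy.shape[1]))
    for x, u in enumerate(eta):
        puy[u] += pxy[x]                  # merge the rows mapped to the same u
    return puy

# Illustrative joint pmf over X in {0,1,2,3} and Y in {0,1} (entries sum to 1).
pxy = np.array([[0.20, 0.05],
                [0.15, 0.10],
                [0.05, 0.20],
                [0.10, 0.15]])

# Hypothetical encoder merging x in {0,1} -> u=0 and x in {2,3} -> u=1.
eta = [0, 0, 1, 1]

ixy = mutual_information(pxy)                      # I(X;Y)
iuy = mutual_information(encoder_joint(pxy, eta))  # I(U;Y) <= I(X;Y)
print(f"I(X;Y)  = {ixy:.4f} nats")
print(f"I(U;Y)  = {iuy:.4f} nats")
print(f"MI loss = {ixy - iuy:.4f} nats  (zero iff U is information sufficient)")
```

By the data-processing inequality the loss is always nonnegative, so the printed gap directly reports how much predictive information about Y this particular encoder discards; a lossless (IS) encoder for this pmf would have to keep the four values of X in distinct cells.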
Source journal

Signal Processing (Engineering & Technology — Engineering: Electrical & Electronic)
CiteScore: 9.20
Self-citation rate: 9.10%
Annual articles: 309
Review time: 41 days
Journal introduction: Signal Processing incorporates all aspects of the theory and practice of signal processing. It features original research work, tutorial and review articles, and accounts of practical developments. It is intended for a rapid dissemination of knowledge and experience to engineers and scientists working in the research, development or practical application of signal processing. Subject areas covered by the journal include: Signal Theory; Stochastic Processes; Detection and Estimation; Spectral Analysis; Filtering; Signal Processing Systems; Software Developments; Image Processing; Pattern Recognition; Optical Signal Processing; Digital Signal Processing; Multi-dimensional Signal Processing; Communication Signal Processing; Biomedical Signal Processing; Geophysical and Astrophysical Signal Processing; Earth Resources Signal Processing; Acoustic and Vibration Signal Processing; Data Processing; Remote Sensing; Signal Processing Technology; Radar Signal Processing; Sonar Signal Processing; Industrial Applications; New Applications.