Comparing the Modeling Powers of RNN and HMM

Achille Salaün, Y. Petetin, F. Desbouvries
{"title":"Comparing the Modeling Powers of RNN and HMM","authors":"Achille Salaün, Y. Petetin, F. Desbouvries","doi":"10.1109/ICMLA.2019.00246","DOIUrl":null,"url":null,"abstract":"Recurrent Neural Networks (RNN) and Hidden Markov Models (HMM) are popular models for processing sequential data and have found many applications such as speech recognition, time series prediction or machine translation. Although both models have been extended in several ways (eg. Long Short Term Memory and Gated Recurrent Unit architectures, Variational RNN, partially observed Markov models...), their theoretical understanding remains partially open. In this context, our approach consists in classifying both models from an information geometry point of view. More precisely, both models can be used for modeling the distribution of a sequence of random observations from a set of latent variables; however, in RNN, the latent variable is deterministically deduced from the current observation and the previous latent variable, while, in HMM, the set of (random) latent variables is a Markov chain. In this paper, we first embed these two generative models into a generative unified model (GUM). We next consider the subclass of GUM models which yield a stationary Gaussian observations probability distribution function (pdf). Such pdf are characterized by their covariance sequence; we show that the GUM model can produce any stationary Gaussian distribution with geometrical covariance structure. We finally discuss about the modeling power of the HMM and RNN submodels, via their associated observations pdf: some observations pdf can be modeled by a RNN, but not by an HMM, and vice versa; some can be produced by both structures, up to a re-parameterization.","PeriodicalId":436714,"journal":{"name":"2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA)","volume":"53 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"11","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICMLA.2019.00246","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 11

Abstract

Recurrent Neural Networks (RNN) and Hidden Markov Models (HMM) are popular models for processing sequential data and have found many applications such as speech recognition, time series prediction, and machine translation. Although both models have been extended in several ways (e.g., Long Short-Term Memory and Gated Recurrent Unit architectures, Variational RNN, partially observed Markov models, ...), their theoretical understanding remains partially open. In this context, our approach consists in classifying both models from an information geometry point of view. More precisely, both models can be used for modeling the distribution of a sequence of random observations from a set of latent variables; however, in an RNN, the latent variable is deterministically deduced from the current observation and the previous latent variable, while, in an HMM, the set of (random) latent variables is a Markov chain. In this paper, we first embed these two generative models into a generative unified model (GUM). We next consider the subclass of GUM models which yield a stationary Gaussian observation probability distribution function (pdf). Such pdfs are characterized by their covariance sequence; we show that the GUM model can produce any stationary Gaussian distribution with a geometrical covariance structure. We finally discuss the modeling power of the HMM and RNN submodels via their associated observation pdfs: some observation pdfs can be modeled by an RNN but not by an HMM, and vice versa; some can be produced by both structures, up to a re-parameterization.
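To make the structural contrast in the abstract concrete, here is a minimal simulation sketch in Python/NumPy. It generates observations from scalar linear-Gaussian instances of the two mechanisms: an HMM whose latent variable is a random Markov chain, and a generative RNN whose latent variable is updated deterministically from the previous latent and the last generated observation (the abstract's phrasing, up to an indexing shift). All parameter names and values (a, c, q, r, alpha, beta, gamma, s) are illustrative assumptions of ours and are not taken from the paper, whose GUM construction is more general than this toy example.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 5000

# --- HMM-style generation: the latent x_t is itself a (random) Markov chain ---
a, c = 0.9, 1.0            # latent transition and emission coefficients (illustrative)
q, r = 0.5, 0.2            # latent and observation noise variances (illustrative)
x = np.zeros(T)
y_hmm = np.zeros(T)
x[0] = rng.normal(0.0, np.sqrt(q / (1.0 - a**2)))        # stationary initialization
y_hmm[0] = c * x[0] + rng.normal(0.0, np.sqrt(r))
for t in range(1, T):
    x[t] = a * x[t - 1] + rng.normal(0.0, np.sqrt(q))     # random latent transition
    y_hmm[t] = c * x[t] + rng.normal(0.0, np.sqrt(r))     # observation given latent

# --- RNN-style generation: the latent h_t is a deterministic function of the
# --- previous latent and the last generated observation (no latent noise).
alpha, beta, gamma, s = 0.7, 0.2, 1.0, 0.3
h = np.zeros(T)
y_rnn = np.zeros(T)
y_rnn[0] = rng.normal(0.0, np.sqrt(s))
for t in range(1, T):
    h[t] = alpha * h[t - 1] + beta * y_rnn[t - 1]           # deterministic latent update
    y_rnn[t] = gamma * h[t] + rng.normal(0.0, np.sqrt(s))   # only the observation is random

# Empirical autocovariance of the observations at a few lags: in this scalar
# linear-Gaussian setting both sequences decay geometrically with the lag.
def autocov(y, k):
    y = y - y.mean()
    return float(np.mean(y * y)) if k == 0 else float(np.mean(y[:-k] * y[k:]))

print("HMM :", [round(autocov(y_hmm, k), 3) for k in range(5)])
print("RNN :", [round(autocov(y_rnn, k), 3) for k in range(5)])
```

In this toy setting both observation processes have covariances that decay geometrically with the lag, roughly of the form Cov(y_t, y_{t+k}) ≈ const · rho^k; this is the "geometrical covariance structure" the abstract refers to, and the abstract's comparison of the HMM and RNN submodels is precisely about which such covariance sequences each of them can reach.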