On the curved exponential family in the Stochastic Approximation Expectation Maximization Algorithm

IF 0.6, Zone 4 (Mathematics), JCR Q4: STATISTICS & PROBABILITY
Vianney Debavelaere, S. Allassonnière
Journal: Esaim-Probability and Statistics. Published: 2021-01-01. DOI: 10.1051/ps/2021015 (https://doi.org/10.1051/ps/2021015)
Citations: 7

Abstract

On the curved exponential family in the Stochastic Approximation Expectation Maximization Algorithm
The Expectation-Maximization (EM) algorithm is a widely used method for computing maximum likelihood estimates in models involving latent variables. When the Expectation step cannot be computed easily, one can use stochastic versions of EM such as the Stochastic Approximation EM (SAEM). This algorithm, however, has the drawback of requiring the joint likelihood to belong to the curved exponential family. To overcome this problem, [16] introduced a rewriting of the model which “exponentializes” it by treating the parameter as an additional latent variable following a Normal distribution centered on the newly defined parameters and with fixed variance. The likelihood of this new exponentialized model belongs to the curved exponential family. Although this approach is often used, there is no guarantee that the estimated mean is close to the maximum likelihood estimate of the initial model. In this paper, we quantify the error made in this estimation when considering the exponentialized model instead of the initial one. By verifying those results on an example, we see that a trade-off must be made between the speed of convergence and the tolerated error. Finally, we propose a new algorithm that reduces the bias, giving a better estimate of the parameter in a reasonable computation time.
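To make the exponentialization idea concrete, here is a minimal sketch on a toy model; it is not the authors' code or their new algorithm. It assumes observations y_i ~ N(theta, 1) and, following the rewriting described in the abstract, treats theta as a latent variable with theta ~ N(theta_bar, sigma^2), where sigma is fixed and theta_bar is the new parameter. The function name `saem_exponentialized`, the step size k^(-0.6), and all default values are illustrative assumptions. This toy model is already Gaussian, so the trick is not actually needed and the two estimates coincide at the sample mean; the sketch only shows the mechanics of the Stochastic Approximation EM and how a small sigma slows convergence.

```python
import numpy as np


def saem_exponentialized(y, sigma, n_iter=2000, theta_bar0=0.0, seed=0):
    """SAEM on a toy exponentialized model (illustrative assumption):
    theta ~ N(theta_bar, sigma^2) is latent, y_i | theta ~ N(theta, 1).
    Returns the final estimate of theta_bar."""
    rng = np.random.default_rng(seed)
    n = y.size
    y_mean = y.mean()
    # The posterior of theta given y and theta_bar is Gaussian with this precision.
    post_prec = n + 1.0 / sigma**2
    post_sd = post_prec ** -0.5
    s = theta_bar0          # stochastic approximation of the sufficient statistic
    theta_bar = theta_bar0
    for k in range(1, n_iter + 1):
        # Simulation step: draw the latent theta from its exact posterior.
        post_mean = (n * y_mean + theta_bar / sigma**2) / post_prec
        theta = rng.normal(post_mean, post_sd)
        # Stochastic approximation step with decreasing step size (assumed k^-0.6).
        gamma = k ** -0.6
        s += gamma * (theta - s)
        # Maximization step: for this Gaussian model the maximizer is s itself.
        theta_bar = s
    return theta_bar


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    y = rng.normal(3.0, 1.0, size=50)
    for sigma in (1.0, 0.01):
        est = saem_exponentialized(y, sigma)
        print(f"sigma={sigma}: theta_bar={est:.3f}, sample mean={y.mean():.3f}")
```

With the same number of iterations, the run with the smaller sigma stays much closer to its starting value, because the posterior of theta is then dominated by the prior centered on the current theta_bar. This illustrates only the convergence-speed side of the trade-off mentioned in the abstract, in a setting where no estimation error arises.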
Source journal: Esaim-Probability and Statistics (STATISTICS & PROBABILITY)
CiteScore: 1.00
Self-citation rate: 0.00%
Articles published: 14
Review time: >12 weeks
About the journal: The journal publishes original research and survey papers in the area of Probability and Statistics. It covers theoretical and practical aspects, in any field of these domains. Of particular interest are methodological developments with application in other scientific areas, for example Biology and Genetics, Information Theory, Finance, Bioinformatics, Random structures and Random graphs, Econometrics, Physics. Long papers are very welcome. Indeed, we intend to develop the journal in the direction of applications and to open it to various fields where random mathematical modelling is important. In particular we will call for (survey) papers in these areas, in order to make the random community aware of important problems of both theoretical and practical interest. We all know that many recent fascinating developments in Probability and Statistics are coming from "the outside" and we think that ESAIM: P&S should be a good entry point for such exchanges. Of course this does not mean that the journal will be only devoted to practical aspects.