Statistical and computational guarantees for the Baum-Welch algorithm
Fanny Yang, Sivaraman Balakrishnan, M. Wainwright
2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2015-09-01
DOI: 10.1109/ALLERTON.2015.7447067 (https://doi.org/10.1109/ALLERTON.2015.7447067)
Citations: 34
Abstract
The Hidden Markov Model (HMM) is one of the mainstays of statistical modeling of discrete time series and is widely used in applications. Estimating an HMM from its observation process is often addressed via the Baum-Welch algorithm, which performs well empirically when initialized reasonably close to the truth. Existing theory, which predicts susceptibility to bad local optima, could not explain this behavior. In this paper we aim to close this gap: we provide a framework that characterizes a sufficient basin of attraction around any global optimum, within which Baum-Welch is guaranteed to converge linearly to an “optimally” small ball around that optimum. We then use the framework to determine the linear rate of convergence and a sufficient initialization region for Baum-Welch applied to a two-component isotropic hidden Markov mixture of Gaussians.
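
To make the setting concrete, the following is a minimal sketch of Baum-Welch (EM with a scaled forward-backward E-step) for a two-state HMM with Gaussian emissions. The abstract does not spell out the parameterization, so everything here is an assumption made for illustration: hidden states z_t in {-1, +1}, a symmetric transition matrix with stay-probability `p_stay`, emissions x_t | z_t ~ N(z_t * mu, sigma^2) in one dimension, a known `sigma`, and a uniform initial state distribution. The function names `simulate`, `baum_welch_step`, and `baum_welch` are hypothetical and are not taken from the paper.

```python
import math
import random


def simulate(n, mu, p_stay, sigma=1.0, seed=0):
    """Sample x_1..x_n with hidden z_t in {-1, +1} and x_t | z_t ~ N(z_t * mu, sigma^2)."""
    rng = random.Random(seed)
    z = rng.choice((-1, 1))
    xs = []
    for _ in range(n):
        xs.append(rng.gauss(z * mu, sigma))
        if rng.random() > p_stay:  # flip the state with probability 1 - p_stay
            z = -z
    return xs


def baum_welch_step(xs, mu, p_stay, sigma=1.0):
    """One EM iteration; returns the updated (mu, p_stay). sigma is held fixed (assumed known)."""
    n = len(xs)
    # Assumed symmetric transition matrix; index 0 is z = -1, index 1 is z = +1.
    A = [[p_stay, 1.0 - p_stay], [1.0 - p_stay, p_stay]]

    # Emission likelihoods, rescaled per time step for numerical stability.
    b = []
    for x in xs:
        logs = [-(x + mu) ** 2 / (2 * sigma ** 2), -(x - mu) ** 2 / (2 * sigma ** 2)]
        m = max(logs)
        b.append([math.exp(v - m) for v in logs])

    # Scaled forward pass with a uniform initial state distribution.
    alpha, c = [], []
    prev = [0.5, 0.5]
    for t in range(n):
        if t == 0:
            a = [prev[k] * b[0][k] for k in range(2)]
        else:
            a = [b[t][k] * (prev[0] * A[0][k] + prev[1] * A[1][k]) for k in range(2)]
        s = a[0] + a[1]
        prev = [a[0] / s, a[1] / s]
        alpha.append(prev)
        c.append(s)

    # Scaled backward pass, using the same scaling constants c[t].
    beta = [[1.0, 1.0] for _ in range(n)]
    for t in range(n - 2, -1, -1):
        for j in range(2):
            beta[t][j] = sum(
                A[j][k] * b[t + 1][k] * beta[t + 1][k] for k in range(2)
            ) / c[t + 1]

    # Posterior P(z_t = +1 | x) and the expected number of "stay" transitions.
    gamma_plus = [alpha[t][1] * beta[t][1] for t in range(n)]
    stay = sum(
        alpha[t][k] * A[k][k] * b[t + 1][k] * beta[t + 1][k] / c[t + 1]
        for t in range(n - 1)
        for k in range(2)
    )

    # M-step: closed-form updates under the symmetric model assumed above.
    mu_new = sum((2.0 * g - 1.0) * x for g, x in zip(gamma_plus, xs)) / n
    p_new = stay / (n - 1)
    return mu_new, p_new


def baum_welch(xs, mu0, p0, sigma=1.0, iters=30):
    """Iterate the EM update from the initialization (mu0, p0)."""
    mu, p = mu0, p0
    for _ in range(iters):
        mu, p = baum_welch_step(xs, mu, p, sigma)
    return mu, p
```

A possible usage, with illustrative values only: `xs = simulate(2000, mu=2.0, p_stay=0.8)` followed by `mu_hat, p_hat = baum_welch(xs, mu0=1.0, p0=0.6)`. Starting from an initialization on the same side as the true mean, the iterates would be expected to settle near the generating parameters up to sampling error, which is the kind of behavior the abstract's basin-of-attraction result is meant to explain; the sketch itself proves nothing about rates or initialization regions.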