Minimum message length hidden Markov modelling
T. Edgoose, L. Allison
Proceedings DCC '98 Data Compression Conference (Cat. No.98TB100225), 1998-03-30
DOI: 10.1109/DCC.1998.672145 (https://doi.org/10.1109/DCC.1998.672145)
Citations: 2
Abstract
This paper describes a minimum message length (MML) approach to finding the most appropriate hidden Markov model (HMM) to describe a given sequence of observations. An MML estimate for the expected length of a two-part message, stating a specific HMM and then the observations given this model, is presented along with an effective search strategy for finding the best number of states for the model. The information estimate enables two models with different numbers of states to be fairly compared, which is necessary if the search of this complex model space is to avoid the worst locally optimal solutions. The general purpose MML classifier 'Snob' has been extended and the new program 'tSnob' is tested on 'synthetic' data and a large 'real world' dataset. The MML measure is found to be an improvement on the Bayesian information criterion (BIC) and the un-supervised search strategy.
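To make the idea of a two-part message concrete, the following is a minimal sketch of the BIC baseline mentioned in the abstract, not of the paper's MML estimate, whose form is not given here. It treats BIC as a crude two-part length in nits: a model part of (k/2) ln N for k free parameters and N observations, plus a data part equal to the negative log-likelihood from the forward algorithm. The function names (`log_likelihood`, `free_parameters`, `bic_message_length`) and the restriction to a discrete-output HMM are assumptions made for illustration.

```python
import math


def log_likelihood(obs, pi, A, B):
    """ln P(obs | HMM) via the scaled forward algorithm (discrete outputs)."""
    n_states = len(pi)
    # Initial step: start probability times emission of the first symbol.
    alpha = [pi[i] * B[i][obs[0]] for i in range(n_states)]
    total = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transition matrix, then emit obs[t].
            alpha = [
                sum(alpha[i] * A[i][j] for i in range(n_states)) * B[j][obs[t]]
                for j in range(n_states)
            ]
        # Rescale to avoid underflow; the log scale factors sum to ln P(obs).
        scale = sum(alpha)
        total += math.log(scale)
        alpha = [a / scale for a in alpha]
    return total


def free_parameters(n_states, n_symbols):
    """Free parameters: start (K-1), transitions K(K-1), emissions K(M-1)."""
    return (n_states - 1) + n_states * (n_states - 1) + n_states * (n_symbols - 1)


def bic_message_length(obs, pi, A, B):
    """BIC read as a two-part length in nits (an illustrative assumption)."""
    k = free_parameters(len(pi), len(B[0]))
    model_part = 0.5 * k * math.log(len(obs))      # cost of stating the model
    data_part = -log_likelihood(obs, pi, A, B)     # cost of data given model
    return model_part + data_part


# Hypothetical toy data: four binary observations.
obs = [0, 1, 0, 1]
one_state = bic_message_length(obs, [1.0], [[1.0]], [[0.5, 0.5]])
two_state = bic_message_length(
    obs,
    [0.5, 0.5],
    [[0.5, 0.5], [0.5, 0.5]],
    [[0.5, 0.5], [0.5, 0.5]],
)
print(one_state, two_state)
```

In this toy case both models assign the same probability to the data, so the two-state model is penalised only for its extra parameters and receives the longer total. Comparing totals in this way is what allows models with different numbers of states to be ranked on one scale.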