{"title":"Discriminative training based on an integrated view of MPE and MMI in margin and error space","authors":"E. McDermott, Shinji Watanabe, Atsushi Nakamura","doi":"10.1109/ICASSP.2010.5495106","DOIUrl":null,"url":null,"abstract":"Recent work has demonstrated that the Maximum Mutual Information (MMI) objective function is mathematically equivalent to a simple integral of recognition error, if the latter is expressed as a margin-based Minimum Phone Error (MPE) style error-weighted objective function. This led to the proposal of a general approach to discriminative training based on integrals of MPE-style loss, calculated using “differenced MMI” (dMMI), a finite difference of MMI functionals evaluated at the edges of a margin interval. This article aims to clarify the essence and practical consequences of the new framework. The recently proposed Error-Indexed Forward-Backward Algorithm is used to visualize the close agreement between dMMI and MPE statistics for narrow margin intervals, and to illustrate the flexible control of the weight that can be given to different error levels using broader intervals. New speech recognition results are presented for the MIT OpenCourseWare/MIT-World corpus, showing small performance gains for dMMI compared to MPE for some choices of margin interval. Evaluation with an expanded 44K word trigram language model confirms that dMMI with a narrow margin interval yields the same performance as MPE.","PeriodicalId":293333,"journal":{"name":"2010 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"37 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"44","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2010 IEEE International Conference on Acoustics, Speech and Signal Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICASSP.2010.5495106","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 44
Abstract
Recent work has demonstrated that the Maximum Mutual Information (MMI) objective function is mathematically equivalent to a simple integral of recognition error, if the latter is expressed as a margin-based Minimum Phone Error (MPE) style error-weighted objective function. This led to the proposal of a general approach to discriminative training based on integrals of MPE-style loss, calculated using “differenced MMI” (dMMI), a finite difference of MMI functionals evaluated at the edges of a margin interval. This article aims to clarify the essence and practical consequences of the new framework. The recently proposed Error-Indexed Forward-Backward Algorithm is used to visualize the close agreement between dMMI and MPE statistics for narrow margin intervals, and to illustrate the flexible control of the weight that can be given to different error levels using broader intervals. New speech recognition results are presented for the MIT OpenCourseWare/MIT-World corpus, showing small performance gains for dMMI compared to MPE for some choices of margin interval. Evaluation with an expanded 44K word trigram language model confirms that dMMI with a narrow margin interval yields the same performance as MPE.