Bayesian adaptation in HMM training and decoding using a mixture of feature transforms

2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU). Pub Date: 2007-12-01. DOI: 10.1109/ASRU.2007.4430133

S. Tsakalidis, S. Matsoukas


Abstract

Adaptive training under a Bayesian framework addresses some limitations of the standard maximum likelihood approaches, and the adaptively trained system can be used directly in unsupervised inference. The Bayesian framework uses a distribution over the transform rather than a point estimate. A continuous transform distribution makes the integral associated with the Bayesian framework intractable, so various approximations have been proposed. In this paper we model the transform distribution as a mixture of transforms. Under this model, the likelihood of an utterance is computed as a weighted sum of the likelihoods obtained by transforming its features with each transform in the mixture, with the weights set to the transform priors. Experimental results on Arabic broadcast news show increased likelihood on the acoustic training data and improved speech recognition performance on unseen test data, compared to speaker-independent and standard adaptive models.
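The weighted-sum likelihood described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: a single diagonal-covariance Gaussian stands in for the HMM, each transform is a diagonal affine map (per-dimension scale and bias), and the function names are hypothetical. The sketch only shows the structure of the computation: score the utterance under each transform, then combine the scores with the transform priors as weights, working in the log domain for numerical stability.

```python
import math

def gaussian_loglik(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at one feature vector."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def transformed_utterance_loglik(frames, scale, bias, mean, var):
    """Log-likelihood of an utterance under one diagonal affine transform.

    Assumption: each frame x maps to scale * x + bias, element-wise.
    The log|scale| term is the Jacobian of that map, added once per frame
    so that scores from different transforms remain comparable.
    """
    log_jacobian = sum(math.log(abs(a)) for a in scale)
    total = 0.0
    for x in frames:
        y = [a * xi + b for a, xi, b in zip(scale, x, bias)]
        total += gaussian_loglik(y, mean, var) + log_jacobian
    return total

def mixture_utterance_loglik(frames, transforms, priors, mean, var):
    """log sum_k prior_k * p(utterance | transform_k), via log-sum-exp.

    Priors are assumed to be strictly positive and to sum to one.
    """
    terms = [
        math.log(w) + transformed_utterance_loglik(frames, a, b, mean, var)
        for w, (a, b) in zip(priors, transforms)
    ]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

# Toy data: three 2-dimensional frames and two candidate transforms.
frames = [[0.9, -0.2], [1.1, 0.1], [1.0, 0.0]]
transforms = [([1.0, 1.0], [0.0, 0.0]), ([1.0, 1.0], [-1.0, 0.0])]
priors = [0.5, 0.5]
mean, var = [0.0, 0.0], [1.0, 1.0]

utterance_score = mixture_utterance_loglik(frames, transforms, priors, mean, var)
```

Because the weights sum to one, the mixture score can never exceed the best single-transform score. In this toy setup it is dominated by the second transform, which shifts the frames toward the model mean.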
