Speaker adaptation using Maximum Likelihood General Regression

2012 11th International Conference on Information Science, Signal Processing and their Applications (ISSPA). Pub Date: 2012-07-02. DOI: 10.1109/ISSPA.2012.6310564

M. H. Bahari, H. V. hamme

Citations: 2

Abstract

This paper introduces Maximum Likelihood General Regression (MLGR), a new method for speaker adaptation. The Gaussian means of a speaker-independent (SI) model are adapted to the data of a new speaker by assuming a non-linear mapping from the SI Gaussian means to the adapted Gaussian means. MLGR performs a non-linear regression between the maximum likelihood (ML) estimates of the means and the SI means using a General Regression Neural Network. The proposed method is evaluated on the Wall Street Journal database. The results show that it outperforms several conventional approaches when the adaptation utterances are short. We also prove mathematically that the Gaussian means of the MLGR-adapted model converge to their ML estimates when the adaptation utterances are long.
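The regression step described in the abstract can be illustrated with a minimal sketch of a General Regression Neural Network, which in its standard form is a Gaussian-kernel weighted average of training targets. This is not the authors' implementation: the abstract does not give the MLGR equations, so the function name `grnn_predict`, the smoothing parameter `sigma`, the optional per-Gaussian `weights` (for example, occupation counts), and the toy data are all assumptions made for illustration. SI means serve as inputs and ML mean estimates as targets.

```python
import numpy as np

def grnn_predict(train_x, train_y, query_x, sigma=1.0, weights=None):
    """Standard GRNN (kernel-weighted average) prediction.

    train_x: (n, d) inputs, here SI Gaussian means that received adaptation data.
    train_y: (n, d) targets, here ML estimates of those means.
    query_x: (m, d) points to predict at, here all SI means.
    weights: optional positive per-sample weights (hypothetical, e.g. counts).
    """
    train_x = np.asarray(train_x, dtype=float)
    train_y = np.asarray(train_y, dtype=float)
    query_x = np.asarray(query_x, dtype=float)
    if weights is None:
        weights = np.ones(len(train_x))
    weights = np.asarray(weights, dtype=float)

    # Squared Euclidean distance between every query and training point: (m, n).
    diff = query_x[:, None, :] - train_x[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)

    # Gaussian kernel in log space; subtracting the row maximum avoids
    # underflow and cancels out in the normalised average below.
    logits = -sq_dist / (2.0 * sigma ** 2)
    logits -= logits.max(axis=1, keepdims=True)
    kernel = np.exp(logits) * weights[None, :]

    # Normalised weighted average of the targets.
    return kernel @ train_y / kernel.sum(axis=1, keepdims=True)

# Toy data (illustrative only): three SI means, two of which were observed
# in the adaptation data and therefore have ML estimates.
si_means = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]])
ml_means = np.array([[0.5, 0.0], [1.5, 0.0]])

# Predict adapted means for all three Gaussians, including the unobserved one.
adapted = grnn_predict(si_means[:2], ml_means, si_means, sigma=1.0)

# With a very small sigma, the prediction at an observed SI mean is
# dominated by that Gaussian's own ML estimate.
sharp = grnn_predict(si_means[:2], ml_means, si_means[:2], sigma=0.01)
```

In this sketch the unobserved third Gaussian receives a mean interpolated from the observed ones, which is the practical appeal of a regression-based scheme when adaptation data is short. How the paper chooses the smoothing parameter and incorporates the likelihood criterion is not stated in the abstract.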

