Combination of fMLLR with clustering and fMLLR with MLLR clustering for rapid speaker adaptation

2010 2nd International Conference on Electronic Computer Technology. Pub Date: 2010-05-07. DOI: 10.1109/ICECTECH.2010.5479971

Kasra Jafari, F. Almasganj, Y. Shekofteh

Citations: 0

Abstract

Feature-space Maximum Likelihood Linear Regression (fMLLR) is known as an effective algorithm for rapid adaptation to a new speaker or environment. In this paper we investigate combining feature-space transforms with speaker clustering to improve rapid speaker adaptation. fMLLR applies a single transformation matrix and a bias vector to linearly transform the test speaker's features. We applied fMLLR to less than 10 seconds of speech from Persian test speakers, which improved recognition by 1.5%. We then propose combining fMLLR with clustering; the results show that this method improved recognition by 2.5%. In another approach, we clustered the speakers and applied Maximum Likelihood Linear Regression (MLLR) to each cluster to improve that cluster's model, and then used fMLLR for rapid speaker adaptation; this gave a 2.25% improvement in speech recognition.
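The linear feature transform described in the abstract can be illustrated with a minimal sketch. It only shows how an already estimated matrix and bias would be applied to feature frames, x' = A x + b. The function name `apply_fmllr`, the two-dimensional toy values, and the plain-list representation are assumptions made for illustration; they are not taken from the paper, and the maximum-likelihood estimation of A and b is not shown.

```python
from typing import List, Sequence

Vector = List[float]


def apply_fmllr(
    frames: Sequence[Sequence[float]],
    A: Sequence[Sequence[float]],
    b: Sequence[float],
) -> List[Vector]:
    """Apply the affine transform x' = A x + b to every feature frame.

    A and b are assumed to have been estimated elsewhere (hypothetical
    inputs here); this function only applies them.
    """
    dim = len(b)
    if len(A) != dim or any(len(row) != dim for row in A):
        raise ValueError("A must be a square matrix matching the size of b")

    adapted: List[Vector] = []
    for x in frames:
        if len(x) != dim:
            raise ValueError("every frame must have the same size as b")
        # Each output component is one row of A dotted with x, plus the bias.
        adapted.append(
            [sum(a * xj for a, xj in zip(row, x)) + bi for row, bi in zip(A, b)]
        )
    return adapted


# Toy values, chosen only to make the arithmetic easy to follow.
A = [[1.0, 0.5], [0.0, 2.0]]
b = [0.5, -1.0]
frames = [[1.0, 2.0], [0.0, 0.0]]

adapted = apply_fmllr(frames, A, b)
print(adapted)
```

In a real system the frames would be acoustic feature vectors, and the same A and b would be applied to every frame of the test speaker before decoding.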


