Combination of fMLLR with clustering and fMLLR with MLLR clustering for rapid speaker adaptation

Kasra Jafari, F. Almasganj, Y. Shekofteh
2010 2nd International Conference on Electronic Computer Technology, published 2010-05-07. DOI: 10.1109/ICECTECH.2010.5479971 (https://doi.org/10.1109/ICECTECH.2010.5479971)

Abstract

Feature-space Maximum Likelihood Linear Regression (fMLLR) is known as an effective algorithm for rapid adaptation to a new speaker or environment. In this paper we investigate combining feature-space transforms with speaker clustering to improve rapid speaker adaptation. fMLLR uses a single transformation matrix and a bias vector to transform the test speaker's features linearly. We applied fMLLR to less than 10 seconds of speech from Persian test speakers, which improved recognition by 1.5%. We then propose combining fMLLR with clustering; the results show that this method improved recognition by 2.5%. In another approach, we clustered the speakers and applied Maximum Likelihood Linear Regression (MLLR) to each cluster to improve that cluster's model, and then used fMLLR for rapid speaker adaptation; this gave a 2.25% improvement in speech recognition.
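The linear feature transform mentioned in the abstract can be illustrated with a minimal sketch. It only shows how an already estimated matrix and bias would be applied to feature frames, x' = A x + b; the maximum-likelihood estimation of the transform and the clustering step are not shown. The function name `apply_fmllr` and the toy 2-dimensional values are assumptions made for illustration and do not come from the paper.

```python
from typing import List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def apply_fmllr(frames: Sequence[Sequence[float]], A: Matrix, b: Vector) -> List[Vector]:
    """Apply the affine transform x' = A x + b to every feature frame.

    A and b are assumed to have been estimated elsewhere (hypothetical inputs).
    """
    dim = len(b)
    if len(A) != dim or any(len(row) != dim for row in A):
        raise ValueError("A must be square and match the length of b")
    adapted = []
    for x in frames:
        if len(x) != dim:
            raise ValueError("frame dimension does not match the transform")
        # One output coefficient per row of A, plus the matching bias term.
        adapted.append([
            sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(A, b)
        ])
    return adapted


# Toy values for illustration only; real feature vectors are much larger.
A = [[1.0, 0.5], [0.0, 2.0]]
b = [0.1, -0.2]
frames = [[1.0, 2.0], [0.0, 1.0]]
adapted = apply_fmllr(frames, A, b)
```

Because a single matrix and bias are shared by all frames of the test speaker, only a small number of parameters has to be estimated, which is consistent with the abstract's use of less than 10 seconds of adaptation speech.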