Combination of fMLLR with clustering and fMLLR with MLLR clustering for rapid speaker adaptation
Authors: Kasra Jafari, F. Almasganj, Y. Shekofteh
Published in: 2010 2nd International Conference on Electronic Computer Technology, 2010-05-07
DOI: 10.1109/ICECTECH.2010.5479971 (https://doi.org/10.1109/ICECTECH.2010.5479971)
Citations: 0
Abstract
Feature-space Maximum Likelihood Linear Regression (fMLLR) is an effective algorithm for rapid adaptation to a new speaker or environment. In this paper we investigate combining feature-space transforms with speaker clustering to improve rapid speaker adaptation. fMLLR uses a single transformation matrix and a bias vector to transform the test speaker's features linearly. We applied fMLLR using less than 10 seconds of speech from Persian test speakers, which improved recognition by 1.5%. We then combined fMLLR with speaker clustering, which improved recognition by 2.5%. In a second approach, we clustered the speakers, applied Maximum Likelihood Linear Regression (MLLR) to each cluster to improve that cluster's model, and then used fMLLR for rapid speaker adaptation; this yielded a 2.25% improvement in recognition.
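The linear feature transform mentioned in the abstract can be illustrated with a minimal sketch. It only shows how an already estimated matrix and bias would be applied to feature frames, x' = A x + b. It does not show how the paper estimates them, which in fMLLR is done by maximizing the likelihood of the adaptation data under the acoustic model. The function name `apply_fmllr`, the toy 2-dimensional frames, and the values of `A` and `b` are assumptions made for illustration and do not come from the paper.

```python
from typing import List, Sequence

Vector = List[float]


def apply_fmllr(frames: Sequence[Sequence[float]],
                A: Sequence[Sequence[float]],
                b: Sequence[float]) -> List[Vector]:
    """Apply the affine map x' = A x + b to every feature frame.

    A and b are assumed to be given; estimating them is out of scope here.
    """
    dim = len(b)
    if len(A) != dim or any(len(row) != dim for row in A):
        raise ValueError("A must be a square matrix whose size matches b")
    transformed = []
    for x in frames:
        if len(x) != dim:
            raise ValueError("frame dimension does not match the transform")
        # Row i of A dotted with the frame, plus the i-th bias component.
        transformed.append(
            [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + b_i
             for row, b_i in zip(A, b)]
        )
    return transformed


# Toy values chosen for illustration only (hypothetical, not estimated from data).
A = [[2.0, 0.0],
     [0.0, 0.5]]
b = [1.0, -1.0]
frames = [[1.0, 2.0], [0.0, 4.0]]
adapted = apply_fmllr(frames, A, b)
```

Because one matrix and one bias are shared by all frames of a speaker, only a small number of parameters has to be estimated, which is consistent with the abstract's use of less than 10 seconds of adaptation speech.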