Speaker adaptation for recognition systems with a large vocabulary
F. Class, P. Regel, K. Trottler
Proceedings. Electrotechnical Conference Integrating Research, Industry and Education in Energy and Communication Engineering, 1989-04-11
DOI: 10.1109/MELCON.1989.50027 (https://doi.org/10.1109/MELCON.1989.50027)
Citations: 4
Abstract
Algorithms for fast speaker adaptation in a speech-recognition system are described. The techniques transform the feature vectors, with the transformations optimized subject to certain constraints. Every feature vector, computed at a 10-ms frame rate, is transformed into a speaker-normalized vector. The advantage of adapting by transforming the feature vectors is that the procedure can be applied regardless of the classification scheme used. It is shown that adaptation procedures based on statistical correlation analysis achieve error rates as low as those of a speaker-dependent recognition system after an extremely short training phase with any new speaker. The key is that the feature vectors are extended nonlinearly to a polynomial vector of second or higher order. Since the algorithms needed to calculate the transformation matrices are typical of signal processing, a real-time implementation on digital signal processors appears feasible.
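The core idea of the abstract, extending each feature vector to a second-order polynomial vector and mapping it through a transformation matrix, can be illustrated with a minimal sketch. The abstract does not state the exact optimization criterion or constraints, so the least-squares fit below is an assumption standing in for the paper's correlation-based procedure. The function names (`poly_expand`, `fit_transform`, `normalize`) and the synthetic two-dimensional "frames" are hypothetical and chosen only for illustration.

```python
from itertools import combinations_with_replacement
import random


def poly_expand(x):
    """Extend x with a constant, its linear terms, and all second-order products."""
    quad = [x[i] * x[j] for i, j in combinations_with_replacement(range(len(x)), 2)]
    return [1.0] + list(x) + quad


def solve(a, b):
    """Solve a @ w = b (b may have several columns) by Gauss-Jordan elimination."""
    n = len(a)
    m = [row_a[:] + row_b[:] for row_a, row_b in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [[v / m[i][i] for v in m[i][n:]] for i in range(n)]


def fit_transform(new_frames, ref_frames):
    """Least-squares matrix W so that poly_expand(x) @ W approximates the reference frame.

    Assumption: paired adaptation frames (new speaker, reference speaker) are available.
    """
    z = [poly_expand(x) for x in new_frames]
    d, k = len(z[0]), len(ref_frames[0])
    # Auto- and cross-correlation matrices accumulated over the adaptation frames.
    zz = [[sum(row[i] * row[j] for row in z) for j in range(d)] for i in range(d)]
    zy = [[sum(row[i] * y[j] for row, y in zip(z, ref_frames)) for j in range(k)]
          for i in range(d)]
    return solve(zz, zy)


def normalize(x, w):
    """Map one feature vector of the new speaker to a speaker-normalized vector."""
    z = poly_expand(x)
    return [sum(zi * w[i][j] for i, zi in enumerate(z)) for j in range(len(w[0]))]


# Synthetic data: the "reference" frames are a made-up quadratic distortion
# of the new speaker's frames, so a second-order fit can recover them.
rng = random.Random(0)
new_frames = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(40)]
ref_frames = [[0.5 + 2 * a - b + 0.3 * a * b, a * a - 0.2 * b] for a, b in new_frames]

w = fit_transform(new_frames, ref_frames)
err = max(abs(p - t)
          for x, y in zip(new_frames, ref_frames)
          for p, t in zip(normalize(x, w), y))
print(f"largest absolute residual on the adaptation frames: {err:.2e}")
```

Because `normalize` acts on the feature vectors alone, any classifier can consume its output unchanged, which is the classifier-independence the abstract highlights. The fitting step reduces to accumulating correlation matrices and solving a linear system, both routine signal-processing operations.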