{"title":"改进大词汇量连续语音识别的线性判别分析","authors":"Reinhold Häb-Umbach, H. Ney","doi":"10.1109/ICASSP.1992.225984","DOIUrl":null,"url":null,"abstract":"The interaction of linear discriminant analysis (LDA) and a modeling approach using continuous Laplacian mixture density HMM is studied experimentally. The largest improvements in speech recognition could be obtained when the classes for the LDA transform were defined to be sub-phone units. On a 12000 word German recognition task with small overlap between training and test vocabulary a reduction in error rate by one-fifth was achieved compared to the case without LDA. On the development set of the DARPA RM1 task the error rate was reduced by one-third. For the DARPA speaker-dependent no-grammar case, the error rate averaged over 12 speakers was 9.9%. This was achieved with a recognizer using LDA and a set of only 47 Viterbi-trained context-independent phonemes.<<ETX>>","PeriodicalId":163713,"journal":{"name":"[Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"3 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1992-03-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"384","resultStr":"{\"title\":\"Linear discriminant analysis for improved large vocabulary continuous speech recognition\",\"authors\":\"Reinhold Häb-Umbach, H. Ney\",\"doi\":\"10.1109/ICASSP.1992.225984\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The interaction of linear discriminant analysis (LDA) and a modeling approach using continuous Laplacian mixture density HMM is studied experimentally. The largest improvements in speech recognition could be obtained when the classes for the LDA transform were defined to be sub-phone units. On a 12000 word German recognition task with small overlap between training and test vocabulary a reduction in error rate by one-fifth was achieved compared to the case without LDA. On the development set of the DARPA RM1 task the error rate was reduced by one-third. For the DARPA speaker-dependent no-grammar case, the error rate averaged over 12 speakers was 9.9%. This was achieved with a recognizer using LDA and a set of only 47 Viterbi-trained context-independent phonemes.<<ETX>>\",\"PeriodicalId\":163713,\"journal\":{\"name\":\"[Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing\",\"volume\":\"3 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1992-03-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"384\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"[Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICASSP.1992.225984\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"[Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICASSP.1992.225984","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Linear discriminant analysis for improved large vocabulary continuous speech recognition
The interaction of linear discriminant analysis (LDA) and a modeling approach using continuous Laplacian mixture density HMM is studied experimentally. The largest improvements in speech recognition could be obtained when the classes for the LDA transform were defined to be sub-phone units. On a 12000 word German recognition task with small overlap between training and test vocabulary a reduction in error rate by one-fifth was achieved compared to the case without LDA. On the development set of the DARPA RM1 task the error rate was reduced by one-third. For the DARPA speaker-dependent no-grammar case, the error rate averaged over 12 speakers was 9.9%. This was achieved with a recognizer using LDA and a set of only 47 Viterbi-trained context-independent phonemes.<>