Authors: J. Hwang, H. Li
Published in: Neural Networks for Signal Processing II Proceedings of the 1992 IEEE Workshop, 1992-08-31
DOI: 10.1109/NNSP.1992.253704 (https://doi.org/10.1109/NNSP.1992.253704)
Interactive query learning for isolated speech recognition
The authors propose an interactive query learning approach to isolated speech recognition. The approach begins by training multiple 'one-net-one-class' time delay neural networks (TDNNs) on sequences of LPC vectors. Once all TDNNs are trained, an improved network inversion algorithm with imposed constraints is run for one specific TDNN (say, class k), starting from each available LPC training sequence, to generate a set of inverted LPC sequences corresponding to various output values of that TDNN. By carefully listening to speech synthesized from the inverted LPC sequences, a conjugate pair of LPC sequences is selected from the whole set: one corresponds to acceptable speech of class k and the other to unacceptable speech of class k. This conjugate pair marks part of the classification boundary associated with the class and should be used as additional training data to refine the boundary of the already trained classifier. A 6% accuracy improvement was achieved when the proposed method was tested on speaker-independent E-set data.
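The network inversion step can be illustrated with a minimal sketch. It is not the authors' improved algorithm, whose details and constraints the abstract does not give. It assumes a tiny feed-forward network with random weights in place of a trained TDNN, a single 12-dimensional vector in place of an LPC sequence, and a simple box constraint on the input as a stand-in for the imposed constraint. All names (`forward`, `invert`, `W1`, `w2`, the bounds `lo` and `hi`) are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, b1, w2, b2):
    """One hidden tanh layer and a sigmoid output (stand-in for a one-class TDNN)."""
    h = np.tanh(W1 @ x + b1)
    y = sigmoid(w2 @ h + b2)
    return h, y

def invert(x0, target, W1, b1, w2, b2, lr=0.05, steps=2000, lo=-1.0, hi=1.0):
    """Gradient descent on the INPUT (weights stay fixed) so that the network
    output moves toward `target`. The input is clipped to [lo, hi] after every
    step; this box constraint is an assumption, not the paper's constraint.
    Returns the visited input whose output was closest to `target`."""
    x = np.clip(x0, lo, hi)
    _, y = forward(x, W1, b1, w2, b2)
    best_x, best_err = x.copy(), abs(y - target)
    for _ in range(steps):
        h, y = forward(x, W1, b1, w2, b2)
        err = abs(y - target)
        if err < best_err:
            best_x, best_err = x.copy(), err
        # Backpropagate the loss 0.5 * (y - target)^2 to the input.
        dy = (y - target) * y * (1.0 - y)
        dh = dy * w2 * (1.0 - h ** 2)
        grad = W1.T @ dh
        x = np.clip(x - lr * grad, lo, hi)
    _, y = forward(x, W1, b1, w2, b2)
    if abs(y - target) < best_err:
        best_x = x.copy()
    return best_x

# Hypothetical fixed "trained" network and one starting input vector.
rng = np.random.default_rng(0)
W1 = 0.5 * rng.normal(size=(8, 12))
b1 = 0.1 * rng.normal(size=8)
w2 = rng.normal(size=8)
b2 = 0.0
x0 = rng.uniform(-1.0, 1.0, size=12)

# One inverted input per desired output value, all started from the same x0.
targets = [0.3, 0.5, 0.7]
inverted = {t: invert(x0, t, W1, b1, w2, b2) for t in targets}
```

In the approach described above, each inverted sequence would then be synthesized and auditioned by a listener; the sketch stops at producing the candidate inputs.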