Masakatsu Hoshimi, M. Miyata, Shoji Hiraoka, Katsuyuki Niyada
[Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing. Published 1992-03-23. DOI: 10.1109/ICASSP.1992.225870 (https://doi.org/10.1109/ICASSP.1992.225870)
Speaker independent speech recognition method using training speech from a small number of speakers
A novel speaker-independent speech recognition method is presented, in which speech uttered by a small number of speakers is registered in a dictionary as model speech. It is based on the hypothesis that movement of the vocal tract differs little among individuals when the same word is spoken. This hypothesis implies that dynamic characteristics extracted from the utterances of a small number of speakers are effective for speaker-independent speech recognition. The method uses model utterances: similarity values of an input word are calculated by matching a small number of speakers' utterances with phoneme templates for speaker-independent recognition. When tested with 212 Japanese words, a word recognition rate of 95.8% was obtained. An evaluation of noise robustness is also reported.
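The abstract does not give the matching procedure in detail, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that each frame of an utterance is converted into a vector of similarities against speaker-independent phoneme templates, and that an input word is compared with the stored model utterances by dynamic time warping (DTW). The cosine similarity measure, the use of DTW, the two-dimensional toy features, and all names (`similarity_sequence`, `dtw_distance`, `recognize`, the words "ai" and "ia") are assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def similarity_sequence(frames, templates):
    """Map each frame to a vector of similarities, one per phoneme template."""
    return [[cosine(f, t) for t in templates] for f in frames]


def dtw_distance(a, b):
    """Classic DTW with Euclidean local cost between two vector sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = math.dist(a[i - 1], b[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


def recognize(frames, dictionary, templates):
    """Return the dictionary word whose model utterance is closest to the input."""
    query = similarity_sequence(frames, templates)
    best_word, best_dist = None, float("inf")
    for word, models in dictionary.items():
        for model in models:
            dist = dtw_distance(query, model)
            if dist < best_dist:
                best_word, best_dist = word, dist
    return best_word


# Hypothetical toy data: two phoneme templates in a 2-D feature space.
templates = [[1.0, 0.0], [0.0, 1.0]]

# Model utterances from a "small number of speakers" (one each here),
# stored in the dictionary as phoneme-similarity sequences.
dictionary = {
    "ai": [similarity_sequence([[1.0, 0.1], [0.9, 0.2], [0.1, 1.0]], templates)],
    "ia": [similarity_sequence([[0.1, 1.0], [0.2, 0.9], [1.0, 0.1]], templates)],
}

# An input from an unseen speaker, with a different number of frames.
input_frames = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]]
result = recognize(input_frames, dictionary, templates)
```

Working in the phoneme-similarity space rather than on raw features is what lets a few speakers' utterances serve as models in this sketch: the templates absorb speaker differences, while the stored sequences carry the time course of the word.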