Masakatsu Hoshimi, M. Miyata, Shoji Hiraoka, Katsuyuki Niyada
Title: Speaker independent speech recognition method using training speech from a small number of speakers
Published in: [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing
Publication date: 1992-03-23
DOI: 10.1109/ICASSP.1992.225870 (https://doi.org/10.1109/ICASSP.1992.225870)
Citations: 7
Abstract
A novel speaker-independent speech recognition method is presented, which registers speech uttered by a small number of speakers into a dictionary as model speech. It is based on the hypothesis that movement of the vocal tract differs little among individuals when the same word is spoken. This idea leads to the conclusion that dynamic characteristics extracted from the utterances of a small number of speakers are effective for speaker-independent speech recognition. A speech recognition method using model utterances is described, in which similarity values of an input word are calculated by matching a small number of speakers' utterances with phoneme templates for speaker-independent recognition. When tested with 212 Japanese words, a word recognition rate of 95.8% was obtained. An evaluation of noise robustness is also reported.
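The abstract gives no algorithmic detail, so the following is only a rough sketch of one possible reading of the idea: each frame of an utterance is converted into a vector of similarities to phoneme templates, and the resulting trajectory is compared with dictionary trajectories registered from a few speakers. Everything specific here is an assumption made for illustration and is not taken from the paper: the cosine similarity measure, the use of plain dynamic time warping for the comparison, and the hypothetical names `similarity_trajectory`, `dtw_distance`, and `recognize`.

```python
import math


def cosine(a, b):
    """Cosine similarity between two feature vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def similarity_trajectory(frames, phoneme_templates):
    """Map each frame to a vector holding one similarity per phoneme template."""
    return [[cosine(f, t) for t in phoneme_templates] for f in frames]


def dtw_distance(seq_a, seq_b):
    """Plain DTW with Euclidean local cost, normalised by total length."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m] / (n + m)


def recognize(frames, phoneme_templates, dictionary):
    """Return the dictionary word whose closest model trajectory best matches.

    dictionary maps each word to a list of model trajectories,
    one per training speaker (hypothetical layout).
    """
    query = similarity_trajectory(frames, phoneme_templates)
    best_word, best_dist = None, float("inf")
    for word, models in dictionary.items():
        dist = min(dtw_distance(query, model) for model in models)
        if dist < best_dist:
            best_word, best_dist = word, dist
    return best_word
```

In this sketch the dictionary stores similarity trajectories rather than raw features, which is one way to make the stored model speech depend on how the phoneme similarities evolve over time rather than on a particular speaker's spectrum. Whether the paper's method works this way cannot be determined from the abstract alone.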