Oral image to voice converter-image input microphone
Takaaki Hasegawa, Keiichi Ohtani
[Proceedings] Singapore ICCS/ISITA `92, 1992-11-16
DOI: 10.1109/ICCS.1992.255190 (https://doi.org/10.1109/ICCS.1992.255190)
The authors propose a new speech communication system, called the image input microphone, that converts an oral image into voice. The system synthesizes the voice from the oral image alone, so it provides high security and is not affected by acoustic noise. Because the voice is synthesized without a recognition step, the system is independent of language. Simulations converting oral images to voice were carried out for the five Japanese vowels. A vocal tract area function is estimated from the oral image, and a PARCOR synthesis filter is obtained from that area function. The PARCOR synthesis filter is driven by a pulse train. System performance was evaluated by listening tests on the synthesized voice. An audible voice was synthesized, and the mean recognition rate for the five Japanese vowels was 91%.
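The synthesis chain named in the abstract (area function, then PARCOR filter, then pulse-train excitation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how the area function is estimated from the image, so the area values below are made up; the textbook lossless-tube relation k = (A_next - A) / (A_next + A) is assumed for the PARCOR coefficients, and its sign and the ordering of the tube sections vary between references. The function names `area_to_parcor`, `pulse_train`, and `lattice_synthesize` are hypothetical.

```python
def area_to_parcor(areas):
    """Reflection (PARCOR) coefficients from adjacent tube-section areas.

    Assumes the lossless-tube relation k = (A_next - A) / (A_next + A).
    Positive areas always give |k| < 1, which keeps the filter stable.
    """
    return [(a_next - a) / (a_next + a) for a, a_next in zip(areas, areas[1:])]


def pulse_train(n_samples, period):
    """Unit pulses every `period` samples, standing in for the voiced source."""
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]


def lattice_synthesize(parcor, excitation):
    """All-pole lattice synthesis filter driven by `excitation`.

    Per sample, for m = M down to 1:
        f[m-1] = f[m] - k[m] * g[m-1] (previous sample)
        g[m]   = k[m] * f[m-1] + g[m-1] (previous sample)
    and the output is f[0].
    """
    order = len(parcor)
    g_delayed = [0.0] * order  # g_delayed[m] holds g[m] from the previous sample
    output = []
    for x in excitation:
        f = x
        g_new = [0.0] * (order + 1)
        for m in range(order, 0, -1):
            k = parcor[m - 1]
            f = f - k * g_delayed[m - 1]          # forward signal at stage m-1
            g_new[m] = k * f + g_delayed[m - 1]   # backward signal at stage m
        g_new[0] = f
        g_delayed = g_new[:order]
        output.append(f)
    return output


if __name__ == "__main__":
    # Hypothetical area function (arbitrary units), NOT data from the paper.
    areas = [2.6, 1.3, 0.9, 1.8, 4.0, 5.2, 3.1]
    parcor = area_to_parcor(areas)
    voice = lattice_synthesize(parcor, pulse_train(n_samples=400, period=80))
    print(len(parcor), len(voice))
```

A uniform tube (all areas equal) gives all-zero coefficients, so the filter passes the pulse train through unchanged; varying the areas is what shapes the vowel spectrum.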