Improving speech intelligibility in cochlear implants using vocoder-centric acoustic models
A. R. Gladston, P. Vijayalakshmi, N. Thangavelu
2012 International Conference on Recent Trends in Information Technology
Published: 2012-04-19
DOI: https://doi.org/10.1109/ICRTIT.2012.6206795
Citations: 5
Abstract
The cochlear implant is a prosthetic device that substitutes for a damaged inner ear. It consists of an externally worn speech processor and an internal receiver-stimulator. Because a cochlear implant is both patient-specific and system-specific, this work designs a laboratory model of the speech processor, based on several vocoder models, to analyse the effect of system-specific parameters such as filter bandwidth, number of channels, and vocal excitation on speech intelligibility. First, a formant vocoder is designed and used for the analysis and synthesis of English vowels. A channel vocoder is then developed for the same vowels and extended to the analysis and synthesis of words from the Lexical Neighbourhood Test and sentences from the TIMIT database. The effect of the number of channels on synthetic speech quality is analysed, and a 21-channel vocoder is found to give the best response, with a mean opinion score (MOS) of 4 out of 5 for vowels and 3.4 for sentences. Formant trajectories and the CosH distance are also used to validate speech intelligibility. The influence of the glottal pulse on speech intelligibility is analysed, and synthetic speech is found to sound more natural with a glottal pulse train than with an impulse train, with an MOS of 4.2 for vowels and 4 for sentences.
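The abstract does not state how the CosH distance is computed. As a minimal sketch, assuming the commonly used definition (the symmetrised Itakura-Saito distance between two power spectra sampled on the same frequency grid), it could be evaluated as below. The function name `cosh_distance` and its arguments are hypothetical and are not taken from the paper.

```python
def cosh_distance(p, q):
    """Mean COSH distance between two power spectra (assumed definition).

    p, q: sequences of strictly positive power-spectrum values sampled at
    the same frequencies, e.g. original and vocoded speech frames.
    Per bin, 0.5 * (p/q + q/p) - 1 equals cosh(ln p - ln q) - 1, which is
    zero when the spectra match and grows with their log-spectral gap.
    """
    if len(p) != len(q) or len(p) == 0:
        raise ValueError("spectra must be non-empty and of equal length")
    total = 0.0
    for a, b in zip(p, q):
        if a <= 0 or b <= 0:
            raise ValueError("power spectra must be strictly positive")
        total += 0.5 * (a / b + b / a) - 1.0
    return total / len(p)
```

Under this definition the measure is symmetric in its two arguments, and a smaller value indicates that the synthetic spectrum lies closer to the original.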