Influence of various asymmetrical contextual factors for TTS in a low resource language
Nirmesh J. Shah, Mohammadi Zaki, H. Patil
2014 International Conference on Asian Language Processing (IALP), 2014-12-04
DOI: 10.1109/IALP.2014.6973509 (https://doi.org/10.1109/IALP.2014.6973509)
The generalized statistical framework of the Hidden Markov Model (HMM) has been successfully carried over from speech recognition to speech synthesis. In this paper, we apply the HMM-based Speech Synthesis (HTS) method to Gujarati, one of the official languages of India, and report the adaptation and evaluation of HTS for this language. In addition, to understand the influence of asymmetrical contextual factors on the quality of synthesized speech, we conducted a series of experiments. The HTS systems built for Gujarati speech with various asymmetrical contextual factors are evaluated in terms of naturalness and speech intelligibility. The experimental results show that giving more weight to the left phonemes in an asymmetrical contextual factor improves HTS performance over the conventional symmetrical contextual factors in both the triphone and pentaphone cases. Furthermore, we achieved the best performance for Gujarati HTS with left-left-left-centre-right (i.e., LLLCR) contextual factors.
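To make the difference between symmetrical and asymmetrical contextual factors concrete, the following is a minimal sketch of how a context window could be built around each phone. It is not the label format used by HTS or by the paper: the function name `context_labels`, the padding symbol `sil`, the `/`, `-` and `+` separators, and the romanized phone sequence are all assumptions made for illustration. A symmetrical triphone corresponds to one left and one right phone, a symmetrical pentaphone to two of each, and LLLCR to three left phones and one right phone.

```python
from typing import List

PAD = "sil"  # hypothetical padding symbol for positions outside the utterance


def context_labels(phones: List[str], n_left: int, n_right: int) -> List[str]:
    """Build one label per phone from n_left preceding and n_right following phones.

    The label layout (left phones, '-', centre, '+', right phones) is an
    illustrative convention only, not the HTS full-context label format.
    """
    labels = []
    for i, centre in enumerate(phones):
        # Phones before the centre, padded where the window leaves the utterance.
        left = [phones[j] if j >= 0 else PAD for j in range(i - n_left, i)]
        # Phones after the centre, padded in the same way.
        right = [phones[j] if j < len(phones) else PAD
                 for j in range(i + 1, i + 1 + n_right)]
        labels.append("/".join(left) + "-" + centre + "+" + "/".join(right))
    return labels


if __name__ == "__main__":
    phones = ["k", "a", "m", "a", "l"]  # made-up phone sequence
    print("triphone   (LCR):  ", context_labels(phones, 1, 1))
    print("pentaphone (LLCRR):", context_labels(phones, 2, 2))
    print("asymmetric (LLLCR):", context_labels(phones, 3, 1))
```

Under this sketch, the LLLCR setting keeps the same total window size as the pentaphone but moves one position from the right side to the left, which is the kind of reweighting toward left context that the abstract describes.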