M. Y. Arafat, Sanjana Fahrin, Md Jamirul Islam, Md Ashraf Siddiquee, Afsana Khan, Mohammed Rokibul Alam Kotwal, M. N. Huda
Title: Speech synthesis for Bangla Text to Speech conversion
Venue: The 8th International Conference on Software, Knowledge, Information Management and Applications (SKIMA 2014), pp. 1-6
Published: 2014-12-01
DOI: 10.1109/SKIMA.2014.7083517
Citations: 6
Abstract
This paper presents the design and implementation of a Bangla (also known as Bengali) Text to Speech (TTS) system built from scratch, without using any third-party speech synthesis tool. We consider two approaches to constructing the system: one based on phonemes and the other on syllables. The proposed system works in stages. In the first stage, audio is recorded for each Bangla phoneme and for three thousand of the 250,000 syllables in Bangla, and noise is then reduced to obtain high-quality sound for each phoneme and syllable. The second stage searches the input text for the longest possible matching syllable; where no syllable matches, it falls back to matching phonemes with the corresponding graphemes. As a further improvement, we also added complex conjuncts, which need to be handled separately. Experiments show that the syllable-based method produces better-quality speech for the input text than the phoneme-based method.
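
The second stage of the abstract (longest syllable match first, phoneme fallback otherwise) could be sketched roughly as below. This is only an illustration under stated assumptions, not the authors' implementation: the function name `segment`, the toy romanized inventories, and the choice to treat each single character as one grapheme are all hypothetical, and the separate handling of complex conjuncts is not modeled.

```python
def segment(text, syllable_units, phoneme_units):
    """Greedy longest-match segmentation with a per-character fallback.

    syllable_units and phoneme_units are sets of strings standing in for the
    recorded sound inventories (hypothetical; the paper does not list them).
    Returns a list of (kind, unit) pairs in input order.
    """
    # Longest syllable in the inventory bounds the search window.
    max_len = max((len(s) for s in syllable_units), default=0)
    units = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first, then shrink the window.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in syllable_units:
                match = ("syllable", candidate)
                break
        if match is None:
            # No syllable starts here: fall back to a single grapheme.
            ch = text[i]
            kind = "phoneme" if ch in phoneme_units else "unknown"
            match = (kind, ch)
        units.append(match)
        i += len(match[1])
    return units


# Toy romanized inventories, used purely for illustration.
SYLLABLES = {"ba", "bang", "la"}
PHONEMES = {"b", "a", "n", "g", "l"}

print(segment("bangla", SYLLABLES, PHONEMES))
print(segment("bangal", SYLLABLES, PHONEMES))
```

In this sketch, "bang" is chosen over the shorter "ba" because the longest candidate is tried first, and the trailing "a" and "l" in the second call fall back to phoneme units because no recorded syllable starts at those positions. A real system would map each returned unit to its recorded audio clip and concatenate the clips.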