Developing Concatenative Based Text to Speech Synthesizer for Tigrigna Language
Authors: Mezgebe Araya Keletay, Hussien Seid Worku
Journal: Internet of Things and Cloud Computing
Publication type: Journal Article
Published: 2020-10-28
DOI: 10.11648/j.iotcc.20200802.12 (https://doi.org/10.11648/j.iotcc.20200802.12)
Citations: 2
Abstract
A Text-To-Speech (TTS) synthesizer is a computer-based system that reads arbitrary text and converts it into speech resembling, as closely as possible, that of a native speaker of the language. This thesis describes the first TTS system for the Tigrigna language, built on a speech synthesis architecture implemented in MATLAB. The system is based on concatenative synthesis and applies the linear predictive coding (LPC) technique. The performance of the system is measured, and the quality of the synthesized speech is assessed in terms of intelligibility and naturalness. The synthesizer is evaluated at two levels: the word level and the sentence level. At the word level, the output was evaluated with the online NeoSpeech tool; most of the words were recognizable, and the overall word-level performance was found to be 78%. At the sentence level, intelligibility and naturalness were measured on the mean opinion score (MOS) scale, and the overall scores were found to be 3.28 and 3.27, respectively. These values are encouraging and show that diphone speech units are good candidates for developing a fully functional speech synthesizer. There are, however, areas that can be improved: adding a text analyzer that handles the zonal dialects of the language and adding a prosody generator both need further investigation.
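The idea of diphone-based concatenation mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' MATLAB implementation: the function names, the `_` silence symbol, the `inventory` dictionary of recorded units, and the linear crossfade at each joint are all assumptions made for illustration, and the LPC step the abstract refers to is omitted entirely.

```python
from typing import Dict, List, Sequence

SILENCE = "_"  # hypothetical symbol for the silence at utterance boundaries


def to_diphones(phones: Sequence[str]) -> List[str]:
    """Turn a phone sequence into diphone names, padded with silence at both ends."""
    padded = [SILENCE, *phones, SILENCE]
    return [f"{a}-{b}" for a, b in zip(padded, padded[1:])]


def crossfade_concat(units: Sequence[Sequence[float]], overlap: int) -> List[float]:
    """Join waveform units, linearly crossfading up to `overlap` samples per joint."""
    out: List[float] = []
    for unit in units:
        samples = list(unit)
        # Never overlap more samples than either side actually has.
        n = min(overlap, len(out), len(samples))
        start = len(out) - n
        for i in range(n):
            w = (i + 1) / (n + 1)  # weight of the incoming unit rises across the joint
            out[start + i] = (1 - w) * out[start + i] + w * samples[i]
        out.extend(samples[n:])
    return out


def synthesize(
    phones: Sequence[str],
    inventory: Dict[str, Sequence[float]],  # assumed: diphone name -> recorded samples
    overlap: int = 2,
) -> List[float]:
    """Look up each diphone in the inventory and concatenate the units."""
    names = to_diphones(phones)
    missing = [name for name in names if name not in inventory]
    if missing:
        raise KeyError(f"diphones not in inventory: {missing}")
    return crossfade_concat([inventory[name] for name in names], overlap)
```

For a two-phone input such as `["s", "a"]`, `to_diphones` yields the three units `_-s`, `s-a`, and `a-_`, so each joint falls in the relatively stable middle of a phone rather than at a phone boundary. A real system would smooth the joints more carefully than a plain linear crossfade does.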