Developing Concatenative Based Text to Speech Synthesizer for Tigrigna Language

Mezgebe Araya Keletay, Hussien Seid Worku
Internet of Things and Cloud Computing, published 2020-10-28
DOI: 10.11648/j.iotcc.20200802.12 (https://doi.org/10.11648/j.iotcc.20200802.12)
Citations: 2

Abstract

A Text-To-Speech (TTS) synthesizer is a computer-based system that reads any text and converts it into speech resembling, as closely as possible, that of a native speaker of the language. This thesis describes the first TTS system for the Tigrigna language, built on a speech synthesis architecture in MATLAB. The system is based on concatenative synthesis and applies the linear predictive coding (LPC) technique. The performance of the system is measured, and the quality of the synthesized speech is assessed in terms of intelligibility and naturalness. The synthesizer output is evaluated at two levels: the word level and the sentence level. At the word level, the output is evaluated with the online NeoSpeech tool; most of the words are recognizable, and the overall performance is found to be 78%. At the sentence level, intelligibility and naturalness are measured on the mean opinion score (MOS) scale, and the overall scores are found to be 3.28 and 3.27 respectively. These values are encouraging and show that diphone speech units are good candidates for developing a fully functional speech synthesizer. There are still areas that can be improved: adding a text analyzer that handles the pronunciation of the language's zonal dialects, and adding a prosody generator, both need further investigation.
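The abstract does not give implementation details, so the following is only a minimal sketch of the general idea behind diphone-based concatenative synthesis: a phoneme sequence is mapped to diphone labels, each label is looked up in a unit inventory, and the stored waveforms are joined with a short linear crossfade. The phoneme labels, the `inventory` dictionary, the silence marker `_`, and the crossfade length are all hypothetical, and the sketch is written in Python rather than the MATLAB used in the paper. It does not include the LPC stage that the abstract mentions.

```python
# Illustrative sketch only: labels, inventory contents, and the crossfade
# length are assumptions, not details taken from the paper.

def to_diphones(phonemes):
    """Turn a phoneme sequence into diphone labels, padded with silence '_'."""
    padded = ["_"] + list(phonemes) + ["_"]
    return [f"{a}-{b}" for a, b in zip(padded, padded[1:])]


def crossfade_concat(units, overlap):
    """Join waveform units, linearly crossfading `overlap` samples per joint."""
    out = list(units[0])
    for unit in units[1:]:
        # Never overlap more samples than either side actually has.
        n = min(overlap, len(out), len(unit))
        start = len(out) - n
        for i in range(n):
            w = (i + 1) / (n + 1)  # weight of the incoming unit
            out[start + i] = (1 - w) * out[start + i] + w * unit[i]
        out.extend(unit[n:])
    return out


def synthesize(phonemes, inventory, overlap=2):
    """Look up the samples of each diphone and concatenate them."""
    units = [inventory[label] for label in to_diphones(phonemes)]
    return crossfade_concat(units, overlap)


if __name__ == "__main__":
    # Toy inventory: four constant samples per diphone (hypothetical data).
    inventory = {
        "_-s": [0.0, 0.1, 0.2, 0.3],
        "s-a": [0.3, 0.4, 0.5, 0.6],
        "a-_": [0.6, 0.4, 0.2, 0.0],
    }
    print(to_diphones(["s", "a"]))
    print(len(synthesize(["s", "a"], inventory)))
```

Each joint shortens the output by the overlap length, so three four-sample units joined with a two-sample overlap yield eight samples. A real system would store recorded diphones and would typically smooth the joints in a parametric domain rather than with a plain amplitude crossfade.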