Style-Code Method for Multi-Style Parametric Text-to-Speech Synthesis

Q3 Mathematics
S. Suzic, Tijana Delic, S. Ostrogonac, Simona Đurić, D. Pekar
{"title":"Style-Code Method for Multi-Style Parametric Text-to-Speech Synthesis","authors":"S. Suzic, Tijana Delic, S. Ostrogonac, Simona Đurić, D. Pekar","doi":"10.15622/sp.60.8","DOIUrl":null,"url":null,"abstract":"Modern text-to-speech systems generally achieve good intelligibility. The one of the main drawbacks of these systems is the lack of expressiveness in comparison to natural human speech. It is very unpleasant when automated system conveys positive and negative message in completely the same way. The introduction of parametric methods in speech synthesis gave possibility to easily change speaker characteristics and speaking styles. In this paper a simple method for incorporating styles into synthesized speech by using style codes is presented. The proposed method requires just a couple of minutes of target style and moderate amount of neutral speech. It is successfully applied to both hidden Markov models and deep neural networks-based synthesis, giving style code as additional input to the model. Listening tests confirmed that better style expressiveness is achieved by deep neural networks synthesis compared to hidden Markov model synthesis. It is also proved that quality of speech synthesized by deep neural networks in a certain style is comparable with the speech synthesized in neutral style, although the neutral-speech-database is about 10 times bigger. DNN based TTS with style codes are further investigated by comparing the quality of speech produced by single-style modeling and multi-style modeling systems. Objective and subjective measures confirmed that there is no significant difference between these two approaches.","PeriodicalId":53447,"journal":{"name":"SPIIRAS Proceedings","volume":"2 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"SPIIRAS Proceedings","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.15622/sp.60.8","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Mathematics","Score":null,"Total":0}
引用次数: 4

Abstract

Modern text-to-speech systems generally achieve good intelligibility. The one of the main drawbacks of these systems is the lack of expressiveness in comparison to natural human speech. It is very unpleasant when automated system conveys positive and negative message in completely the same way. The introduction of parametric methods in speech synthesis gave possibility to easily change speaker characteristics and speaking styles. In this paper a simple method for incorporating styles into synthesized speech by using style codes is presented. The proposed method requires just a couple of minutes of target style and moderate amount of neutral speech. It is successfully applied to both hidden Markov models and deep neural networks-based synthesis, giving style code as additional input to the model. Listening tests confirmed that better style expressiveness is achieved by deep neural networks synthesis compared to hidden Markov model synthesis. It is also proved that quality of speech synthesized by deep neural networks in a certain style is comparable with the speech synthesized in neutral style, although the neutral-speech-database is about 10 times bigger. DNN based TTS with style codes are further investigated by comparing the quality of speech produced by single-style modeling and multi-style modeling systems. Objective and subjective measures confirmed that there is no significant difference between these two approaches.
多样式参数文本到语音合成的样式编码方法
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
SPIIRAS Proceedings
SPIIRAS Proceedings Mathematics-Applied Mathematics
CiteScore
1.90
自引率
0.00%
发文量
0
审稿时长
14 weeks
期刊介绍: The SPIIRAS Proceedings journal publishes scientific, scientific-educational, scientific-popular papers relating to computer science, automation, applied mathematics, interdisciplinary research, as well as information technology, the theoretical foundations of computer science (such as mathematical and related to other scientific disciplines), information security and information protection, decision making and artificial intelligence, mathematical modeling, informatization.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信