Emotion conversion for expressive Arabic text to speech

Doaa Gamal, M. Rashwan, Sherif M. Abdou
2014 IEEE/ACS 11th International Conference on Computer Systems and Applications (AICCSA), published 2014-11-01
DOI: 10.1109/AICCSA.2014.7073218 (https://doi.org/10.1109/AICCSA.2014.7073218)

Abstract

Emotion conversion using a small speech corpus is important for expressive text-to-speech systems. The unit selection paradigm has been widely applied to intonation conversion in different languages, using different intonation units. This paper proposes an emotion conversion system for expressive Arabic speech. The system combines the transformation of both spectral and prosodic (pitch, duration, and energy) parameters of speech based on the linguistic context. Unit selection is used for pitch conversion, and the effect of using different intonation units and different pitch detectors is studied. We also study the effect of converting each speech parameter with the proposed system on different expressions. Subjective tests were carried out to evaluate the system on three target expressions: sadness, happiness, and questioning. The results show that both syllables and words are effective as the basic intonation unit for pitch conversion; however, syllables give higher expressiveness for sadness and happiness. The results also show that pitch contour conversion is the dominant factor for happiness and questioning and strongly affects sadness, whereas duration conversion affects only sadness, spectral conversion affects only happiness, and decreasing the energy level adds expressiveness to sadness. Finally, the evaluation of the overall system shows that it adds acceptable expressiveness to Arabic speech with good quality for sadness and happiness. The same result can be obtained for questioning if only the pitch contour is converted, since spectral conversion degrades the output quality without increasing expressiveness, and duration conversion has no effect on questioning.
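The abstract does not specify the cost functions or context features used for unit selection, so the following is only a generic sketch of how unit selection over pitch contours is commonly set up, not the authors' implementation. It assumes a hypothetical inventory of intonation units (for example syllables) taken from the emotional corpus, each with a linguistic context tuple and an F0 contour in Hz. The names `Unit`, `target_cost`, `join_cost`, `select_units`, and the `join_weight` parameter are all illustrative assumptions. A Viterbi search picks one unit per target position so that the context mismatch plus the F0 jump at unit boundaries is minimal.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Unit:
    """One intonation unit (e.g. a syllable) from the emotional corpus."""
    context: Tuple[str, ...]  # hypothetical context features, e.g. (stress, position)
    f0: List[float]           # pitch contour of the unit in Hz


def target_cost(wanted: Tuple[str, ...], unit: Unit) -> float:
    # Assumed target cost: number of mismatching context features.
    return float(sum(a != b for a, b in zip(wanted, unit.context)))


def join_cost(prev: Unit, nxt: Unit) -> float:
    # Assumed concatenation cost: absolute F0 jump at the boundary.
    return abs(prev.f0[-1] - nxt.f0[0])


def select_units(contexts: Sequence[Tuple[str, ...]],
                 inventory: Sequence[Unit],
                 join_weight: float = 0.01) -> List[Unit]:
    """Viterbi search for the unit sequence with the lowest total cost."""
    if not contexts or not inventory:
        return []
    m = len(inventory)
    # cost[j]: best accumulated cost of any path ending in inventory[j]
    cost = [target_cost(contexts[0], u) for u in inventory]
    back: List[List[int]] = []  # back-pointers for each later position
    for wanted in contexts[1:]:
        new_cost, pointers = [], []
        for u in inventory:
            # Best predecessor for unit u at this position.
            best_k = min(
                range(m),
                key=lambda k: cost[k] + join_weight * join_cost(inventory[k], u),
            )
            step = cost[best_k] + join_weight * join_cost(inventory[best_k], u)
            new_cost.append(step + target_cost(wanted, u))
            pointers.append(best_k)
        cost = new_cost
        back.append(pointers)
    # Trace back from the cheapest final unit.
    j = min(range(m), key=lambda k: cost[k])
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return [inventory[j] for j in path]
```

In a full system, the selected contours would still have to be time-aligned and imposed on the neutral utterance; that step, like the weighting between the two costs, is outside what the abstract describes.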