DNN-Based Arabic Speech Synthesis

A. Amrouche, Y. Bentrcia, Khadidja Nesrine Boubakeur, Ahcène Abed
Published in: 2022 9th International Conference on Electrical and Electronics Engineering (ICEEE), 2022-03-29
DOI: 10.1109/ICEEE55327.2022.9772602 (https://doi.org/10.1109/ICEEE55327.2022.9772602)
Citations: 8

Abstract

This article presents a Deep Neural Network (DNN)-based Text-to-Speech (TTS) synthesis system for the Arabic language. The system was evaluated with both subjective and objective tests. For subjective evaluation, we used the Mean Opinion Score (MOS), along with the Diagnostic Rhyme Test (DRT) to assess the intelligibility of some consonants and vowels. For objective evaluation, we used the Perceptual Evaluation of Speech Quality (PESQ). The mean scores were 3.92/5 for MOS, 3.88/5 for DRT, and 3.17/5 for PESQ; most words and sentences were recognized, and the overall quality of the system was judged satisfactory. Furthermore, the results show a significant improvement in synthesized speech quality for the DNN-based TTS compared with its HMM-based counterpart.
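As a rough illustration of how a MOS figure such as the one reported above is typically obtained, the sketch below averages listener ratings on a 1 to 5 scale and attaches a normal-approximation 95% confidence interval. The function name `mean_opinion_score` and the ratings are hypothetical, chosen for illustration only; they are not the authors' code or data, and the abstract does not describe the actual listening-test protocol.

```python
from math import sqrt
from statistics import mean, stdev


def mean_opinion_score(ratings):
    """Return (MOS, 95% CI half-width) for a list of 1-5 listener ratings."""
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must be on the 1-5 scale")
    mos = mean(ratings)
    # Normal-approximation interval; a t-quantile would suit small panels better.
    half_width = 1.96 * stdev(ratings) / sqrt(len(ratings))
    return mos, half_width


# Hypothetical ratings for illustration only; these are NOT data from the paper.
example_ratings = [4, 4, 3, 5, 4, 4, 3, 4]
mos, ci = mean_opinion_score(example_ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

Reporting the interval next to the mean makes it easier to judge whether a difference between two systems, such as a DNN-based and an HMM-based synthesizer, exceeds listener-to-listener variation.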