DNN-Based Arabic Speech Synthesis

2022 9th International Conference on Electrical and Electronics Engineering (ICEEE). Pub Date: 2022-03-29. DOI: 10.1109/ICEEE55327.2022.9772602

A. Amrouche, Y. Bentrcia, Khadidja Nesrine Boubakeur, Ahcène Abed


Citations: 8

Abstract

This article presents a Deep Neural Network (DNN)-based Text-to-Speech (TTS) synthesis system for the Arabic language. The system was evaluated with both subjective and objective tests. We used the Mean Opinion Score (MOS) for subjective evaluation and the Diagnostic Rhyme Test (DRT) to test the intelligibility of some consonants and vowels. We used the Perceptual Evaluation of Speech Quality (PESQ) for objective evaluation. The mean scores were 3.92/5 for MOS, 3.88/5 for DRT, and 3.17/5 for PESQ. The majority of words and sentences were recognized, and the overall quality of the system was judged satisfactory. Furthermore, the results show a significant improvement in synthesized speech quality for the DNN-based TTS compared with its HMM-based counterpart.
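As a rough illustration of how a Mean Opinion Score is typically obtained, the sketch below averages listener ratings on a 1 to 5 scale and reports a normal-approximation 95% confidence half-width. The function name `mean_opinion_score` and the example ratings are hypothetical and are not taken from the paper; the abstract does not describe the authors' exact scoring procedure.

```python
from math import sqrt
from statistics import mean, stdev


def mean_opinion_score(ratings):
    """Return (MOS, 95% CI half-width) for listener ratings on a 1-5 scale.

    Hypothetical helper for illustration only; not the authors' code.
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie in [1, 5]")
    mos = mean(ratings)
    # Normal-approximation interval; width is zero when only one rating exists.
    if len(ratings) > 1:
        half_width = 1.96 * stdev(ratings) / sqrt(len(ratings))
    else:
        half_width = 0.0
    return mos, half_width


# Made-up ratings from eight imaginary listeners (NOT data from the paper).
example_ratings = [4, 4, 3, 5, 4, 4, 3, 4]
mos, ci = mean_opinion_score(example_ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

Reporting the interval alongside the mean makes it easier to judge whether a difference between two systems, such as a DNN-based and an HMM-based synthesizer, exceeds listener-to-listener variation.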

