DNN-Based Arabic Speech Synthesis
A. Amrouche, Y. Bentrcia, Khadidja Nesrine Boubakeur, Ahcène Abed
2022 9th International Conference on Electrical and Electronics Engineering (ICEEE), published 2022-03-29. DOI: 10.1109/ICEEE55327.2022.9772602 (https://doi.org/10.1109/ICEEE55327.2022.9772602)
This article presents a Deep Neural Network (DNN)-based Text-to-Speech (TTS) synthesis system for the Arabic language. The system was evaluated with both subjective and objective tests. We used the Mean Opinion Score (MOS) for subjective evaluation and the Diagnostic Rhyme Test (DRT) to test the intelligibility of selected consonants and vowels. For objective evaluation, we used the Perceptual Evaluation of Speech Quality (PESQ). The mean scores were 3.92/5 for the MOS test, 3.88/5 for the DRT test, and 3.17/5 for the PESQ test; the majority of words and sentences were recognized, and the overall quality of the system was judged satisfactory. Furthermore, the results show a significant improvement in the quality of synthesized speech for the DNN-based TTS system compared with its HMM-based counterpart.
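As a rough illustration of how a mean score such as the MOS reported in the abstract is typically obtained, the sketch below averages listener ratings on a 1 to 5 scale and attaches a normal-approximation 95% interval. The function name `mean_opinion_score`, the interval method, and the ratings are all assumptions made for illustration; they are not the authors' procedure or data.

```python
from math import sqrt
from statistics import mean, stdev


def mean_opinion_score(ratings):
    """Return (MOS, 95% CI half-width) for listener ratings on a 1-5 scale."""
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    if any(not 1 <= r <= 5 for r in ratings):
        raise ValueError("ratings must lie on the 1-5 scale")
    mos = mean(ratings)
    # Normal-approximation interval: an assumption, not taken from the paper.
    half_width = 1.96 * stdev(ratings) / sqrt(len(ratings))
    return mos, half_width


# Hypothetical ratings for illustration only (not data from the paper).
ratings = [4, 4, 3, 5, 4, 3, 4, 5]
mos, ci = mean_opinion_score(ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

With a small listener panel the interval is wide, which is one reason a subjective score like MOS is usually reported alongside an objective measure such as PESQ.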