Arabic Mispronunciation Recognition System Using LSTM Network

Inf. Comput. | Pub Date: 2023-07-16 | DOI: 10.3390/info14070413

A. Ahmed, Mohamed Bader, I. Shahin, A. B. Nassif, N. Werghi, Mohammad Basel


Citations: 3

Abstract


Arabic Mispronunciation Recognition System Using LSTM Network

The Arabic language has always been an immense source of attraction to various people from different ethnicities by virtue of the significant linguistic legacy that it possesses. Consequently, a multitude of people from all over the world are yearning to learn it. However, people from different mother tongues and cultural backgrounds might experience some hardships regarding articulation due to the absence of some particular letters only available in the Arabic language, which could hinder the learning process. As a result, a speaker-independent and text-dependent efficient system that aims to detect articulation disorders was implemented. In the proposed system, we emphasize the prominence of “speech signal processing” in diagnosing Arabic mispronunciation using the Mel-frequency cepstral coefficients (MFCCs) as the optimum extracted features. In addition, long short-term memory (LSTM) was also utilized for the classification process. Furthermore, the analytical framework was incorporated with a gender recognition model to perform two-level classification. Our results show that the LSTM network significantly enhances mispronunciation detection along with gender recognition. The LSTM models attained an average accuracy of 81.52% in the proposed system, reflecting a high performance compared to previous mispronunciation detection systems.
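The abstract names three components: MFCC features, an LSTM classifier, and a gender recognition model that makes the classification two-level. The sketch below is one possible reading of that pipeline in plain Python, not the authors' implementation. Everything specific in it is an assumption made for illustration: the class `TinyLSTMClassifier`, the function `two_level_classify`, the 13 coefficients per frame, the hidden size, the untrained random weights, and in particular the choice to route each utterance to a gender-specific detector (the abstract does not say how the two levels are connected).

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTMClassifier:
    """Single-layer LSTM with a logistic output. Weights are random and
    untrained; this only illustrates the forward pass (hypothetical class)."""

    def __init__(self, n_features, n_hidden, seed=0):
        rng = random.Random(seed)
        n_in = n_features + n_hidden
        # One weight matrix and bias per gate: input, forget, output, candidate.
        self.W = {g: [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                      for _ in range(n_hidden)] for g in "ifoc"}
        self.b = {g: [0.0] * n_hidden for g in "ifoc"}
        self.w_out = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
        self.n_hidden = n_hidden

    def _gate(self, name, z, activation):
        return [activation(sum(w * v for w, v in zip(row, z)) + b)
                for row, b in zip(self.W[name], self.b[name])]

    def predict_proba(self, frames):
        """frames: sequence of MFCC-like feature vectors, one per time step."""
        h = [0.0] * self.n_hidden  # hidden state
        c = [0.0] * self.n_hidden  # cell state
        for x in frames:
            z = list(x) + h  # concatenate current input and previous hidden state
            i = self._gate("i", z, sigmoid)
            f = self._gate("f", z, sigmoid)
            o = self._gate("o", z, sigmoid)
            g = self._gate("c", z, math.tanh)
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        # Classify from the final hidden state.
        return sigmoid(sum(w * v for w, v in zip(self.w_out, h)))


def two_level_classify(frames, gender_model, detectors, threshold=0.5):
    """Level 1 predicts gender; level 2 runs that gender's detector.
    The routing scheme is an assumption, not taken from the paper."""
    gender = "female" if gender_model.predict_proba(frames) >= threshold else "male"
    p_wrong = detectors[gender].predict_proba(frames)
    return gender, ("mispronounced" if p_wrong >= threshold else "correct")


# Stand-in for real MFCCs: 20 frames of 13 coefficients (assumed sizes).
rng = random.Random(42)
frames = [[rng.gauss(0.0, 1.0) for _ in range(13)] for _ in range(20)]

gender_model = TinyLSTMClassifier(n_features=13, n_hidden=8, seed=1)
detectors = {
    "female": TinyLSTMClassifier(n_features=13, n_hidden=8, seed=2),
    "male": TinyLSTMClassifier(n_features=13, n_hidden=8, seed=3),
}
result = two_level_classify(frames, gender_model, detectors)
print(result)
```

In practice the frames would come from an MFCC extractor applied to recorded speech, and the weights would be learned from labeled utterances; with random weights the printed labels carry no meaning.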


Source journal

Inf. Comput.

Self-citation rate

0.00%

Number of publications