Arabic Dialects System using Hidden Markov Models (HMMs)

WSEAS TRANSACTIONS ON COMPUTERS Pub Date : 2022-11-10 DOI:10.37394/23205.2022.21.37

Z. Zubi, Eman Jibril Idris


Citations: 0


Arabic Dialects System using Hidden Markov Models (HMMs)

Arabic has many dialects, and the dialect of an utterance must be identified before automatic speech recognition (ASR) is applied. Across the Arab countries, Modern Standard Arabic is widely written and is used in official speeches, newspapers, public administration, and schools, but it is not used in daily conversation; the dialects, in contrast, are widely spoken in daily life and rarely written. In this paper, we examine the difficult task of identifying Arabic dialects and propose a system that identifies four regional dialects and Modern Standard Arabic, based on speech recognition using Hidden Markov Models (HMMs). HMMs have become a very popular way to build speech recognition systems. An HMM is defined by a set of hidden states and the probabilities of transition from one state to another. Because the Arabic dialects both resemble and differ from one another, the speech data were taken from the ADI5 dataset of the MGB-3 challenge. The proposed Arabic dialect identification system, called "Building a System for Arabic Dialects Identification based on Speech Recognition using Hidden Markov Models (HMMs)", takes speech utterances as input and outputs the dialect being spoken. During the training phase, speech utterances from one or more dialects are analyzed to capture the important time and frequency properties of the audio signal. During the testing phase, previously unseen utterances are presented to the system, which outputs the dialect whose model most closely matches the test utterance. The proposed system shows promising results in matching each dialect.
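The testing phase described in the abstract (score an unseen utterance against one model per dialect and output the best match) can be illustrated with a small sketch. The abstract does not state the acoustic features, the number of states, or the HMM topology, so everything below is an assumption for illustration: observations are discrete symbols standing in for quantized acoustic features, the two models and their names (`dialect_A`, `dialect_B`) are hypothetical toy values rather than trained parameters, and the functions `log_likelihood` and `identify_dialect` are not taken from the paper.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete-output HMM.

    start[i]    : probability of starting in state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][k]  : probability of emitting symbol k while in state i
    """
    n_states = len(start)
    # Joint probability of the first symbol and each starting state.
    alpha = [start[i] * emit[i][obs[0]] for i in range(n_states)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transitions, then emit obs[t].
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n_states)) * emit[j][obs[t]]
                for j in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot produce this sequence
        # Rescale at every step to avoid underflow on long utterances.
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_prob


def identify_dialect(obs, models):
    """Return the name of the model with the highest log-likelihood for obs."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))


# Hypothetical toy models: (start, trans, emit), two states, two symbols each.
uniform_start = [0.5, 0.5]
uniform_trans = [[0.5, 0.5], [0.5, 0.5]]
models = {
    "dialect_A": (uniform_start, uniform_trans, [[0.9, 0.1], [0.8, 0.2]]),
    "dialect_B": (uniform_start, uniform_trans, [[0.1, 0.9], [0.2, 0.8]]),
}

print(identify_dialect([0, 0, 0, 1], models))  # mostly symbol 0
print(identify_dialect([1, 1, 0, 1], models))  # mostly symbol 1
```

In a real system the model parameters would be estimated from training utterances of each dialect (for example with Baum-Welch re-estimation), and continuous feature vectors would typically replace the discrete symbols used here.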

