Title: Detecting Regional Arabic Dialect based on Recurrent Neural Network
Authors: Dalia Alzu'bi, R. Duwairi
Venue: 2021 12th International Conference on Information and Communication Systems (ICICS)
Publication date: 2021-05-24
DOI: 10.1109/ICICS52457.2021.9464605
URL: https://doi.org/10.1109/ICICS52457.2021.9464605
Citations: 2
Abstract
Arabic text analysis has recently attracted great interest, owing to the widespread use of the Arabic language on social media platforms, in applications, and across online communities. Each Arab country has a distinctive dialect that sets it apart from the others. Classifying these dialects is therefore an interesting area of research, as it has implications for other areas such as sentiment analysis and machine translation. In this paper, we build a multi-task classification model for dialects based on Recurrent Neural Networks, in which the dialects are classified into four categories: Maghreb, Levantine, Gulf (including Iraqi), and Nile. The dataset is taken from the MADAR corpus, which contains 110,000 sentences belonging to dialects of different countries in the four regions. Experiments show that the classifiers can distinguish between the four dialect groups with an accuracy of up to 84.76%, which is a promising result in this field.
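To make the classification setup concrete, the following is a minimal sketch of a recurrent classifier that maps a sentence to one of the four regional labels named in the abstract and reports accuracy. It is not the authors' model: the character-level input, the Elman-style recurrence, the hidden size, the random untrained weights, and the names `TinyCharRNN`, `init_matrix`, and `accuracy` are all assumptions made for illustration, and the toy strings are not MADAR data.

```python
import math
import random

# The four regional labels named in the abstract.
REGIONS = ["Maghreb", "Levantine", "Gulf", "Nile"]


def init_matrix(rows, cols, rng):
    """Small random weights; a real model would learn these by training."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class TinyCharRNN:
    """Untrained Elman-style RNN over characters (hypothetical, illustrative only)."""

    def __init__(self, vocab, hidden_size=8, seed=0):
        rng = random.Random(seed)
        self.index = {ch: i for i, ch in enumerate(vocab)}
        self.hidden_size = hidden_size
        self.w_xh = init_matrix(hidden_size, len(vocab), rng)    # input -> hidden
        self.w_hh = init_matrix(hidden_size, hidden_size, rng)   # hidden -> hidden
        self.w_hy = init_matrix(len(REGIONS), hidden_size, rng)  # hidden -> logits

    def forward(self, sentence):
        """Return a probability for each region after reading the sentence."""
        h = [0.0] * self.hidden_size
        for ch in sentence:
            col = self.index.get(ch)
            if col is None:  # skip characters outside the vocabulary
                continue
            # One-hot input: the input term is simply column `col` of w_xh.
            h = [
                math.tanh(
                    self.w_xh[i][col]
                    + sum(self.w_hh[i][j] * h[j] for j in range(self.hidden_size))
                )
                for i in range(self.hidden_size)
            ]
        logits = [sum(w * v for w, v in zip(row, h)) for row in self.w_hy]
        peak = max(logits)  # subtract the max for a numerically stable softmax
        exps = [math.exp(z - peak) for z in logits]
        total = sum(exps)
        return [e / total for e in exps]

    def predict(self, sentence):
        """Return the region with the highest probability."""
        probs = self.forward(sentence)
        return REGIONS[probs.index(max(probs))]


def accuracy(predictions, gold):
    """Fraction of predictions that match the gold labels."""
    return sum(p == g for p, g in zip(predictions, gold)) / len(gold)


if __name__ == "__main__":
    # Toy placeholder strings, not real dialect data.
    sentences = ["abc cab", "bca abc"]
    model = TinyCharRNN(vocab=sorted(set("".join(sentences))))
    for text in sentences:
        print(text, model.predict(text), model.forward(text))
```

Because the weights are random, the predictions here are meaningless; the sketch only shows the shape of the computation (a hidden state updated per input symbol, followed by a four-way softmax). A trained system would learn the weights from labeled sentences and would more likely use a deep learning library and a gated recurrent layer.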