Hamed Ramadan, Mohammad M. Alqahtani, Abdullah Algoson
2022 20th International Conference on Language Engineering (ESOLEC), published 2022-10-12. DOI: 10.1109/ESOLEC54569.2022.10009555 (https://doi.org/10.1109/ESOLEC54569.2022.10009555)
Identifying Equivalent Words from Different Arabic Dialects Using Deep Learning Techniques
The Arabic language comprises many spoken dialects. These dialects differ from the written standard, Modern Standard Arabic (MSA), in syntax, lexicon, phonology, and morphology. Arabic dialects vary not only along a geographical continuum but also with sociolinguistic factors such as the urban, rural, and Bedouin dimension. Dialectal Arabic (DA) is now the main written language of informal communication in the Arab world, appearing on social media platforms such as Twitter, in emails, and elsewhere. Research interest in computational models of Arabic dialects has been high over the last decade. Most of these studies focus on Arabic dialect identification (classification) and on building Arabic dialect corpora. However, finding the synonyms of a word from one Arabic dialect in other Arabic dialects has received limited attention. To bridge this gap, this study develops a model that identifies equivalent words across dialects of the Arab world using deep learning techniques such as word2vec. The research merged and extended existing Arabic dialect corpora and then applied several deep learning techniques to obtain the best results for dialectal word synonyms. The outcomes of this research are a new dataset of Arabic dialectal word synonyms and a model that achieves an accuracy of 81%.
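The idea of looking up an equivalent word through word embeddings can be illustrated with a minimal sketch. It is not the authors' pipeline: the three-dimensional vectors, the dialect labels, the example words, and the function name `equivalent_word` are all assumptions made for illustration. In a real setting the vectors would come from a word2vec model trained on a merged dialect corpus, and the lookup would rank candidates in the target dialect by cosine similarity.

```python
import math

# Toy vectors invented for illustration only (hypothetical data).
# Real vectors would be read from a trained word2vec model.
EMBEDDINGS = {
    ("egyptian", "دلوقتي"): [0.90, 0.10, 0.05],   # "now"
    ("levantine", "هلأ"): [0.85, 0.15, 0.10],      # "now"
    ("gulf", "الحين"): [0.88, 0.05, 0.12],         # "now"
    ("egyptian", "عايز"): [0.10, 0.90, 0.20],      # "want"
    ("levantine", "بدي"): [0.12, 0.85, 0.25],      # "want"
    ("gulf", "أبغى"): [0.05, 0.92, 0.15],          # "want"
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def equivalent_word(word, source_dialect, target_dialect, embeddings=EMBEDDINGS):
    """Return (best_word, similarity) for the target-dialect word whose
    vector is closest to the source word, or None if no candidate exists."""
    query = embeddings[(source_dialect, word)]
    candidates = [
        (cosine(query, vec), w)
        for (dialect, w), vec in embeddings.items()
        if dialect == target_dialect
    ]
    if not candidates:
        return None
    score, best = max(candidates)
    return best, score


if __name__ == "__main__":
    # Look up the Gulf counterpart of the Egyptian word for "now".
    print(equivalent_word("دلوقتي", "egyptian", "gulf"))
```

Restricting the candidates to the target dialect is a design choice of this sketch; it assumes every word in the vocabulary carries a dialect label, which a merged corpus would need to provide.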