Workshop on Arabic Natural Language Processing: Latest Publications

Domain Adaptation for Arabic Crisis Response
Workshop on Arabic Natural Language Processing (WANLP 2022) · DOI: 10.18653/v1/2022.wanlp-1.23
Reem AlRashdi, Simon E. M. O'Keefe
Abstract: Deep learning algorithms can identify related tweets to reduce the information overload that prevents humanitarian organisations from using valuable Twitter posts. However, they rely heavily on human-labelled data, which are unavailable for emerging crises. Because each crisis has its own features, such as location, time and social media response, current models are known to struggle to generalise to unseen disaster events when pre-trained on past ones. Tweet classifiers for low-resource languages like Arabic have the additional issue of limited labelled data duplicates, caused by the absence of good language resources. We therefore propose a novel domain adaptation approach that employs distant supervision to automatically label tweets from emerging Arabic crisis events, which are then used to train a model along with the available human-labelled data. We evaluate our work on data from seven 2018–2020 Arabic events covering different crisis types (flood, explosion, virus and storm). Results show that our method outperforms self-training in identifying crisis-related tweets in real-time scenarios and can be seen as a robust Arabic tweet classifier.
Citations: 0
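As a rough illustration of the distant-supervision idea in this abstract, the sketch below weakly labels unlabelled tweets with a hypothetical keyword lexicon and mixes them with human-labelled data before training a generic classifier; the lexicon, toy data, and classifier are placeholders, not the authors' actual pipeline.

```python
# Sketch: distant supervision for crisis tweet filtering (illustrative only).
# CRISIS_KEYWORDS and the tweet lists are hypothetical placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

CRISIS_KEYWORDS = {"فيضان", "انفجار", "عاصفة", "إصابات"}  # hypothetical Arabic crisis terms

def distant_label(tweet: str) -> int:
    """Weak label: 1 if the tweet mentions any crisis keyword, else 0."""
    return int(any(kw in tweet for kw in CRISIS_KEYWORDS))

def build_training_set(human_texts, human_labels, unlabelled_texts):
    """Combine human-labelled data with distantly-supervised labels."""
    weak_labels = [distant_label(t) for t in unlabelled_texts]
    return human_texts + unlabelled_texts, human_labels + weak_labels

texts, labels = build_training_set(
    ["هناك فيضان في المدينة"], [1],                      # toy human-labelled tweet
    ["انفجار قرب السوق", "مباراة كرة القدم اليوم"],      # toy unlabelled tweets
)
clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(texts, labels)
print(clf.predict(["عاصفة قوية تضرب الساحل"]))
```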
SI2M & AIOX Labs at WANLP 2022 Shared Task: Propaganda Detection in Arabic, A Data Augmentation and Name Entity Recognition Approach
Workshop on Arabic Natural Language Processing (WANLP 2022) · DOI: 10.18653/v1/2022.wanlp-1.58
Kamel Gaanoun, Imade Benelallam
Abstract: This paper presents the SI2M & AIOX Labs submission to the shared task on propaganda detection in Arabic text. The objective of this challenge is to identify the propaganda techniques used in specific propaganda fragments. We use a combination of data augmentation, Named Entity Recognition, rule-based repetition detection, and ARBERT prediction to develop our system. Our model scored a micro F1-score of 0.585 and ranked 6th out of 12 teams.
Citations: 1
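The rule-based repetition detection mentioned in this abstract is not specified in detail here; the sketch below shows one plausible minimal rule that flags a fragment when a non-stopword token recurs. The stoplist and threshold are assumptions, not the authors' rule.

```python
# Illustrative rule-based "Repetition" detector (not the authors' exact rule):
# flag a fragment as Repetition if any content token appears at least twice.
import re
from collections import Counter

ARABIC_STOPWORDS = {"في", "من", "على", "عن", "إلى", "و"}  # small hypothetical stoplist

def detect_repetition(text: str, min_count: int = 2) -> bool:
    tokens = [t for t in re.findall(r"\w+", text) if t not in ARABIC_STOPWORDS]
    counts = Counter(tokens)
    return any(c >= min_count for c in counts.values())

print(detect_repetition("الحرية الحرية الحرية مطلبنا"))  # True: repeated content token
print(detect_repetition("الحرية مطلبنا"))                # False
```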
TF-IDF or Transformers for Arabic Dialect Identification? ITFLOWS participation in the NADI 2022 Shared Task
Workshop on Arabic Natural Language Processing (WANLP 2022) · DOI: 10.18653/v1/2022.wanlp-1.42
Fouad Shammary, Yiyi Chen, Z. T. Kardkovács, Mehwish Alam, Haithem Afli
Abstract: This study targets the shared task of Nuanced Arabic Dialect Identification (NADI), organized with the Workshop on Arabic Natural Language Processing (WANLP), and focuses on Subtask 1, the identification of Arabic dialects at the country level. More specifically, it studies the impact of a traditional approach such as TF-IDF and then moves on to advanced deep-learning-based methods. These methods include fully fine-tuning MARBERT as well as adapter-based fine-tuning of MARBERT with and without data augmentation. The evaluation shows that the traditional TF-IDF approach scores best in terms of accuracy on the TEST-A dataset, while the adapter-based fine-tuned MARBERT on augmented data scores second on macro F1-score on the TEST-B dataset. This led to the proposed system being ranked second on the shared task on average.
Citations: 2
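A minimal sketch of a TF-IDF baseline of the kind compared in this paper, using character n-grams and a linear SVM; the feature settings, classifier choice, and toy data are assumptions rather than the authors' exact configuration.

```python
# TF-IDF baseline sketch for country-level dialect identification (illustrative).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

train_texts = ["شلونك اليوم", "ازيك عامل ايه", "كيف داير"]   # toy tweets
train_labels = ["Iraq", "Egypt", "Morocco"]                  # toy country labels

baseline = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),  # character n-grams often help with dialects
    LinearSVC(),
)
baseline.fit(train_texts, train_labels)
print(baseline.predict(["ازيك يا صاحبي"]))
```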
Building an Ensemble of Transformer Models for Arabic Dialect Classification and Sentiment Analysis
Workshop on Arabic Natural Language Processing (WANLP 2022) · DOI: 10.18653/v1/2022.wanlp-1.53
Abdullah Salem Khered, Ingy Yasser Hassan Abdou Abdelhalim, R. Batista-Navarro
Abstract: In this paper, we describe the approaches we developed for the Nuanced Arabic Dialect Identification (NADI) 2022 shared task, which consists of two subtasks: the identification of country-level Arabic dialects and sentiment analysis. Our team, UniManc, developed approaches to the two subtasks which are underpinned by the same model: a pre-trained MARBERT language model. For Subtask 1, we applied undersampling to create versions of the training data with a balanced distribution across classes. For Subtask 2, we further trained the original MARBERT model on the masked language modelling objective using a NADI-provided dataset of unlabelled Arabic tweets. For each subtask, a MARBERT model was fine-tuned for sequence classification using different values for hyperparameters such as the random seed and the learning rate. This resulted in multiple model variants, which formed the basis of an ensemble model for each subtask. Based on the official NADI evaluation, our ensemble model obtained a macro-F1-score of 26.863, ranking second overall in the first subtask. In the second subtask, our ensemble model also ranked second, obtaining a macro-F1-PN score (macro-averaged F1-score over the Positive and Negative classes) of 73.544.
Citations: 5
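A sketch of one common way to ensemble several fine-tuned checkpoints, averaging their softmax outputs; the checkpoint paths below are hypothetical and the paper's exact combination rule may differ.

```python
# Sketch: ensembling fine-tuned checkpoints by averaging softmax probabilities.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

checkpoints = ["marbert-seed1", "marbert-seed2", "marbert-seed3"]  # hypothetical local checkpoint dirs

def ensemble_predict(text: str) -> int:
    probs = []
    for ckpt in checkpoints:
        tok = AutoTokenizer.from_pretrained(ckpt)
        model = AutoModelForSequenceClassification.from_pretrained(ckpt)
        model.eval()
        inputs = tok(text, return_tensors="pt", truncation=True)
        with torch.no_grad():
            logits = model(**inputs).logits
        probs.append(torch.softmax(logits, dim=-1))
    mean_probs = torch.stack(probs).mean(dim=0)  # average over ensemble members
    return int(mean_probs.argmax(dim=-1))        # index of the predicted class
```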
TUB at WANLP22 Shared Task: Using Semantic Similarity for Propaganda Detection in Arabic
Workshop on Arabic Natural Language Processing (WANLP 2022) · DOI: 10.18653/v1/2022.wanlp-1.57
Salar Mohtaj, Sebastian Möller
Abstract: Propaganda and the spreading of fake news through social media have become a serious problem in recent years. In this paper we present our approach for the shared task on propaganda detection in Arabic, in which the goal is to identify propaganda techniques in Arabic social media text. We propose a semantic similarity detection model that compares each text in the test set with the sentences in the training set to find the most similar instances. The label of the target text is taken from the most similar texts in the training set. The proposed model obtained a micro F1 score of 0.494 on the test data set.
Citations: 3
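A minimal sketch of nearest-neighbour labelling by semantic similarity, as described in the abstract, using a sentence-embedding model; the specific embedding model and toy data are assumptions, not necessarily what the authors used.

```python
# Sketch: label a test fragment with the label of its most similar training fragment.
from sentence_transformers import SentenceTransformer, util

train_texts = ["نص دعائي أول", "نص عادي"]            # toy training fragments
train_labels = ["Loaded_Language", "no_technique"]   # toy technique labels

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")  # assumed multilingual encoder
train_emb = model.encode(train_texts, convert_to_tensor=True)

def predict(text: str) -> str:
    query_emb = model.encode(text, convert_to_tensor=True)
    scores = util.cos_sim(query_emb, train_emb)      # 1 x N cosine similarity matrix
    return train_labels[int(scores.argmax())]        # copy the nearest neighbour's label

print(predict("نص دعائي جديد"))
```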
Generating Classical Arabic Poetry using Pre-trained Models
Workshop on Arabic Natural Language Processing (WANLP 2022) · DOI: 10.18653/v1/2022.wanlp-1.6
Nehal Elkaref, Mervat Abu-Elkheir, Maryam ElOraby, Mohamed Abdelgaber
Abstract: Poetry generation tends to be a complicated task given meter and rhyme constraints. Previous work resorted to exhaustive methods in order to employ poetic elements. In this paper we rely on the pre-trained models GPT-J and BERTShared to recognize patterns of meter and rhyme when generating classical Arabic poetry, and we present our findings and results on how well both models pick up on these classical Arabic poetic elements.
Citations: 1
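For context, continuing a verse with a pre-trained causal language model typically looks like the sketch below; the checkpoint name is a hypothetical placeholder (the paper itself uses GPT-J and BERTShared), and no claim is made about the authors' decoding settings.

```python
# Sketch: continuing a classical verse with a causal language model (illustrative).
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "arabic-poetry-gptj"   # hypothetical fine-tuned checkpoint name
tok = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

prompt = "ألا ليت الشباب يعود يوماً"
inputs = tok(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=30, do_sample=True, top_p=0.9)
print(tok.decode(output[0], skip_special_tokens=True))
```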
Arabic Keyphrase Extraction: Enhancing Deep Learning Models with Pre-trained Contextual Embedding and External Features
Workshop on Arabic Natural Language Processing (WANLP 2022) · DOI: 10.18653/v1/2022.wanlp-1.30
Randah Alharbi, H. Al-Muhtaseb
Abstract: Keyphrase extraction is essential to many information retrieval (IR) and natural language processing (NLP) tasks such as summarization and indexing. This study investigates deep learning approaches to Arabic keyphrase extraction. We address the problem as sequence classification and create a Bi-LSTM model that classifies each token as either part of a keyphrase or outside of it. We extract word embeddings from two pre-trained models, Word2Vec and BERT, and investigate the effect of incorporating linguistic, positional, and statistical features alongside the word embeddings on performance. Our best-performing model achieved an F1-score of 0.45 on the ArabicKPE dataset when combining linguistic and positional features with BERT embeddings.
Citations: 0
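A minimal sketch of a Bi-LSTM token classifier that concatenates pre-computed contextual embeddings with external feature vectors, as the abstract describes; the dimensions and the two-label (keyphrase vs. outside) scheme are illustrative assumptions.

```python
# Sketch: Bi-LSTM over contextual embeddings concatenated with external features.
import torch
import torch.nn as nn

class KeyphraseTagger(nn.Module):
    def __init__(self, emb_dim=768, feat_dim=8, hidden=128, n_labels=2):
        super().__init__()
        self.lstm = nn.LSTM(emb_dim + feat_dim, hidden, bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * hidden, n_labels)      # keyphrase vs. outside

    def forward(self, embeddings, features):
        x = torch.cat([embeddings, features], dim=-1)   # (batch, seq_len, emb_dim + feat_dim)
        h, _ = self.lstm(x)
        return self.out(h)                              # per-token logits

# Toy forward pass: one sentence of five tokens with random vectors.
tagger = KeyphraseTagger()
logits = tagger(torch.randn(1, 5, 768), torch.randn(1, 5, 8))
print(logits.shape)  # torch.Size([1, 5, 2])
```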
Event-Based Knowledge MLM for Arabic Event Detection
Workshop on Arabic Natural Language Processing (WANLP 2022) · DOI: 10.18653/v1/2022.wanlp-1.25
Asma Z. Yamani, Amjad K Alsulami, Rabeah Al-Zaidy
Abstract: With the fast pace of reporting around the globe from various sources, event extraction has increasingly become an important task in NLP. Pre-trained language models (PTMs) are now widely used to provide contextual representations for downstream tasks. This work aims to pre-train language models that improve event extraction accuracy. To this end, we propose an Event-Based Knowledge (EBK) masking approach that masks the most significant terms for the event detection task. These significant terms are drawn from an external knowledge source curated for Arabic event detection. The proposed approach improves classification accuracy for all nine event types. The experimental results demonstrate the effectiveness of the proposed masking approach and encourage further exploration.
Citations: 0
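One way to realise knowledge-based masking of event terms during MLM data preparation is sketched below; the event lexicon, masking rates, and mask-token handling are assumptions, not the paper's exact policy.

```python
# Sketch: preferentially masking event-indicative terms for MLM pre-training (illustrative).
import random

EVENT_TERMS = {"انفجار", "زلزال", "احتجاج"}   # hypothetical event lexicon
MASK_TOKEN = "[MASK]"

def ebk_mask(tokens, random_rate=0.15):
    """Mask every event term; mask other tokens at a low random rate."""
    masked = []
    for tok in tokens:
        if tok in EVENT_TERMS or random.random() < random_rate:
            masked.append(MASK_TOKEN)
        else:
            masked.append(tok)
    return masked

print(ebk_mask("وقع انفجار كبير في المدينة".split()))
```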
Cross-lingual transfer for low-resource Arabic language understanding
Workshop on Arabic Natural Language Processing (WANLP 2022) · DOI: 10.18653/v1/2022.wanlp-1.21
Khadige Abboud, O. Yu. Golovneva, C. Dipersio
Abstract: This paper explores cross-lingual transfer learning in natural language understanding (NLU), focusing on bootstrapping Arabic from the high-resource English and French languages for domain classification, intent classification, and named entity recognition tasks. We adopt a BERT-based architecture and pre-train three models using open-source Wikipedia data and large-scale commercial datasets: a monolingual Arabic model, a bilingual Arabic-English model, and a trilingual Arabic-English-French model. Additionally, we use an off-the-shelf machine translator to translate internal data from the source English language to the target Arabic language, in an effort to enhance transfer learning through translation. We conduct experiments that fine-tune the three models for NLU tasks and evaluate them on a large internal dataset. Despite the morphological, orthographical, and grammatical differences between Arabic and the source languages, transfer learning performance gains from the source languages and from machine translation are achieved on a real-world Arabic test dataset, both in a zero-shot setting and in a setting where the models are further fine-tuned on labelled data from the target language.
Citations: 2
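The translate-then-train augmentation described in this abstract can be sketched as a simple data-preparation step; the translate callable stands in for whatever off-the-shelf machine translator is used, and the toy NLU examples are assumptions.

```python
# Sketch: augmenting scarce Arabic NLU data with machine-translated English data.
from typing import Callable, List, Tuple

def build_arabic_training_set(
    english_data: List[Tuple[str, str]],   # (utterance, intent_label) pairs
    arabic_data: List[Tuple[str, str]],
    translate: Callable[[str], str],       # en -> ar translator (assumed external service)
) -> List[Tuple[str, str]]:
    """Combine native Arabic data with translated English utterances, keeping labels."""
    translated = [(translate(text), label) for text, label in english_data]
    return arabic_data + translated

# Toy usage with an identity "translator" just to show the data flow.
combined = build_arabic_training_set(
    [("play some music", "PlayMusic")],
    [("شغل الموسيقى", "PlayMusic")],
    translate=lambda s: s,
)
print(len(combined))  # 2
```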
Establishing a Baseline for Arabic Patents Classification: A Comparison of Twelve Approaches
Workshop on Arabic Natural Language Processing (WANLP 2022) · DOI: 10.18653/v1/2022.wanlp-1.26
Taif Omar Al-Omar, H. Al-Khalifa, Rawan N. Al-Matham
Abstract: The number of patent applications is constantly growing, and there is an economic interest in developing accurate and fast models to automate their classification. In this paper, we introduce the first public Arabic patent dataset, called ArPatent, and experiment with twelve classification approaches to establish a baseline for Arabic patent classification. To find the best baseline for classifying Arabic patents, we evaluated different machine learning, pre-trained language model, and ensemble approaches. From the obtained results, we observe that the best-performing model was ARBERT, with an F1 of 66.53%, while an ensemble of the three best-performing language models, namely ARBERT, CAMeL-MSA, and QARiB, achieved the second-best F1 score of 64.52%.
Citations: 0
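A sketch of combining three classifiers' outputs by majority vote; whether the paper combines ARBERT, CAMeL-MSA, and QARiB this way or by another rule is not stated here, so the voting scheme and toy labels are assumptions.

```python
# Sketch: label-level majority voting over three classifiers (illustrative).
from collections import Counter
from typing import List

def majority_vote(predictions: List[List[str]]) -> List[str]:
    """predictions[m][i] is model m's predicted label for document i."""
    voted = []
    for labels in zip(*predictions):                      # iterate over documents
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted

arbert_preds = ["A61", "C07", "G06"]                      # toy patent-class labels
camel_preds  = ["A61", "B01", "G06"]
qarib_preds  = ["H04", "C07", "G06"]
print(majority_vote([arbert_preds, camel_preds, qarib_preds]))  # ['A61', 'C07', 'G06']
```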