Mawdoo3 AI at MADAR Shared Task: Arabic Fine-Grained Dialect Identification with Ensemble Learning

WANLP@ACL 2019 Pub Date : 2019-08-01 DOI:10.18653/v1/W19-4630

A. Ragab, Haitham Seelawi, Mostafa Samir, Abdelrahman Mattar, Hesham Al-Bataineh, Mohammad Zaghloul, Ahmad Mustafa, Bashar Talafha, Abed Alhakim Freihat, Hussein T. Al-Natsheh

Citations: 12

Abstract

In this paper, we discuss several models we used to classify 25 city-level Arabic dialects, in addition to Modern Standard Arabic (MSA), as part of the MADAR shared task (sub-task 1). We propose an ensemble of the best-performing classifiers, selected experimentally, trained on a varied set of features. Our system achieves a macro F1-score of 69.3% on the DEV dataset, an improvement of 1.4% in accuracy over the baseline model. Our best submitted run ranked third out of 19 participating teams on the TEST dataset, only 0.12% macro F1-score behind the top-ranked system.
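The abstract does not say how the individual classifiers are combined, so the following is only a minimal sketch of one common approach (soft voting, i.e. averaging per-label probabilities), together with the macro F1-score used for evaluation. The function names `soft_vote` and `macro_f1` and the labels `"CAI"` and `"MSA"` are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict


def soft_vote(prob_dicts):
    """Combine classifiers by averaging their per-label probabilities.

    prob_dicts: one {label: probability} dict per classifier (hypothetical
    input format). Returns the label with the highest mean probability.
    """
    totals = defaultdict(float)
    for probs in prob_dicts:
        for label, p in probs.items():
            totals[label] += p / len(prob_dicts)
    return max(totals, key=totals.get)


def macro_f1(gold, pred):
    """Unweighted mean of per-label F1 over all labels seen in gold or pred."""
    labels = set(gold) | set(pred)
    scores = []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the label never occurs
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Two hypothetical classifiers disagree; the averaged probabilities decide.
winner = soft_vote([{"CAI": 0.6, "MSA": 0.4}, {"CAI": 0.3, "MSA": 0.7}])
score = macro_f1(["a", "a", "b", "b"], ["a", "b", "b", "b"])
```

Because macro F1 weights every label equally, a rarely predicted dialect affects the score as much as a frequent one, which makes it a natural metric for a 26-way classification task like this one.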


