Arabic Text Classification: A Literature Review

2021 IEEE/ACS 18th International Conference on Computer Systems and Applications (AICCSA). Pub Date: 2021-11-01. DOI: 10.1109/AICCSA53542.2021.9686874

Bilel Elayeb

Citations: 2

Abstract

Automatic text classification, or categorization, consists of assigning predefined classes or categories to a given set of text documents, with the aim of organizing the document collection according to conceptual views. Although there are many text classifiers in the literature, most of them are assessed using text collections in English or other non-Arabic languages. The lack of a large available collection in the Arabic language is one of the most important challenges facing the small number of existing Arabic text classifiers (ATC). In this paper, we present a literature review in the domain of Arabic text classification. We first give an overview of ATC based on machine learning (ML) algorithms. Then, we investigate ATC based on deep learning techniques, as well as a set of other classifiers based on non-ML algorithms. The assessment of these ATC is also discussed. Finally, we focus on some open problems and suggest some future directions.

