A Comparative Study of Pre-trained Word Embeddings for Arabic Sentiment Analysis
Mohamed Zouidine, Mohammed Khalil
2022 IEEE 46th Annual Computers, Software, and Applications Conference (COMPSAC), June 2022
DOI: 10.1109/COMPSAC54236.2022.00196
Citations: 0
Abstract
In this paper, we conduct a series of experiments to systematically study both context-independent and context-dependent word embeddings for Arabic sentiment analysis. We use pre-trained word embeddings as fixed feature extractors to provide input features for a CNN model. Experimental results on two different Arabic sentiment analysis datasets indicate that the pre-trained contextualized AraBERT model is the most suitable for such tasks. AraBERT reaches accuracy scores of 91.4% and 95.49% on the large Arabic book reviews dataset (LABR) and the hotel Arabic-reviews dataset (HARD), respectively.
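The pipeline the abstract describes (frozen pre-trained embeddings feeding a CNN classifier) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the embedding table, filter shapes, and the single sigmoid output are all hypothetical stand-ins for whatever pre-trained model (word2vec, fastText, or AraBERT token representations) and CNN architecture the paper actually uses.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical frozen embedding table standing in for a pre-trained model.
# In the paper's setup these vectors come from a pre-trained embedding and
# are used as fixed feature extractors, i.e. never updated during training.
vocab_size, embed_dim = 1000, 32
embeddings = rng.standard_normal((vocab_size, embed_dim))

def cnn_sentiment_score(token_ids, filters, bias, w_out, b_out):
    """1D convolution over frozen embeddings -> ReLU -> global
    max-pooling -> sigmoid, yielding P(positive sentiment)."""
    x = embeddings[token_ids]          # (seq_len, embed_dim), frozen lookup
    k = filters.shape[1]               # filter width in tokens
    # Slide each filter over the token sequence.
    conv = np.array([
        np.tensordot(x[i:i + k], filters, axes=([0, 1], [1, 2])) + bias
        for i in range(len(token_ids) - k + 1)
    ])                                 # (positions, n_filters)
    pooled = np.maximum(conv, 0).max(axis=0)   # ReLU + global max-pool
    z = pooled @ w_out + b_out
    return 1.0 / (1.0 + np.exp(-z))

# Randomly initialized CNN parameters (these WOULD be trained;
# only the embedding table stays fixed).
n_filters, width = 8, 3
filters = rng.standard_normal((n_filters, width, embed_dim)) * 0.1
bias = np.zeros(n_filters)
w_out = rng.standard_normal(n_filters) * 0.1
b_out = 0.0

# Score a dummy 20-token review.
score = cnn_sentiment_score(rng.integers(0, vocab_size, 20),
                            filters, bias, w_out, b_out)
```

Freezing the embeddings keeps the comparison fair across embedding types: only the CNN's filters and output layer are trained, so differences in accuracy can be attributed to the quality of the pre-trained representations rather than to fine-tuning.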