PREPROCESSING ARABIC DIALECT FOR SENTIMENT MINING: STATE OF ART

Zineb Nassr, N. Sael, F. Benabbou
Journal: ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, pp. 323-330
Published: 2020-11-23
DOI: 10.5194/ISPRS-ARCHIVES-XLIV-4-W3-2020-323-2020 (https://doi.org/10.5194/ISPRS-ARCHIVES-XLIV-4-W3-2020-323-2020)
Citations: 2

Abstract

Sentiment Analysis concerns the analysis of ideas, emotions, evaluations, values, attitudes and feelings about products, services, companies, individuals, tasks, events, titles and their characteristics. With the growth of applications on the Internet and social networks, Sentiment Analysis has become more crucial in text mining research and has been used to explore users’ opinions on various products or topics discussed on the Internet. Developments in Natural Language Processing and Computational Linguistics have contributed positively to Sentiment Analysis studies, especially for sentiments written in non-structured or semi-structured languages. In this paper, we present a literature review of the pre-processing task in sentiment analysis, together with an analytical and comparative study of research conducted on Arabic social networks. This study allowed us to conclude that several works have dealt with the generation of stop-word dictionaries. Two approaches are adopted: the manual one, which gives rise to a limited list, and the automatic one, where the list of stop words is extracted from social networks based on defined rules. For stemming, two algorithms have been proposed to isolate prefixes and suffixes from words in dialects. However, few works have addressed dialects directly, without translation. The Moroccan dialect in particular ranks fifth among the Arabic dialects studied, after the Jordanian, Egyptian, Tunisian and Algerian dialects. Despite the significant lack of studies on Arabic dialects, this comparative study allowed us to draw several conclusions about the difficulties and challenges encountered, as well as possible directions to pursue in any pre-processing solution for dialect sentiment analysis.
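The two pre-processing steps named in the abstract, automatic stop-word extraction and affix-stripping stemming, can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the surveyed works: the document-frequency and length rule in `extract_stop_words`, the `PREFIXES` and `SUFFIXES` lists, and the function names themselves.

```python
from collections import Counter

# Hypothetical affix lists for illustration only (not from the surveyed papers).
# Longer affixes come first so they are tried before their shorter substrings.
PREFIXES = ("وال", "بال", "كال", "فال", "ال", "لل", "و")
SUFFIXES = ("هم", "ها", "ات", "ون", "ين", "ة", "ي")


def extract_stop_words(documents, min_doc_ratio=0.6, max_length=4):
    """Assumed rule: a token is a stop word if it occurs in at least
    `min_doc_ratio` of the documents and has at most `max_length` characters."""
    doc_freq = Counter()
    for doc in documents:
        # Count each token once per document (document frequency).
        doc_freq.update(set(doc.split()))
    threshold = min_doc_ratio * len(documents)
    return {tok for tok, df in doc_freq.items()
            if df >= threshold and len(tok) <= max_length}


def light_stem(token, min_stem_length=3):
    """Strip at most one prefix and one suffix, never leaving fewer than
    `min_stem_length` characters."""
    for prefix in PREFIXES:
        if token.startswith(prefix) and len(token) - len(prefix) >= min_stem_length:
            token = token[len(prefix):]
            break
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= min_stem_length:
            token = token[:-len(suffix)]
            break
    return token


def preprocess(text, stop_words):
    """Whitespace-tokenize, drop stop words, then stem the remaining tokens."""
    return [light_stem(tok) for tok in text.split() if tok not in stop_words]


if __name__ == "__main__":
    posts = [
        "هاد الفيلم زوين بزاف",
        "هاد الماكلة خايبة بزاف",
        "هاد الخدمة زوينة",
    ]
    stop_words = extract_stop_words(posts, min_doc_ratio=0.9)
    print(stop_words)
    print(preprocess(posts[0], stop_words))
```

A manually curated list would simply replace the return value of `extract_stop_words` with a fixed set; the trade-off described in the abstract is that such a list stays limited, while a rule-based list depends on how well the chosen thresholds fit the corpus.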