Sentence-Level and Document-Level Sentiment Mining for Arabic Texts

N. Farra, Elie Challita, R. A. Assi, Hazem M. Hajj
Published in: 2010 IEEE International Conference on Data Mining Workshops, 2010-12-13
DOI: 10.1109/ICDMW.2010.95 (https://doi.org/10.1109/ICDMW.2010.95)
Citations: 186

Abstract

In this work, we investigate sentiment mining of Arabic text at both the sentence level and the document level. Existing research in Arabic sentiment mining remains very limited. For sentence-level classification, we investigate two approaches. The first is a novel grammatical approach that uses a general structure for the Arabic sentence. The second is based on the semantic orientation of words and their corresponding frequencies; to this end, we built an interactive learning semantic dictionary that stores the polarities of the roots of different words and identifies new polarities based on these roots. For document-level classification, we use sentences of known classes to classify whole documents, using a novel approach in which documents are divided dynamically into chunks and classification is based on the semantic contributions of the different chunks. This dynamic chunking approach can also be investigated for sentiment mining in other languages. Finally, we propose a hierarchical classification scheme that uses the results of the sentence-level classifier as input to the document-level classifier, an approach that has not previously been investigated for Arabic documents. We also pinpoint the challenges facing sentiment mining for Arabic texts and propose suggestions for its development. We demonstrate promising results with our sentence-level approach, and our document-level experiments show, with high accuracy, that it is feasible to extract the sentiment of an Arabic document from the classes of its sentences.
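
The hierarchical idea in the abstract (sentence classes feeding a chunk-based document classifier) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not specify the dictionary contents, the scoring rule, or how chunks are chosen dynamically, so everything here is an assumption: the lexicon `ROOT_POLARITY` holds English placeholder "roots", the functions `sentence_polarity`, `chunk`, and `document_polarity` are hypothetical names, and fixed contiguous near-equal chunks with a per-chunk vote stand in for the paper's dynamic chunking.

```python
from collections import Counter

# Hypothetical root-polarity dictionary. The entries are illustrative
# placeholders (not Arabic roots and not taken from the paper).
ROOT_POLARITY = {"good": 1, "love": 1, "bad": -1, "hate": -1}


def sentence_polarity(roots, lexicon=ROOT_POLARITY):
    """Label one sentence from the frequency-weighted polarity of its roots."""
    counts = Counter(roots)
    # Unknown roots contribute 0; known roots contribute polarity * frequency.
    score = sum(lexicon.get(root, 0) * n for root, n in counts.items())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def chunk(labels, n_chunks):
    """Split a list into n_chunks contiguous, near-equal parts.

    Assumption: a fixed split used only as a stand-in for dynamic chunking.
    """
    n_chunks = max(1, min(n_chunks, len(labels)))
    size, extra = divmod(len(labels), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(labels[start:end])
        start = end
    return chunks


def document_polarity(sentences, n_chunks=3):
    """Classify a document from the classes of its sentences, chunk by chunk."""
    labels = [sentence_polarity(s) for s in sentences]
    if not labels:
        return "neutral"
    votes = 0
    for part in chunk(labels, n_chunks):
        net = part.count("positive") - part.count("negative")
        votes += (net > 0) - (net < 0)  # each chunk casts +1, 0, or -1
    if votes > 0:
        return "positive"
    if votes < 0:
        return "negative"
    return "neutral"


# Each inner list holds the (already extracted) roots of one sentence.
doc = [["good", "love"], ["bad"], ["good"], ["hate", "bad"], ["love"], ["good"]]
print(document_polarity(doc))
```

In this sketch each chunk casts a single vote, so a long run of sentences with one polarity cannot outweigh the rest of the document on its own. That is one simple way to weigh "semantic contributions of different chunks"; the paper may weight chunks differently.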