Automatic Domain-Specific Sentiment Lexicon Generation with Label Propagation

Yen-Jen Tai, Hung-Yu kao
{"title":"Automatic Domain-Specific Sentiment Lexicon Generation with Label Propagation","authors":"Yen-Jen Tai, Hung-Yu kao","doi":"10.1145/2539150.2539190","DOIUrl":null,"url":null,"abstract":"Nowadays, the advance of social media has led to the explosive growth of opinion data. Therefore, sentiment analysis has attracted a lot of attentions. Currently, sentiment analysis applications are divided into two main approaches, the lexicon-based approach and the machine-learning approach. However, both of them face the challenge of obtaining a large amount of human-labeled training data and corpus. For the lexicon-based approach, it requires a sentiment lexicon to determine the opinion polarity. There are many existing benchmark sentiment lexicons, but they cannot cover all the domain-specific words meanings. Thus, automatic generation of a domain-specific sentiment lexicon becomes an important task. We propose a framework to automatically generate sentiment lexicon. First, we determine the semantic similarity between two words in the entire unlabeled corpus. We treat the words as nodes and similarities as weighted edges to construct word graphs. A graph-based semi-supervised label propagation method finally assigns the polarity to unlabeled words through the proposed propagation process. Experiments conducted on the microblog data, Twitter, show that our approach leads to a better performance than baseline approaches and general-purpose sentiment dictionaries.","PeriodicalId":424918,"journal":{"name":"International Conference on Information Integration and Web-based Applications & Services","volume":"104 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"55","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Conference on Information Integration and Web-based Applications & Services","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2539150.2539190","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 55

Abstract

Nowadays, the advance of social media has led to the explosive growth of opinion data. Therefore, sentiment analysis has attracted a lot of attentions. Currently, sentiment analysis applications are divided into two main approaches, the lexicon-based approach and the machine-learning approach. However, both of them face the challenge of obtaining a large amount of human-labeled training data and corpus. For the lexicon-based approach, it requires a sentiment lexicon to determine the opinion polarity. There are many existing benchmark sentiment lexicons, but they cannot cover all the domain-specific words meanings. Thus, automatic generation of a domain-specific sentiment lexicon becomes an important task. We propose a framework to automatically generate sentiment lexicon. First, we determine the semantic similarity between two words in the entire unlabeled corpus. We treat the words as nodes and similarities as weighted edges to construct word graphs. A graph-based semi-supervised label propagation method finally assigns the polarity to unlabeled words through the proposed propagation process. Experiments conducted on the microblog data, Twitter, show that our approach leads to a better performance than baseline approaches and general-purpose sentiment dictionaries.
基于标签传播的领域特定情感词典自动生成
如今,社交媒体的进步导致了意见数据的爆炸式增长。因此,情感分析引起了人们的广泛关注。目前,情感分析应用分为两种主要方法,基于词典的方法和机器学习方法。然而,两者都面临着获取大量人工标记训练数据和语料库的挑战。对于基于词典的方法,它需要一个情感词典来确定意见的极性。现有的基准情感词典很多,但它们不能涵盖所有特定领域的词义。因此,自动生成特定领域的情感词典成为一项重要的任务。提出了一种自动生成情感词汇的框架。首先,我们确定了整个未标记语料库中两个词之间的语义相似度。我们将单词作为节点,将相似度作为加权边来构建单词图。一种基于图的半监督标签传播方法最终通过所提出的传播过程为未标记的单词分配极性。在微博数据Twitter上进行的实验表明,我们的方法比基线方法和通用情感词典的性能更好。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信