Building thesaurus lexicon using dictionary-based approach for sentiment classification

S. Park, Yanggon Kim
2016 IEEE 14th International Conference on Software Engineering Research, Management and Applications (SERA)
Published: 2016-06-08
DOI: 10.1109/SERA.2016.7516126 (https://doi.org/10.1109/SERA.2016.7516126)
Citations: 34

Abstract

Sentiment classification categorizes the opinions that people express in data. People now routinely share their interests, feelings, and opinions on social media, so social media posts are frequently used as data for sentiment classification. One approach to sentiment classification is dictionary-based: a traditional dictionary-based classifier matches the words in a post against a sentiment lexicon. However, many posts cannot be analyzed this way because the sentiment words they contain are absent from the lexicon, so the lexicon must be expanded to cover those words. In this paper, we propose a method for building a thesaurus lexicon for sentiment classification using a dictionary-based approach. The proposed method collects thesaurus entries for a set of seed words from three online dictionaries and stores only the co-occurring words in the thesaurus lexicon, which improves the lexicon's reliability. The method also collects thesaurus entries, each a set of synonyms and antonyms, recursively in order to expand the thesaurus lexicon. This recursive collection expands the lexicon effectively from a small seed set without human effort, and the expanded thesaurus lexicon increases both the number of posts that can be analyzed and the accuracy of sentiment classification.
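The recursive expansion described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the three dictionaries are hypothetical in-memory stand-ins rather than real online services, "co-occurring" is assumed to mean that at least `min_agree` of the dictionaries return the same word, antonyms are assumed to receive the opposite polarity of the word they were found from, and the `max_depth` cap is an added safeguard that the abstract does not mention.

```python
import re
from collections import Counter, deque

# Hypothetical in-memory stand-ins for three online dictionaries.
# Each maps a word to (synonyms, antonyms); the entries are illustrative only.
DICT_A = {
    "good": ({"great", "fine", "nice"}, {"bad", "poor"}),
    "great": ({"good", "excellent", "superb"}, {"terrible"}),
    "bad": ({"poor", "terrible", "awful"}, {"good"}),
}
DICT_B = {
    "good": ({"great", "nice", "decent"}, {"bad"}),
    "great": ({"excellent", "good"}, {"terrible", "awful"}),
    "bad": ({"terrible", "awful"}, {"good", "fine"}),
}
DICT_C = {
    "good": ({"great", "fine"}, {"bad", "evil"}),
    "great": ({"excellent", "grand"}, {"awful"}),
    "bad": ({"poor", "awful"}, {"nice"}),
}
DICTIONARIES = [DICT_A, DICT_B, DICT_C]


def agreed_words(word_sets, min_agree):
    """Return words that appear in at least `min_agree` of the given sets."""
    counts = Counter(w for s in word_sets for w in s)
    return sorted(w for w, c in counts.items() if c >= min_agree)


def expand_lexicon(seeds, dictionaries, min_agree=2, max_depth=2):
    """Grow a {word: polarity} lexicon breadth-first from the seed words."""
    lexicon = dict(seeds)
    queue = deque((word, 0) for word in sorted(seeds))
    while queue:
        word, depth = queue.popleft()
        if depth >= max_depth:
            continue  # assumed depth cap; not described in the abstract
        entries = [d.get(word, (set(), set())) for d in dictionaries]
        synonyms = agreed_words([e[0] for e in entries], min_agree)
        antonyms = agreed_words([e[1] for e in entries], min_agree)
        polarity = lexicon[word]
        # Assumption: synonyms inherit the polarity, antonyms get the opposite.
        candidates = [(s, polarity) for s in synonyms]
        candidates += [(a, -polarity) for a in antonyms]
        for new_word, new_polarity in candidates:
            if new_word not in lexicon:  # first assignment wins
                lexicon[new_word] = new_polarity
                queue.append((new_word, depth + 1))
    return lexicon


def classify(post, lexicon):
    """Word-matching classifier; returns None when no lexicon word matches."""
    tokens = re.findall(r"[a-z']+", post.lower())
    scores = [lexicon[t] for t in tokens if t in lexicon]
    if not scores:
        return None
    total = sum(scores)
    return "positive" if total > 0 else "negative" if total < 0 else "neutral"


SEEDS = {"good": 1}
expanded = expand_lexicon(SEEDS, DICTIONARIES)
print(classify("The service was awful", SEEDS))     # no lexicon word matches
print(classify("The service was awful", expanded))  # "awful" is now covered
```

With the seed lexicon alone the example post contains no known sentiment word and cannot be classified, whereas the expanded lexicon covers "awful". Words returned by only one of the toy dictionaries (such as "decent") are filtered out, which is the role the abstract assigns to the co-occurrence check.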