Efficient sentiment classification of Twitter feeds

2016 IEEE International Conference on Knowledge Engineering and Applications (ICKEA). Pub Date: 2016-09-01. DOI: 10.1109/ICKEA.2016.7802996

Nicholas Chamansingh, Patrick Hosein

Citations: 6

Abstract

Sentiment analysis encompasses the use of Natural Language Processing together with statistics and machine learning methods for the identification, extraction and characterization of sentiment elements from a body of text. Micro-blog platforms, such as Twitter, allow millions of users to share real-time comments and opinions on various topics. This research presents an experiment to determine an efficient sentiment classifier for real-time Twitter feeds. Naive Bayes, Support Vector Machine (SVM) and Maximum Entropy (MaxEnt) classification methods were compared. For each approach we used the same pre-processing and feature selection methods. Chi-Square feature selection was used to determine the smallest feature set and training data size needed for a classifier with a given accuracy level, storage requirements and classification time. Results show that, when compared to previous work, a significant reduction in data input and processing can be achieved while maintaining an acceptable level of accuracy.
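
The Chi-Square feature selection step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' implementation, so everything here is an assumption made for illustration: the function names `chi_square_scores` and `select_top_k` are hypothetical, the features are binary term presence per tweet, the task is two-class (positive versus everything else), and the four "tweets" are toy data. Each term is scored with the standard 2x2 chi-square statistic, N(AD - BC)^2 / ((A+B)(C+D)(A+C)(B+D)), and the highest-scoring terms are kept.

```python
from collections import Counter


def chi_square_scores(docs, labels, positive="pos"):
    """Score each term by the 2x2 chi-square statistic of
    (term present in document?) versus (label == positive)."""
    n = len(docs)
    n_pos = sum(1 for y in labels if y == positive)
    n_neg = n - n_pos

    # Document frequencies per class; set() counts a term once per document.
    pos_df, neg_df = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        (pos_df if y == positive else neg_df).update(set(tokens))

    scores = {}
    for term in set(pos_df) | set(neg_df):
        a = pos_df[term]   # positive documents containing the term
        b = neg_df[term]   # other documents containing the term
        c = n_pos - a      # positive documents without the term
        d = n_neg - b      # other documents without the term
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_top_k(scores, k):
    """Return the k highest-scoring terms, ties broken alphabetically."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Toy, hand-made data (not from the paper): pre-tokenized tweets and labels.
docs = [["good", "phone"], ["good", "day"], ["bad", "phone"], ["bad", "day"]]
labels = ["pos", "pos", "neg", "neg"]

scores = chi_square_scores(docs, labels)
selected = select_top_k(scores, 2)
```

On this toy data, "good" and "bad" separate the classes perfectly and receive the maximum score, while "phone" and "day" appear equally often in both classes and score zero. Shrinking `k` is one way to trade feature-set size against accuracy, which is the kind of trade-off the abstract describes.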



