Efficient sentiment classification of Twitter feeds

2016 IEEE International Conference on Knowledge Engineering and Applications (ICKEA). Pub Date: 2016-09-01. DOI: 10.1109/ICKEA.2016.7802996

Nicholas Chamansingh, Patrick Hosein


Citations: 6

Abstract

Sentiment analysis encompasses the use of Natural Language Processing together with statistics and machine learning methods for the identification, extraction and characterization of sentiment elements from a body of text. Micro-blog platforms, such as Twitter, allow millions of users to share real-time comments and opinions on various topics. This research presents an experiment to determine an efficient sentiment classifier for real-time Twitter feeds. Naive Bayes, Support Vector Machine (SVM) and Maximum Entropy (MaxEnt) classification methods were compared. For each approach we used the same pre-processing and feature selection methods. Chi-Square feature selection was used to determine the smallest feature set and training data size needed for a classifier with a given accuracy level, storage requirements and classification time. Results show that, when compared to previous work, a significant reduction in data input and processing can be achieved while maintaining an acceptable level of accuracy.
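The abstract does not give the authors' pre-processing steps, data set, or classifier implementations, so the following is only a minimal sketch of the general idea: score terms with the 2x2 Chi-Square statistic, keep the top-scoring ones, and train a classifier on that reduced vocabulary. The toy tweets, the whitespace tokenizer, the function names, and the choice of a hand-written multinomial Naive Bayes with add-one smoothing are all assumptions made for illustration, not details taken from the paper.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: lowercase whitespace split stands in for real tweet pre-processing.
    return text.lower().split()

def chi_square_scores(docs, labels, positive="pos"):
    """Chi-Square score of each term against a binary label, using term presence per document."""
    n = len(docs)
    n_pos = sum(1 for y in labels if y == positive)
    n_neg = n - n_pos
    pos_df, neg_df = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        (pos_df if y == positive else neg_df).update(set(tokens))
    scores = {}
    for term in set(pos_df) | set(neg_df):
        a, b = pos_df[term], neg_df[term]   # docs containing the term (pos, neg)
        c, d = n_pos - a, n_neg - b         # docs without the term (pos, neg)
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores

def select_top_k(scores, k):
    # Ties are broken alphabetically so the selection is deterministic.
    return [t for t, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]]

def train_naive_bayes(docs, labels, vocab):
    """Multinomial Naive Bayes with add-one smoothing, restricted to the selected vocabulary."""
    vocab = set(vocab)
    classes = sorted(set(labels))
    prior = {c: math.log(labels.count(c) / len(labels)) for c in classes}
    counts = {c: Counter() for c in classes}
    for tokens, y in zip(docs, labels):
        counts[y].update(t for t in tokens if t in vocab)
    loglik = {}
    for c in classes:
        total = sum(counts[c].values()) + len(vocab)
        loglik[c] = {t: math.log((counts[c][t] + 1) / total) for t in vocab}
    return prior, loglik

def predict(model, text):
    prior, loglik = model
    tokens = tokenize(text)
    def score(c):
        # Terms outside the selected vocabulary are ignored.
        return prior[c] + sum(loglik[c][t] for t in tokens if t in loglik[c])
    return max(prior, key=score)

# Hypothetical toy data, not from the paper.
tweets = [
    ("i love this phone", "pos"),
    ("great service love it", "pos"),
    ("what a great day", "pos"),
    ("i hate this phone", "neg"),
    ("terrible service hate it", "neg"),
    ("what a terrible day", "neg"),
]
docs = [tokenize(text) for text, _ in tweets]
labels = [label for _, label in tweets]

scores = chi_square_scores(docs, labels)
vocab = select_top_k(scores, k=4)
model = train_naive_bayes(docs, labels, vocab)

print(vocab)
print(predict(model, "love this great phone"))
```

Shrinking `k` or the number of training tweets and re-measuring accuracy, model size, and prediction time is one way to explore the trade-off the abstract describes; the same reduced vocabulary could be fed to an SVM or MaxEnt classifier for comparison.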
