Efficient sentiment classification of Twitter feeds
Nicholas Chamansingh, Patrick Hosein
2016 IEEE International Conference on Knowledge Engineering and Applications (ICKEA), published 2016-09-01
DOI: 10.1109/ICKEA.2016.7802996 (https://doi.org/10.1109/ICKEA.2016.7802996)
Citations: 6
Abstract
Sentiment analysis uses natural language processing together with statistics and machine learning methods to identify, extract, and characterize sentiment elements in a body of text. Micro-blog platforms such as Twitter allow millions of users to share real-time comments and opinions on various topics. This research presents an experiment to determine an efficient sentiment classifier for real-time Twitter feeds. Naive Bayes, Support Vector Machine (SVM), and Maximum Entropy (MaxEnt) classification methods were compared. For each approach we used the same pre-processing and feature selection methods. Chi-Square feature selection was used to determine the smallest feature set and training data size needed for a classifier with a given accuracy level, storage requirements, and classification time. Results show that, when compared to previous work, a significant reduction in data input and processing can be achieved while maintaining an acceptable level of accuracy.
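
The abstract names Chi-Square feature selection but does not describe how it was implemented. The following is a minimal sketch of the general technique, not the authors' code. It assumes binary term presence, a two-class (positive versus other) labelling, and whitespace-tokenized text; the function names `chi_square_scores` and `select_top_k` and the toy tweets are hypothetical and used only for illustration.

```python
from collections import Counter


def chi_square_scores(docs, labels, positive_label):
    """Score each term with the chi-square statistic of a 2x2 table:
    term present/absent versus document inside/outside positive_label.
    Assumption: binary presence per document, not term frequency."""
    n = len(docs)
    n_pos = sum(1 for y in labels if y == positive_label)
    n_neg = n - n_pos

    # Count, per class, how many documents contain each term.
    present_pos = Counter()
    present_neg = Counter()
    for tokens, y in zip(docs, labels):
        target = present_pos if y == positive_label else present_neg
        target.update(set(tokens))

    scores = {}
    for term in set(present_pos) | set(present_neg):
        a = present_pos[term]  # positive docs containing the term
        b = present_neg[term]  # other docs containing the term
        c = n_pos - a          # positive docs without the term
        d = n_neg - b          # other docs without the term
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        # A degenerate table (e.g. term in every document) carries no signal.
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_top_k(scores, k):
    """Return the k highest-scoring terms, ties broken alphabetically."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Hypothetical toy corpus, for illustration only.
docs = [
    "love this phone".split(),
    "love the new update".split(),
    "hate this phone".split(),
    "hate the slow update".split(),
]
labels = ["pos", "pos", "neg", "neg"]

scores = chi_square_scores(docs, labels, "pos")
top = select_top_k(scores, 2)
print(top)
```

Shrinking `k` reduces both the stored vocabulary and the per-tweet work at classification time, which is the trade-off against accuracy that the abstract describes. The selected terms would then feed whichever classifier (Naive Bayes, SVM, or MaxEnt) is being evaluated.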