Authors: Nicholas Chamansingh, Patrick Hosein
Published in: 2016 IEEE International Conference on Knowledge Engineering and Applications (ICKEA), 2016-09-01
DOI: 10.1109/ICKEA.2016.7802996 (https://doi.org/10.1109/ICKEA.2016.7802996)
Efficient sentiment classification of Twitter feeds
Sentiment Analysis encompasses the use of Natural Language Processing together with statistics and machine learning methods for the identification, extraction and characterization of sentiment elements from a body of text. Micro-blog platforms, such as Twitter, allow millions of users to share real-time comments and opinions on various topics. This research presents an experiment to determine an efficient sentiment classifier for real-time Twitter feeds. Naive Bayes, Support Vector Machine (SVM) and Maximum Entropy (MaxEnt) classification methods were compared. For each approach we used the same pre-processing and feature selection methods. Chi-Square feature selection was used to determine the smallest feature set and training data size needed for a classifier with a given accuracy level, storage requirements and classification time. Results show that, when compared to previous work, a significant reduction in data input and processing can be achieved while maintaining an acceptable level of accuracy.
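
The abstract does not say how the Chi-Square scores were computed, so the following is only a minimal sketch of one common formulation: the statistic of a 2x2 contingency table relating a term's presence in a tweet to a binary sentiment label. The function names (`chi_square_scores`, `select_top_k`) and the six toy tweets are hypothetical and invented for illustration; they are not the authors' code or data.

```python
from collections import Counter

def chi_square_scores(docs, labels):
    """Chi-square statistic of each term's presence against a binary label.

    docs   : list of token lists (one per tweet)
    labels : list of 0/1 sentiment labels (1 = positive), assumed binary
    """
    n = len(docs)
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = n - n_pos
    pos_df, neg_df = Counter(), Counter()  # document frequencies per class
    for tokens, y in zip(docs, labels):
        for term in set(tokens):           # presence, not raw counts
            (pos_df if y == 1 else neg_df)[term] += 1

    scores = {}
    for term in set(pos_df) | set(neg_df):
        a = pos_df[term]   # positive tweets containing the term
        b = neg_df[term]   # negative tweets containing the term
        c = n_pos - a      # positive tweets without the term
        d = n_neg - b      # negative tweets without the term
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores

def select_top_k(scores, k):
    """Return the k highest-scoring terms (ties broken alphabetically)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]

# Toy data, invented for this sketch only.
tweets = [
    ("love this phone", 1),
    ("love the new update", 1),
    ("great phone love it", 1),
    ("hate this phone", 0),
    ("hate the new update", 0),
    ("terrible update hate it", 0),
]
docs = [text.split() for text, _ in tweets]
labels = [y for _, y in tweets]

scores = chi_square_scores(docs, labels)
top_terms = select_top_k(scores, 2)
print(top_terms)
```

Keeping only the top-scoring terms shrinks the feature vector passed to a classifier such as Naive Bayes, SVM or MaxEnt, which is the kind of trade-off between feature set size and accuracy that the abstract describes.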