Authors: Raghdah Elnadree, A. El-Sisi, Walid Atwa
Journal: IJCI. International Journal of Computers and Information, volume 36 1
Publication date: 2021-08-23
Publication type: Journal Article
DOI: 10.21608/ijci.2021.65578.1044 (https://doi.org/10.21608/ijci.2021.65578.1044)
Performance Investigation of Features Extraction and Classification Approaches for Sentiment Analysis Systems
Data pre-processing and feature extraction for micro-blogging data have become an active area of study in sentiment analysis. Object identification, negation expressions, sarcasm, outlines, and misspellings are the major issues faced during sentiment analysis. Data pre-processing is therefore a crucial step in a sentiment analysis system: it improves data quality and supports the extraction and classification of meaningful data. This paper presents a sentiment analysis system built for performance investigation. Several pre-processing and feature extraction techniques are applied to optimize the sentiment analysis. The system comprises three components: data pre-processing, feature extraction, and sentiment analysis. The pre-processing and feature extraction approaches enhance the performance of the sentiment analysis system. We compare different sentiment analysis approaches using a Twitter dataset about US airlines. The results show that high performance is achieved when the Word2Vec approach is combined with the XGBoost and random forest classification algorithms. The results also show that the Naive Bayes classifier has the lowest performance.
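The three components named in the abstract (data pre-processing, feature extraction, and sentiment analysis) can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' implementation. The cleaning steps, the bag-of-words features (a simple stand-in for Word2Vec), the small Naive Bayes classifier, and the toy tweets are all assumptions chosen for illustration, and every function and class name is hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def preprocess(tweet):
    """Assumed cleaning steps: lowercase, drop URLs and @mentions, keep letters only."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = re.sub(r"@\w+", " ", text)          # drop user mentions
    text = re.sub(r"[^a-z\s]", " ", text)      # drop digits and punctuation ('#' goes, the tag word stays)
    return text.split()


def extract_features(tokens):
    """Bag-of-words counts; a simple stand-in for the Word2Vec features in the paper."""
    return Counter(tokens)


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, feature_list, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for feats, label in zip(feature_list, labels):
            self.word_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, feats):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            n_words = sum(self.word_counts[label].values())
            for word, count in feats.items():
                p = (self.word_counts[label][word] + 1) / (n_words + len(self.vocab))
                score += count * math.log(p)       # smoothed log likelihood
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up toy tweets, not taken from the US airlines dataset.
train = [
    ("@airline thanks for the great flight!", "positive"),
    ("Loved the friendly crew http://example.com", "positive"),
    ("@airline my flight was delayed again, terrible", "negative"),
    ("Lost luggage and rude staff", "negative"),
]
features = [extract_features(preprocess(text)) for text, _ in train]
model = NaiveBayes().fit(features, [label for _, label in train])
prediction = model.predict(extract_features(preprocess("great crew, thanks!")))
```

Keeping the three stages as separate functions mirrors the comparison described in the abstract: a different feature extractor or classifier can be swapped in without touching the other stages.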