K. Sasikanth, K. Samatha, N. Deshai, B. Sekhar, S. Venkatramana
International Journal of Industrial Engineering and Production Research, pp. 343-350. Published 2020-09-10. DOI: https://doi.org/10.22068/IJIEPR.31.3.343
Effective Sentiment Analysis on Twitter with Apache Spark
Today’s interconnected world generates a huge amount of digital data as millions of users share their opinions and feelings on various topics every day through social media, microblogging sites, and review websites. Applying sentiment analysis to Twitter data is regarded as a considerable challenge, particularly for organizations and companies that seek to know customers’ feelings and opinions about their products and services. The nature, variety, and enormous size of the data make it useful for applications ranging from decision making to product assessment. Tweets convey the sentiment of their author on a specific topic, and companies survey millions of tweets on given subjects to evaluate actual opinions and understand customers’ feelings. This paper aims to collect, recognize, filter, reduce, and analyze such opinions, emotions, and feelings about different products or services and to categorize them as positive, negative, or neutral, since such categorization can help improve sales of a company's products, films, etc. The Naïve Bayes classifier is the most widely used machine learning method for mining sentiment from large quantities of data, such as Twitter and other popular social networks, owing to its high accuracy. This study performs sentiment polarity analysis on Twitter data in a distributed environment, Apache Spark.
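The classification step named in the abstract can be illustrated with a minimal single-machine sketch of a multinomial Naïve Bayes classifier with Laplace smoothing. This is not the authors' Spark pipeline: the function names (`tokenize`, `train`, `predict`), the smoothing value `alpha`, and the toy tweets in `TOY_TWEETS` are all assumptions made up for illustration, and a distributed version would normally rely on Spark's own machine learning library rather than hand-written code.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy data, invented for illustration only (not from the paper).
TOY_TWEETS = [
    ("I love this phone, it is great", "positive"),
    ("great battery and wonderful screen", "positive"),
    ("I hate this phone, it is terrible", "negative"),
    ("terrible battery and awful screen", "negative"),
    ("the phone ships on monday", "neutral"),
    ("the store opens at nine", "neutral"),
]


def tokenize(text):
    """Lowercase a tweet and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(examples, alpha=1.0):
    """Count documents per label and words per label from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for token in tokenize(text):
            word_counts[label][token] += 1
            vocab.add(token)
    return {
        "label_counts": label_counts,
        "word_counts": word_counts,
        "vocab": vocab,
        "alpha": alpha,  # Laplace smoothing constant (assumed value)
    }


def predict(model, text):
    """Return the label with the highest log posterior for the given text."""
    total_docs = sum(model["label_counts"].values())
    vocab_size = len(model["vocab"])
    alpha = model["alpha"]
    best_label, best_score = None, -math.inf
    for label, n_docs in model["label_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        score = math.log(n_docs / total_docs)  # log prior
        for token in tokenize(text):
            if token not in model["vocab"]:
                continue  # words never seen in training carry no evidence
            # Smoothed log likelihood of the token given the label.
            score += math.log(
                (counts[token] + alpha) / (total_words + alpha * vocab_size)
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TOY_TWEETS)
print(predict(model, "what a great wonderful phone"))  # prints "positive"
```

Working in log space avoids numerical underflow when many word probabilities are multiplied, and the smoothing term keeps a word that is absent from one class from driving that class's probability to zero.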