Effective Sentiment Analysis on Twitter with Apache Spark

Q3 Decision Sciences

International Journal of Industrial Engineering and Production Research. Pub Date: 2020-09-10. DOI: 10.22068/IJIEPR.31.3.343

K. Sasikanth, K. Samatha, N. Deshai, B. Sekhar, S. Venkatramana


Citations: 0

Abstract



Today’s interconnected world generates a huge amount of digital data: every day, millions of users share their opinions and feelings on various topics through social media, microblogging sites, and review websites. Applying sentiment analysis to Twitter data is regarded as a considerable problem, particularly for organizations and companies that seek to know customers’ feelings and opinions about their products and services. The nature, variety, and enormous size of the data make it practical for applications ranging from choice and decision making to product assessment. Tweets convey the sentiment of a tweeter on a specific topic, and companies survey millions of tweets on given subjects to evaluate actual opinions and understand customers’ feelings. This paper aims to collect, recognize, filter, reduce, and analyze such opinions, emotions, and feelings about different products or services and to categorize them as positive, negative, or neutral, because such categorization helps improve sales growth of a company's products, films, etc. The Naïve Bayes classifier is the machine learning method mainly used for mining sentiment from large quantities of data, such as Twitter and other popular social networks, owing to its higher accuracy rates. This study performs sentiment polarity analysis on Twitter data in a distributed environment, Apache Spark.
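The abstract names Naïve Bayes as the classifier and positive, negative, and neutral as the target classes, but gives no implementation details. The following is a minimal single-machine sketch of multinomial Naïve Bayes with add-one smoothing, written in plain Python so it runs without a cluster. It is not the authors' pipeline: the `tokenize` helper, the `NaiveBayesSentiment` class, and the toy tweets are all assumptions made for illustration. In a distributed setting, the same counting and scoring would typically be delegated to the Naive Bayes estimator that ships with Spark MLlib.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a tweet and split it into word tokens (hypothetical, minimal cleaning)."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesSentiment:
    """Illustrative multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, tweets, labels):
        # Number of training tweets per class, used for the class priors.
        self.class_counts = Counter(labels)
        # Per-class word frequencies, used for the word likelihoods.
        self.word_counts = defaultdict(Counter)
        for text, label in zip(tweets, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Start from the log prior, then add smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, made-up training data; not taken from the paper.
train_tweets = [
    "I love this phone, great battery",
    "great movie, really love it",
    "terrible service, I hate waiting",
    "I hate this awful update",
    "the package arrived on tuesday",
    "the store opens on monday",
]
train_labels = ["positive", "positive", "negative", "negative", "neutral", "neutral"]

model = NaiveBayesSentiment().fit(train_tweets, train_labels)
```

With this toy data, `model.predict("love the great battery")` returns `"positive"` and `model.predict("hate the awful service")` returns `"negative"`. Working in log space avoids numeric underflow when many word probabilities are multiplied, and the add-one smoothing keeps a word that is unseen in one class from zeroing out that class's score.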


Source journal: International Journal of Industrial Engineering and Production Research (Engineering: Industrial and Manufacturing Engineering)

CiteScore: 1.60

Self-citation rate: 0.00%

Review time: 10 weeks