Sentiment Analysis Using Out of Core Learning

2019 International Conference on Electrical, Computer and Communication Engineering (ECCE) Pub Date : 2019-02-01 DOI:10.1109/ECACE.2019.8679298

Mahmudul Hasan, Ishrak Islam, K. Hasan

{"title":"Sentiment Analysis Using Out of Core Learning","authors":"Mahmudul Hasan, Ishrak Islam, K. Hasan","doi":"10.1109/ECACE.2019.8679298","DOIUrl":null,"url":null,"abstract":"Text sentiment detection for a particular language other than English is one of the challenging tasks presently. The reasons are; it needs a large dataset, language has no specific structure, one word has a different meaning, and it is hard for even human to understand the connotation of particular words. There exists several proposed architecture for detecting emotions in the Bengali language using machine learning and deep learning approaches, but they are not accurate enough to predict the perfect emotion of the sentence. And there is still no standalone architecture is available that can extract the sentiments hidden inside of a sentence in different languages. In this paper, we are proposing an abstract model that can enable sentiment analysis without any restriction of using a fixed language somewhat applicable to any language. With the use of natural language processing, we have extracted the features, and these features are then fed to different machine learning models for classification. As our main concern was to build up a general model, this model is confined to binary classification, i.e., positive and negative. Apart from this, In our system architecture, we have implemented stochastic gradient descent for optimization. So our model can be called out of core learning model where the model can be updated when new user data is inserted without training the whole model. For the evaluation of the performance of our model, we have trained the estimators against Bangla translated IMDB review dataset and calculated different evaluation metrics for our estimators. 
The dataset is translated into Bangla using google translator.","PeriodicalId":226060,"journal":{"name":"2019 International Conference on Electrical, Computer and Communication Engineering (ECCE)","volume":"48 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 International Conference on Electrical, Computer and Communication Engineering (ECCE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ECACE.2019.8679298","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Citations: 9

Abstract

Detecting sentiment in text written in a language other than English remains a challenging task. The reasons are that it requires a large dataset, language has no fixed structure, a single word can have different meanings, and even humans find it hard to understand the connotation of particular words. Several architectures have been proposed for detecting emotions in the Bengali language using machine learning and deep learning approaches, but they are not accurate enough to reliably predict the emotion of a sentence. There is also still no standalone architecture that can extract the sentiment hidden in a sentence across different languages. In this paper, we propose an abstract model that enables sentiment analysis without being restricted to a fixed language and is, to some extent, applicable to any language. Using natural language processing, we extract features, which are then fed to different machine learning models for classification. Because our main concern was to build a general model, the model is confined to binary classification, i.e., positive and negative. In addition, our system architecture uses stochastic gradient descent for optimization. Our model can therefore be called an out-of-core learning model: it can be updated when new user data is inserted, without retraining the whole model. To evaluate its performance, we trained the estimators on a Bangla-translated IMDB review dataset and calculated different evaluation metrics for them. The dataset was translated into Bangla using Google Translate.
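The out-of-core idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a logistic-regression classifier trained by stochastic gradient descent, whitespace tokenization, and a hashing trick for feature extraction (none of which the abstract specifies). The class name `OnlineLogisticRegression`, the helper `featurize`, the hash-space size, and the toy English reviews are all hypothetical and chosen only for illustration.

```python
import math
import zlib

N_FEATURES = 2 ** 16  # assumed size of the hashed feature space


def featurize(text):
    """Map a text to a sparse {feature_index: count} dict.

    Tokens are split on whitespace (an assumption that avoids any
    language-specific tokenizer) and hashed to a fixed index range,
    so no vocabulary has to be held in memory.
    """
    counts = {}
    for token in text.lower().split():
        idx = zlib.crc32(token.encode("utf-8")) % N_FEATURES
        counts[idx] = counts.get(idx, 0) + 1
    return counts


class OnlineLogisticRegression:
    """Binary classifier updated one example at a time with SGD."""

    def __init__(self, n_features=N_FEATURES, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, features):
        """Return the estimated probability of the positive class."""
        z = self.bias + sum(self.weights[i] * v for i, v in features.items())
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def partial_fit(self, texts, labels):
        """Update the existing weights with one mini-batch.

        Earlier batches are never revisited, which is what allows
        training on data that does not fit in memory.
        """
        for text, label in zip(texts, labels):
            features = featurize(text)
            error = self.predict_proba(features) - label  # log-loss gradient
            for i, v in features.items():
                self.weights[i] -= self.learning_rate * error * v
            self.bias -= self.learning_rate * error

    def predict(self, text):
        """Return 1 (positive) or 0 (negative)."""
        return 1 if self.predict_proba(featurize(text)) >= 0.5 else 0


# Hypothetical toy data: 1 = positive review, 0 = negative review.
model = OnlineLogisticRegression()

# Initial mini-batch, as if streamed from disk.
model.partial_fit(
    ["great movie loved it", "terrible movie hated it"], [1, 0]
)

# Later, new user data arrives; the model is updated in place
# rather than retrained from scratch.
model.partial_fit(
    ["loved the great acting", "hated the terrible plot"], [1, 0]
)

print(model.predict("great acting loved"))
print(model.predict("terrible plot hated"))
```

Because each call to `partial_fit` only adjusts the current weights, memory use depends on the size of the hashed feature space and one mini-batch, not on the total number of reviews seen.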

