Improving sentiment analysis using text network features within different machine learning algorithms

Bulletin of Electrical Engineering and Informatics Pub Date : 2024-02-01 DOI:10.11591/eei.v13i1.5576

A. M. Alnasrawi, A. M. Alzubaidi, Ahmed Abdulhadi Al-Moadhen



Abstract

Sentiment analysis is challenging because natural language is inherently subjective and social networks are full of unstandardized dialects. The existing literature pays little dedicated attention to network representation learning for sentiment classification. This paper addresses that gap by investigating ten machine learning algorithms, including support vector machine (SVM), random forest (RF), logistic regression (LR), and Naive Bayes (NB). Our approach combines text network analysis with sentiment analysis. We first apply text preprocessing techniques and convert a text corpus into a text network using word co-occurrence. We then use network analysis techniques to extract features based on network topology and node attributes. These network-derived features serve as inputs for sentiment prediction on Yelp reviews. By combining diverse text network features with various machine learning algorithms, we achieve significant improvements in sentiment classification performance. Our evaluation demonstrates an improved area under the curve (AUC) of 83% on the Yelp reviews corpus, underscoring the efficacy of integrating network features to enhance sentiment classifiers. The results highlight the role of network representation in sentiment analysis and the prospect of harnessing network features for sentiment classification tasks.
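The pipeline described in the abstract (preprocess, build a word co-occurrence network, extract topology features, feed them to a classifier) can be illustrated with a minimal sketch. The abstract does not specify the window size, the exact features, or the preprocessing, so everything below is an assumption made for illustration: whitespace tokenization stands in for real preprocessing, a forward window of two tokens defines co-occurrence, and degree and weighted degree stand in for the paper's network features. The function names are hypothetical and not taken from the paper.

```python
from collections import defaultdict


def build_cooccurrence_network(docs, window=2):
    """Return {word: {neighbor: weight}} for an undirected co-occurrence graph.

    Two words are linked when one appears within the next `window` tokens
    of the other in the same document; the weight counts how often that
    happens. The window size is an assumption, not a value from the paper.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for doc in docs:
        tokens = doc.lower().split()  # placeholder for real preprocessing
        for i, word in enumerate(tokens):
            for other in tokens[i + 1 : i + 1 + window]:
                if other != word:  # skip self-loops
                    graph[word][other] += 1
                    graph[other][word] += 1
    return graph


def node_features(graph):
    """Per-word topology features: degree and weighted degree (strength)."""
    return {
        word: {"degree": len(nbrs), "strength": sum(nbrs.values())}
        for word, nbrs in graph.items()
    }


def document_features(doc, feats):
    """Average the node features over the words of one document."""
    tokens = [t for t in doc.lower().split() if t in feats]
    if not tokens:
        return [0.0, 0.0]
    n = len(tokens)
    return [
        sum(feats[t]["degree"] for t in tokens) / n,
        sum(feats[t]["strength"] for t in tokens) / n,
    ]


# Toy corpus standing in for Yelp reviews (illustrative data only).
docs = ["great food great service", "bad food slow service"]
graph = build_cooccurrence_network(docs, window=2)
feats = node_features(graph)
vectors = [document_features(d, feats) for d in docs]
```

Each entry of `vectors` is a fixed-length feature vector for one review. Vectors like these could be passed, alone or alongside conventional text features, to any of the classifiers the abstract names; which combination the authors actually used is not stated here.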
