Topic sentiment analysis based on deep neural network using document embedding technique.

IF 2.7; Zone 3, Computer Science; Q2, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

Journal of Supercomputing, Pub Date: 2023-06-05, DOI: 10.1007/s11227-023-05423-9

Azam Seilsepour, Reza Ravanmehr, Ramin Nassiri

Pages: 1-39. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10241384/pdf/

Citations: 0

Abstract

Sentiment Analysis (SA) is a domain- or topic-dependent task because polarity terms convey different sentiments in different domains. Hence, machine learning models trained on a specific domain cannot be employed in other domains, and existing domain-independent lexicons cannot correctly recognize the polarity of domain-specific terms. Conventional Topic Sentiment Analysis (TSA) approaches perform Topic Modeling (TM) and SA sequentially, classifying sentiments with models previously trained on irrelevant datasets, which cannot provide acceptable accuracy. Other researchers perform TM and SA simultaneously using topic-sentiment joint models, which require a list of seeds and their sentiments from widely used domain-independent lexicons; as a result, these methods also cannot find the polarity of domain-specific terms correctly. This paper proposes a novel supervised hybrid TSA approach, called Embedding Topic Sentiment Analysis using Deep Neural Networks (ETSANet), that extracts the semantic relationships between the hidden topics and the training dataset using the Semantically Topic-Related Documents Finder (STRDF). STRDF discovers the training documents that share a topic's context, based on the semantic relationships between the training dataset and the Semantic Topic Vector, a newly introduced concept that encompasses the semantic aspects of a topic. A hybrid CNN-GRU model is then trained on these semantically topic-related documents. Moreover, a hybrid metaheuristic method combining Grey Wolf Optimization and the Whale Optimization Algorithm is employed to fine-tune the hyperparameters of the CNN-GRU network. The evaluation results demonstrate that ETSANet increases the accuracy of the state-of-the-art methods by 1.92%.
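
The abstract does not say how STRDF measures semantic relatedness, so the following is only a minimal sketch of the general idea of selecting training documents that lie close to a topic in embedding space. It assumes (these are illustrative choices, not the paper's method) that the topic vector is the mean of the topic's term embeddings and that relatedness is cosine similarity above a threshold. The function names, the threshold, and the toy vectors are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def topic_related_documents(topic_term_vectors, doc_vectors, threshold=0.5):
    """Return indices of documents whose embedding is close to the topic vector.

    ASSUMPTION: the topic vector is the mean of the topic's term embeddings and
    relatedness is cosine similarity >= threshold. The abstract specifies neither.
    """
    topic_vector = mean_vector(topic_term_vectors)
    return [
        index
        for index, doc in enumerate(doc_vectors)
        if cosine(topic_vector, doc) >= threshold
    ]


# Toy 3-dimensional embeddings, made up for illustration only.
topic_terms = [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]]
documents = [
    [0.9, 0.1, 0.0],  # points in the same direction as the topic
    [0.0, 0.0, 1.0],  # orthogonal to the topic
    [0.7, 0.3, 0.1],  # close to the topic
]
selected = topic_related_documents(topic_terms, documents, threshold=0.8)
print(selected)
```

In a pipeline like the one the abstract outlines, the selected subset would then serve as training data for the sentiment classifier; real document embeddings would come from a document embedding model rather than hand-written vectors.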

Source journal

Journal of Supercomputing (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 6.30

Self-citation rate: 12.10%

Publications: 734

Review time: 13 months

About the journal: The Journal of Supercomputing publishes papers on the technology, architecture and systems, algorithms, languages and programs, performance measures and methods, and applications of all aspects of Supercomputing. Tutorial and survey papers are intended for workers and students in the fields associated with and employing advanced computer systems. The journal also publishes letters to the editor, especially in areas relating to policy, succinct statements of paradoxes, intuitively puzzling results, partial results and real needs. Published theoretical and practical papers are advanced, in-depth treatments describing new developments and new ideas. Each includes an introduction summarizing prior, directly pertinent work that is useful for the reader to understand, in order to appreciate the advances being described.