Sentiment Classification Model of Online Reviews Based on Word Features and Bi-LSTM

Jingxuan Hu
Published in: 2022 IEEE 2nd International Conference on Data Science and Computer Application (ICDSCA)
Publication date: 2022-10-28
DOI: 10.1109/ICDSCA56264.2022.9988320 (https://doi.org/10.1109/ICDSCA56264.2022.9988320)
Citations: 1

Abstract

With the rapid development of e-commerce, large volumes of purchase and review records are being generated. Sentiment classification of product reviews is valuable for automatically monitoring negative reviews and for helping merchants analyze consumer feedback. The Bi-LSTM model is currently a representative approach to Chinese text sentiment classification because it captures semantic information across the word sequence. However, it does not process lexical information, so its word vectors cannot emphasize sentiment words. This paper therefore proposes a sentiment classification model for Chinese product reviews based on word features and Bi-LSTM. The model first trains word vectors with Word2vec's CBOW model, then computes the information content of words with an improved information gain algorithm that incorporates word distribution and sentiment weights, and finally applies a Naive Bayes model to classify the network's classification results a second time. This addresses the basic Bi-LSTM model's lack of lexical understanding. Experiments show that the new model outperforms the basic Bi-LSTM model and captures the sentiment of reviews more accurately, reaching an accuracy of 89.03% on the test set.
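The abstract does not give the formula for the improved information gain algorithm, so the sketch below shows only the standard information gain of a word with respect to sentiment labels, which is the quantity the paper's variant presumably builds on. The function names, the toy reviews, and the binary "pos"/"neg" labels are illustrative assumptions and are not taken from the paper; the word distribution and sentiment weighting terms are deliberately left out.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy in bits of a label distribution given as counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(docs, labels, term):
    """Standard (unweighted) information gain of `term` for the labels.

    docs   : list of token lists (reviews that are already word-segmented)
    labels : list of class labels, one per review
    term   : the word whose informativeness is measured

    NOTE: this is the textbook definition, not the paper's improved
    algorithm, whose weighting scheme is not described in the abstract.
    """
    n = len(docs)
    if n == 0:
        return 0.0

    with_term = Counter()     # label counts among reviews containing the term
    without_term = Counter()  # label counts among reviews lacking the term
    for tokens, label in zip(docs, labels):
        if term in tokens:
            with_term[label] += 1
        else:
            without_term[label] += 1

    n_with = sum(with_term.values())
    n_without = n - n_with

    # H(C): entropy of the labels before observing the term.
    h_all = entropy(list(Counter(labels).values()))
    # H(C | term): entropy after splitting on presence/absence of the term.
    h_cond = (n_with / n) * entropy(list(with_term.values())) + (
        n_without / n
    ) * entropy(list(without_term.values()))
    return h_all - h_cond


# Hypothetical toy corpus: two positive and two negative reviews.
TOY_DOCS = [["质量", "好"], ["物流", "好"], ["质量", "差"], ["物流", "差"]]
TOY_LABELS = ["pos", "pos", "neg", "neg"]

for word in ("好", "质量"):
    print(word, information_gain(TOY_DOCS, TOY_LABELS, word))
```

On this toy corpus the sentiment word "好" separates the two classes perfectly, while the topic word "质量" appears equally often in both, so it scores lower. That gap is what lets an information gain score highlight sentiment words that plain word vectors treat like any other word.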