A BERT-based Text Sentiment Classification Algorithm through Web Data

Ganhua Li, Bo Kong, Jiancheng Li, Henghai Fan, Jian Zhang, Yuan An, Zhenglei Yang, Shengrong Danz, Jiancun Fan
Published in: 2022 International Conference on Computer Engineering and Artificial Intelligence (ICCEAI), 2022-07-01
DOI: 10.1109/ICCEAI55464.2022.00105 (https://doi.org/10.1109/ICCEAI55464.2022.00105)
Citations: 1

Abstract

To analyze the sentiment tendency of public opinion, this paper studies text sentiment classification on web data. The BERT (Bidirectional Encoder Representations from Transformers) model replaces the commonly used word2vec model as the text vectorization tool; it has stronger semantic representation capabilities and can handle polysemous words. For the multi-label classification of reviews, the BR (Binary Relevance) algorithm transforms the problem into multiple binary classification problems, which is a direct and efficient way to process multi-label data. A BiLSTM-Attention model is designed that combines a bidirectional long short-term memory network with an attention mechanism to further extract text features. Multiple sets of comparative experiments verify the effectiveness of the BiLSTM-Attention model through performance evaluation. To further improve the model's performance, the problem of the unbalanced data set is addressed by adjusting the loss function and various parameters, which achieves a better classification effect.
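The Binary Relevance step and the loss adjustment mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the loss function was adjusted, so the positive-class weighting below is only one common option, assumed here for illustration. The function names `binary_relevance_split` and `weighted_bce`, the toy reviews, and the labels are all hypothetical.

```python
import math
from typing import Dict, List, Sequence, Set, Tuple


def binary_relevance_split(
    texts: Sequence[str],
    label_sets: Sequence[Set[str]],
    labels: Sequence[str],
) -> Dict[str, List[Tuple[str, int]]]:
    """Turn one multi-label dataset into one binary dataset per label.

    For each label, every text becomes a (text, 0/1) pair, where 1 means
    the label was assigned to that text.
    """
    return {
        label: [
            (text, int(label in assigned))
            for text, assigned in zip(texts, label_sets)
        ]
        for label in labels
    }


def weighted_bce(y_true: int, p: float, pos_weight: float) -> float:
    """Binary cross-entropy with the positive term scaled by pos_weight.

    Assumption: this weighting is one possible way to counter class
    imbalance; the paper's actual loss adjustment is not specified.
    """
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(
        pos_weight * y_true * math.log(p)
        + (1 - y_true) * math.log(1.0 - p)
    )


# Hypothetical toy reviews and aspect labels (not taken from the paper).
texts = ["great food, slow service", "friendly staff", "cold food"]
label_sets = [{"food", "service"}, {"service"}, {"food"}]
labels = ["food", "service", "price"]

per_label = binary_relevance_split(texts, label_sets, labels)
```

Each entry of `per_label` can then be used to train an independent binary classifier, for example on top of BERT features. A `pos_weight` above 1 makes a missed positive cost more than a missed negative, which is one way to compensate for a label that rarely occurs.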