ParsBERT Post-Training for Sentiment Analysis of Tweets Concerning Stock Market

Mohammadjalal Pouromid, Arman Yekkehkhani, M. A. Oskoei, Amin Aminimehr
Published in: 2021 26th International Computer Conference, Computer Society of Iran (CSICC), 2021-03-03
DOI: 10.1109/CSICC52343.2021.9420569 (https://doi.org/10.1109/CSICC52343.2021.9420569)
Citations: 3

Abstract

Social media has become a playground where users share their ideas freely. Analyzing these data is of special interest to authorities and consulting firms, which seek to choose the right policies based on the insight acquired. Hence, sentiment analysis of data spread on social media has gained significant importance. There are two major approaches to sentiment analysis: lexicon-based methods and supervised methods. Among supervised methods, deep models have proven a better fit for the sentiment analysis task because they are domain-independent and handle large volumes of data effectively. In particular, BERT's state-of-the-art performance on various natural language processing tasks encouraged us to use this network architecture for sentiment analysis. In this research, over 12,000 Persian tweets containing the stock market keyword were crawled from Twitter and manually labeled in three categories: positive, neutral, and negative. A pre-trained ParsBERT model was then fine-tuned on these data. Our model is evaluated on the test dataset and compared with a lexicon-based counterpart that uses Polyglot as its lexicon. The proposed model achieves an accuracy of 82 percent, surpassing its lexicon-based contender.
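
To make the comparison concrete, the following is a minimal sketch of how a lexicon-based three-class baseline and an accuracy score might be computed. It is not the authors' implementation: the toy lexicon, its English placeholder words, and the function names `lexicon_label` and `accuracy` are all assumptions introduced for illustration. The paper's baseline uses the Polyglot lexicon on Persian text, which is not reproduced here.

```python
from typing import Dict, List

# Hypothetical toy polarity lexicon (+1 positive, -1 negative).
# Placeholder entries only; the paper's baseline relies on Polyglot's lexicon.
TOY_LEXICON: Dict[str, int] = {
    "gain": 1, "profit": 1, "growth": 1,
    "loss": -1, "crash": -1, "drop": -1,
}


def lexicon_label(text: str, lexicon: Dict[str, int] = TOY_LEXICON) -> str:
    """Sum the polarity of each whitespace-separated token and map the
    total to one of the three classes used in the abstract."""
    score = sum(lexicon.get(token, 0) for token in text.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def accuracy(predicted: List[str], gold: List[str]) -> float:
    """Fraction of predictions that match the manually assigned labels."""
    if not gold or len(predicted) != len(gold):
        raise ValueError("predicted and gold must be non-empty and equal in length")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


if __name__ == "__main__":
    # Made-up example sentences and labels, not data from the paper.
    texts = ["profit and growth today", "market crash", "the market opened"]
    gold = ["positive", "negative", "neutral"]
    predicted = [lexicon_label(t) for t in texts]
    print(predicted, accuracy(predicted, gold))
```

The same `accuracy` function would apply unchanged to the predictions of a fine-tuned classifier, which is what makes the two approaches directly comparable on a shared test set.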