Handling data imbalance using CNN and LSTM in financial news sentiment analysis

Moldir Omarkhan, Gulnur Kissymova, Iskander Akhmetov
Published in: 2021 16th International Conference on Electronics Computer and Computation (ICECCO)
Publication date: 2021-11-25
DOI: 10.1109/icecco53203.2021.9663802 (https://doi.org/10.1109/icecco53203.2021.9663802)
Citations: 4

Abstract

With the rapid development of natural language processing (NLP), the financial sector faces the demand of analyzing large quantities of financial text data. Several recent studies have focused on Financial Sentiment Analysis (FSA). In this article, we work on sentiment analysis, one of the most popular areas of NLP, and apply it to news in the financial market, since news can sometimes have a very strong impact on the stock market. We use the dataset of P. Malo [18], which contains 5,000 sentences of finance news labeled with sentiment. Using machine learning and deep learning algorithms, we develop a comprehensive comparative study of financial news sentiment analysis that includes data sources. We compare the classification accuracy of SVM, KNN, Decision Tree, Random Forest, XGBoost, CNN, and LSTM on this task. Handling data imbalance, one of our motivations for future work, is also discussed and applied to the algorithms. The experiments demonstrate that, in terms of accuracy, the CNN consistently outperforms the other models in sentiment analysis of financial news.
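The abstract does not say which technique was used to handle data imbalance. As one common possibility, the following is a minimal sketch of random oversampling, in which minority-class sentences are duplicated until every sentiment class is as large as the largest one. The function name `random_oversample`, the toy sentences, and the label names are all hypothetical and are not taken from the paper or its dataset.

```python
import random
from collections import Counter


def random_oversample(texts, labels, seed=0):
    """Duplicate minority-class samples until all classes match the largest class.

    Hypothetical helper for illustration; not the authors' method.
    """
    rng = random.Random(seed)

    # Group the sentences by their sentiment label.
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)

    # Every class is grown to the size of the largest class.
    target = max(len(items) for items in by_class.values())

    out_texts, out_labels = [], []
    for label, items in by_class.items():
        # Sample with replacement to fill the gap for smaller classes.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Made-up example sentences with an imbalanced label distribution.
texts = [
    "The company published its quarterly report.",
    "The board meeting is scheduled for Monday.",
    "The firm operates in three countries.",
    "Shares were unchanged at the close.",
    "Operating profit rose compared with last year.",
    "Net sales increased during the quarter.",
    "The company issued a profit warning.",
]
labels = ["neutral", "neutral", "neutral", "neutral",
          "positive", "positive", "negative"]

balanced_texts, balanced_labels = random_oversample(texts, labels)
counts = Counter(balanced_labels)
print(counts)
```

If an approach like this is used, it is usually applied to the training split only, so that duplicated sentences do not also appear in the test data and inflate the reported accuracy.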