Sentiment Analysis of Financial News Data using TF-IDF and Machine Learning Algorithms

Gideon Popoola, Khadijat-Kuburat Abdullah, Gerard Shu Fuhnwi, Janet O. Agbaje
Published in: 2024 IEEE 3rd International Conference on AI in Cybersecurity (ICAIC), pp. 1-6
Publication date: 2024-02-07
DOI: 10.1109/ICAIC60265.2024.10433843 (https://doi.org/10.1109/ICAIC60265.2024.10433843)
Citations: 0

Abstract

Blogs, online forums, comment sections, and social networking sites such as Facebook, Twitter (now known as X), and Instagram can all be called social media. The growing use of social media has made unstructured data available, which can be useful once it is cleaned, structured, and analyzed. Twitter is a popular microblogging platform where people share their opinions on any topic. Analyzing these opinions is called sentiment analysis, and it can be helpful to individuals, businesses, and government agencies. In this study, tweets related to financial news were extracted, labeled, and analyzed to capture the opinions of people around the world. This paper proposes a novel machine learning-based approach to sentiment analysis of social media data. The approach has three stages. The first stage is preprocessing, in which the tweets are cleaned and filtered. In the second stage, features are extracted using Term Frequency-Inverse Document Frequency (TF-IDF). In the third stage, the extracted features are used to make predictions with machine learning algorithms. Three models were used: a random forest classifier (RF), Naïve Bayes (NB), and k-nearest neighbors (KNN). The evaluation results show that both NB and RF outperform KNN in accuracy, precision, recall, and F1-score. The results also show an overwhelmingly positive opinion regarding financial news.
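The three stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the cleaning rules, the TF-IDF variant, the value of k, or the dataset, so everything below is an assumption made for illustration. The cleaning rules are a common choice for tweets, the IDF uses the smoothed form ln((1 + N) / (1 + df)) + 1, KNN with cosine similarity stands in for the classifier because it is the simplest of the three models to write out, and the four training tweets are invented.

```python
import math
import re
from collections import Counter


def preprocess(tweet: str) -> list[str]:
    """Stage 1 (assumed rules): lowercase, drop URLs, mentions, non-letters."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+|@\w+", " ", text)  # URLs and @mentions
    text = re.sub(r"[^a-z\s]", " ", text)           # digits, punctuation, '#'
    return text.split()


def fit_idf(docs: list[list[str]]) -> dict[str, float]:
    """Smoothed IDF per term: ln((1 + N) / (1 + df)) + 1 (assumed variant)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}


def tfidf(doc: list[str], idf: dict[str, float]) -> dict[str, float]:
    """Stage 2: L2-normalized TF-IDF vector; terms unseen in training are dropped."""
    tf = Counter(t for t in doc if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Dot product of two L2-normalized sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def knn_predict(query, train_vecs, train_labels, k=3):
    """Stage 3: majority label among the k most similar training vectors."""
    ranked = sorted(range(len(train_vecs)),
                    key=lambda i: cosine(query, train_vecs[i]), reverse=True)
    votes = Counter(train_labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


def precision_recall_f1(y_true, y_pred, positive="positive"):
    """Precision, recall, and F1 for one class treated as the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical toy data, invented for this sketch (not from the paper).
train_tweets = [
    "Stocks rally as earnings beat expectations! https://example.com",
    "Great quarter, profits surge and shares climb",
    "Markets crash as bank fears grow @trader",
    "Shares plunge after weak earnings and losses",
]
train_labels = ["positive", "positive", "negative", "negative"]

train_docs = [preprocess(t) for t in train_tweets]
idf = fit_idf(train_docs)
train_vecs = [tfidf(d, idf) for d in train_docs]


def classify(tweet: str, k: int = 3) -> str:
    """Run all three stages on one raw tweet."""
    return knn_predict(tfidf(preprocess(tweet), idf), train_vecs, train_labels, k)


print(classify("Profits surge, shares rally"))
print(classify("Bank fears grow as markets plunge"))
```

On a real labeled corpus, the same TF-IDF vectors could be fed to Naïve Bayes or random forest implementations from a machine learning library, and `precision_recall_f1` could be applied to held-out predictions to compare the models on the metrics the abstract lists.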