A User-Centric Analysis of Social Media for Stock Market Prediction

IF 2.6 4区 计算机科学 Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS
Mohamed Reda Bouadjenek, Scott Sanner, Ga Wu
{"title":"A User-Centric Analysis of Social Media for Stock Market Prediction","authors":"Mohamed Reda Bouadjenek, Scott Sanner, Ga Wu","doi":"https://dl.acm.org/doi/10.1145/3532856","DOIUrl":null,"url":null,"abstract":"<p>Social media platforms such as Twitter or StockTwits are widely used for sharing stock market opinions between investors, traders, and entrepreneurs. Empirically, previous work has shown that the content posted on these social media platforms can be leveraged to predict various aspects of stock market performance. Nonetheless, actors on these social media platforms may not always have altruistic motivations and may instead seek to influence stock trading behavior through the (potentially misleading) information they post. While a lot of previous work has sought to analyze how social media can be used to predict the stock market, there remain many questions regarding the quality of the predictions and the behavior of active users on these platforms. To this end, this article seeks to address a number of open research questions: Which social media platform is more predictive of stock performance? What posted content is actually predictive, and over what time horizon? How does stock market posting behavior vary among different users? Are all users trustworthy or do some user’s predictions consistently mislead about the true stock movement? To answer these questions, we analyzed data from Twitter and StockTwits covering almost 5 years of posted messages spanning 2015 to 2019. The results of this large-scale study provide a number of important insights among which we present the following: (i) StockTwits is a more predictive source of information than Twitter, leading us to focus our analysis on StockTwits; (ii) on StockTwits, users’ self-labeled sentiments are correlated with the stock market but are only slightly predictive in aggregate over the short-term; (iii) there are at least three clear types of temporal predictive behavior for users over a 144 days horizon: short, medium, and long term; and (iv) consistently incorrect users who are reliably wrong tend to exhibit what we conjecture to be “botlike” post content and their removal from the data tends to improve stock market predictions from self-labeled content.</p>","PeriodicalId":50940,"journal":{"name":"ACM Transactions on the Web","volume":"43 32","pages":""},"PeriodicalIF":2.6000,"publicationDate":"2023-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on the Web","FirstCategoryId":"94","ListUrlMain":"https://doi.org/https://dl.acm.org/doi/10.1145/3532856","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0

Abstract

Social media platforms such as Twitter or StockTwits are widely used for sharing stock market opinions between investors, traders, and entrepreneurs. Empirically, previous work has shown that the content posted on these social media platforms can be leveraged to predict various aspects of stock market performance. Nonetheless, actors on these social media platforms may not always have altruistic motivations and may instead seek to influence stock trading behavior through the (potentially misleading) information they post. While a lot of previous work has sought to analyze how social media can be used to predict the stock market, there remain many questions regarding the quality of the predictions and the behavior of active users on these platforms. To this end, this article seeks to address a number of open research questions: Which social media platform is more predictive of stock performance? What posted content is actually predictive, and over what time horizon? How does stock market posting behavior vary among different users? Are all users trustworthy or do some user’s predictions consistently mislead about the true stock movement? To answer these questions, we analyzed data from Twitter and StockTwits covering almost 5 years of posted messages spanning 2015 to 2019. The results of this large-scale study provide a number of important insights among which we present the following: (i) StockTwits is a more predictive source of information than Twitter, leading us to focus our analysis on StockTwits; (ii) on StockTwits, users’ self-labeled sentiments are correlated with the stock market but are only slightly predictive in aggregate over the short-term; (iii) there are at least three clear types of temporal predictive behavior for users over a 144 days horizon: short, medium, and long term; and (iv) consistently incorrect users who are reliably wrong tend to exhibit what we conjecture to be “botlike” post content and their removal from the data tends to improve stock market predictions from self-labeled content.

以用户为中心的社会媒体对股市预测的分析
像Twitter或StockTwits这样的社交媒体平台被广泛用于投资者、交易员和企业家之间分享股市观点。从经验上看,之前的研究表明,这些社交媒体平台上发布的内容可以用来预测股市表现的各个方面。尽管如此,这些社交媒体平台上的行为者可能并不总是有无私的动机,而是可能试图通过他们发布的(可能具有误导性的)信息来影响股票交易行为。虽然之前的很多工作都试图分析如何利用社交媒体来预测股市,但关于预测的质量和这些平台上活跃用户的行为,仍然存在许多问题。为此,本文试图解决一些悬而未决的研究问题:哪个社交媒体平台更能预测股票表现?哪些发布的内容实际上是预测性的,在什么时间范围内?不同用户的股市发帖行为有何不同?是所有的用户都值得信赖,还是一些用户的预测一直误导了真实的股票走势?为了回答这些问题,我们分析了Twitter和StockTwits的数据,涵盖了2015年至2019年近5年的发布信息。这项大规模研究的结果提供了许多重要的见解,其中我们提出以下几点:(i) StockTwits是比Twitter更具预测性的信息来源,这使得我们将分析重点放在StockTwits上;(ii)在StockTwits上,用户自我标记的情绪与股市相关,但在短期内总体上只有轻微的预测性;(iii)用户在144天内至少有三种明确的时间预测行为:短期、中期和长期;(iv)一贯错误的用户往往会表现出我们推测的“机器人式”帖子内容,将他们从数据中删除往往会改善自标签内容对股市的预测。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
ACM Transactions on the Web
ACM Transactions on the Web 工程技术-计算机:软件工程
CiteScore
4.90
自引率
0.00%
发文量
26
审稿时长
7.5 months
期刊介绍: Transactions on the Web (TWEB) is a journal publishing refereed articles reporting the results of research on Web content, applications, use, and related enabling technologies. Topics in the scope of TWEB include but are not limited to the following: Browsers and Web Interfaces; Electronic Commerce; Electronic Publishing; Hypertext and Hypermedia; Semantic Web; Web Engineering; Web Services; and Service-Oriented Computing XML. In addition, papers addressing the intersection of the following broader technologies with the Web are also in scope: Accessibility; Business Services Education; Knowledge Management and Representation; Mobility and pervasive computing; Performance and scalability; Recommender systems; Searching, Indexing, Classification, Retrieval and Querying, Data Mining and Analysis; Security and Privacy; and User Interfaces. Papers discussing specific Web technologies, applications, content generation and management and use are within scope. Also, papers describing novel applications of the web as well as papers on the underlying technologies are welcome.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信