Stock Market Trading Agent Using On-Policy Reinforcement Learning Algorithms

Shreyas Lele, Kavit Gangar, Harsh Daftary, Dewashish Dharkar
{"title":"Stock Market Trading Agent Using On-Policy Reinforcement Learning Algorithms","authors":"Shreyas Lele, Kavit Gangar, Harsh Daftary, Dewashish Dharkar","doi":"10.2139/ssrn.3582014","DOIUrl":null,"url":null,"abstract":"Stock market has been a complex system which has been difficult to predict for humans, thereby, making the trading decisions difficult to take. It will be useful for traders if there is a model agent which can learn the stock market trends and suggest trading decisions which in turn maximizes the profits. Inorder to develop this agent we have formulated the problem as a Markov Decision Process (MDP) and created a stock trading environment which serves as a platform for this agent to trade the stocks. In this paper, we introduce a Reinforcement Learning based approach to develop a trading agent which performs trading actions on the environment and learns according to the rewards in terms of profit or loss it receives. We have applied different On-policy Reinforcement Learning Algorithms such as Vanilla Policy Gradient (VPG), Trust Region Policy Optimization (TRPO) and Proximal Policy Optimization (PPO) on the environment to obtain the profits while trading stocks for 3 companies viz. Apple, Microsoft and Nike. 
The performance of these algorithms in order to maximize the profits have been evaluated and the results and conclusions have been elaborated.","PeriodicalId":241211,"journal":{"name":"CompSciRN: Artificial Intelligence (Topic)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"CompSciRN: Artificial Intelligence (Topic)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.2139/ssrn.3582014","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

Abstract

The stock market is a complex system that is difficult for humans to predict, which in turn makes trading decisions hard to take. Traders would benefit from a model agent that can learn stock market trends and suggest trading decisions that maximize profit. To develop this agent, we formulate the problem as a Markov Decision Process (MDP) and create a stock trading environment that serves as a platform for the agent to trade stocks. In this paper, we introduce a Reinforcement Learning based approach to develop a trading agent that performs trading actions on the environment and learns from the rewards, in terms of profit or loss, that it receives. We apply different on-policy Reinforcement Learning algorithms, namely Vanilla Policy Gradient (VPG), Trust Region Policy Optimization (TRPO) and Proximal Policy Optimization (PPO), to the environment to obtain profits while trading the stocks of three companies: Apple, Microsoft and Nike. The performance of these algorithms in maximizing profit is evaluated, and the results and conclusions are elaborated.
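To make the MDP formulation described above concrete, the sketch below shows a minimal trading environment in the usual reset/step style: the state is a window of recent prices plus the current position, actions are hold/buy/sell, and the reward is the realized profit or loss on a sell. All names, the single-share position limit, and the window size are illustrative assumptions, not the authors' actual environment.

```python
# Minimal MDP-style stock trading environment (illustrative sketch only).
class TradingEnv:
    ACTIONS = ("hold", "buy", "sell")  # action indices 0, 1, 2

    def __init__(self, prices, window=5):
        self.prices = prices   # list of historical closing prices
        self.window = window   # observation lookback length
        self.reset()

    def reset(self):
        self.t = self.window   # current time step
        self.position = 0      # shares held (0 or 1 in this sketch)
        self.entry_price = 0.0
        return self._state()

    def _state(self):
        # Observation: last `window` prices plus a position flag.
        return self.prices[self.t - self.window:self.t] + [self.position]

    def step(self, action):
        price = self.prices[self.t]
        reward = 0.0
        if action == 1 and self.position == 0:    # buy one share
            self.position, self.entry_price = 1, price
        elif action == 2 and self.position == 1:  # sell the share
            reward = price - self.entry_price     # realized profit/loss
            self.position = 0
        self.t += 1
        done = self.t >= len(self.prices)
        return self._state(), reward, done
```

An on-policy agent (VPG, TRPO, or PPO) would repeatedly roll out episodes in such an environment and update its policy from the collected (state, action, reward) trajectories.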