Deep reinforcement learning for automated stock trading: an ensemble strategy

Proceedings of the First ACM International Conference on AI in Finance, Pub Date: 2020-09-11, DOI: 10.1145/3383455.3422540

Hongyang Yang, Xiao-Yang Liu, Shanli Zhong, A. Walid


Citations: 127

Abstract

Stock trading strategies play a critical role in investment. However, it is challenging to design a profitable strategy in a complex and dynamic stock market. In this paper, we propose an ensemble strategy that employs deep reinforcement learning schemes to learn a stock trading strategy by maximizing investment return. We train a deep reinforcement learning agent and obtain an ensemble trading strategy using three actor-critic-based algorithms: Proximal Policy Optimization (PPO), Advantage Actor Critic (A2C), and Deep Deterministic Policy Gradient (DDPG). The ensemble strategy inherits and integrates the best features of the three algorithms, thereby robustly adjusting to different market situations. To avoid the large memory consumption of training networks with a continuous action space, we employ a load-on-demand technique for processing very large data. We test our algorithms on the 30 Dow Jones stocks, which have adequate liquidity. The performance of the trading agent with different reinforcement learning algorithms is evaluated and compared with both the Dow Jones Industrial Average index and the traditional min-variance portfolio allocation strategy. The proposed deep ensemble strategy is shown to outperform the three individual algorithms and the two baselines in terms of risk-adjusted return as measured by the Sharpe ratio.
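The abstract evaluates strategies by the Sharpe ratio but does not spell out how the ensemble combines its three agents. The following is a minimal sketch of one plausible reading, assuming the ensemble simply picks whichever agent scored the highest Sharpe ratio on a recent validation window. The function names, the 252-day annualization factor, and the return figures are all illustrative assumptions, not values from the paper.

```python
import math
from statistics import mean, stdev


def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualized Sharpe ratio of per-period returns (needs >= 2 values).

    risk_free_rate is an annual rate; 252 trading days per year is an
    assumed convention, not a figure from the abstract.
    """
    per_period_rf = risk_free_rate / periods_per_year
    excess = [r - per_period_rf for r in returns]
    volatility = stdev(excess)
    if volatility == 0:
        return 0.0  # no variation: treat the ratio as zero rather than divide by zero
    return math.sqrt(periods_per_year) * mean(excess) / volatility


def pick_agent(validation_returns):
    """Return the name of the agent with the highest validation Sharpe ratio."""
    return max(validation_returns, key=lambda name: sharpe_ratio(validation_returns[name]))


# Made-up daily returns for each agent over a hypothetical validation window.
validation_returns = {
    "PPO": [0.010, -0.004, 0.006, 0.002],
    "A2C": [0.003, 0.002, 0.004, 0.003],
    "DDPG": [0.020, -0.030, 0.025, -0.010],
}

best = pick_agent(validation_returns)
print(best)
```

In this toy data the steadiest return stream wins even though its average return is not the highest, which is the sense in which the Sharpe ratio rewards risk-adjusted rather than raw return.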

