A Portfolio Model Based on BiLSTM Prediction and Improved PPO Algorithm

2022 International Symposium on Intelligent Robotics and Systems (ISoIRS). Pub Date: 2022-10-01. DOI: 10.1109/isoirs57349.2022.00032

Xuan Zhang, Junjie Cai, Xinyue Dai, Lifeng Zhang

Citations: 0

Abstract

This paper develops a portfolio investment model that maximizes traders’ returns on gold and bitcoin. We establish a Proximal Policy Optimization algorithm based on a Bidirectional Long Short-Term Memory network (BiLSTM-PPO). We then improve the PPO algorithm by introducing a penalty factor that enforces a strategy in which gold is not traded on non-trading days. Finally, the proposed BiLSTM-PPO algorithm learns from a state vector composed of historical data, covariance, and BiLSTM prediction results to obtain the optimal trading strategy. A portfolio selection case illustrates the application process and effectiveness of the method. We compare BiLSTM-PPO with traditional PPO to demonstrate its effectiveness: even in the worst case, the final income is 6.28% higher than that of traditional PPO. We then compare BiLSTM-PPO with five common investment decision algorithms, including Min-Variance and Deep Deterministic Policy Gradient, on six financial metrics to demonstrate the optimality of our model. The experimental results show that BiLSTM-PPO achieves the highest final revenue and that the strategy provided by the algorithm adapts to maximize returns.
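The abstract names two ingredients without giving formulas: a state vector built from historical data, covariance, and BiLSTM predictions, and a penalty factor that discourages gold trades on non-trading days. The following is a minimal sketch of how such pieces might look, not the authors' implementation. The state layout, the use of simple returns, the linear penalty form, and every function and parameter name (`build_state`, `penalized_reward`, `penalty_factor`, and so on) are assumptions made for illustration. The forecasts are passed in as plain numbers, so the BiLSTM itself is not modeled here.

```python
from typing import List, Sequence


def simple_returns(prices: Sequence[float]) -> List[float]:
    """Day-over-day simple returns of one price series."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]


def covariance(x: Sequence[float], y: Sequence[float]) -> float:
    """Sample covariance (n - 1 denominator) of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def build_state(gold: Sequence[float], btc: Sequence[float],
                pred_gold: float, pred_btc: float) -> List[float]:
    """Concatenate price history, return covariances, and forecasts.

    Assumed layout (hypothetical): gold window, bitcoin window,
    [var(gold), cov(gold, btc), var(btc)], then the two forecasts.
    Each window needs at least three prices so the covariance is defined.
    """
    r_gold, r_btc = simple_returns(gold), simple_returns(btc)
    cov_terms = [covariance(r_gold, r_gold),
                 covariance(r_gold, r_btc),
                 covariance(r_btc, r_btc)]
    return list(gold) + list(btc) + cov_terms + [pred_gold, pred_btc]


def penalized_reward(portfolio_return: float, gold_trade: float,
                     gold_market_open: bool,
                     penalty_factor: float = 1.0) -> float:
    """Reward with an assumed linear penalty on gold traded while the market is closed.

    gold_trade is the signed amount of gold the agent tried to buy or sell.
    """
    if gold_market_open:
        return portfolio_return
    return portfolio_return - penalty_factor * abs(gold_trade)
```

With a three-day window per asset, `build_state` returns an 11-element vector (3 + 3 prices, 3 covariance terms, 2 forecasts). A larger `penalty_factor` makes attempted gold trades on closed days increasingly costly, which is one plausible way a policy could learn to hold gold unchanged on those days.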
