Objective Driven Portfolio Construction Using Reinforcement Learning

Tina Wang, Jithin Pradeep, Jerry Zikun Chen
DOI: 10.1145/3533271.3561764
Published in: Proceedings of the Third ACM International Conference on AI in Finance
Publication date: 2022-10-26
Citation count: 1

Abstract

Recent advancements in reinforcement learning have enabled robust, data-driven direct optimization of the investor's objectives without estimating stock movements as in the traditional two-step approach [8]. Given diverse investment styles, a single trading strategy cannot serve different investor objectives. We propose an objective function formulation to augment the direct optimization approach in AlphaPortfolio (Cong et al. [6]). In addition to the simple baseline Sharpe ratio used in AlphaPortfolio, we add three investor objectives: (i) achieving excess alpha by maximizing the information ratio; (ii) mitigating downside risk by optimizing maximum drawdown-adjusted return; and (iii) reducing transaction costs by restricting the turnover rate. We also introduce four new features to the framework: momentum, short-term reversal, drawdown, and maximum drawdown. Our objective function formulation allows control of the trade-off of both maximum drawdown and turnover against realized return, creating flexible trading strategies for various risk appetites. The maximum drawdown efficient frontier curve, derived over a range of values of the hyper-parameter α, exhibits a concave relationship similar to that observed in the theoretical study by Chekhlov et al. [5]. To improve the interpretability of the deep neural network and derive insights for traditional factor investing, we further explore the drivers behind the top- and bottom-performing firms through regression analysis with a Random Forest, which achieves an R² of approximately 0.8 in reproducing our model's winner scores. Finally, to uncover the balance between profits and diversification, we investigate the impact of trading size on strategy behavior.
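The objective components named in the abstract (Sharpe ratio, information ratio, maximum drawdown, turnover) can be sketched as follows. This is a minimal illustration of the standard definitions of these metrics, not the paper's exact formulation; the combined `objective` function and its `alpha`/`beta` weights are hypothetical assumptions added here to show how a drawdown- and turnover-penalized reward might be assembled.

```python
import numpy as np

def sharpe_ratio(returns, risk_free=0.0):
    # Mean excess return over its standard deviation (annualization omitted).
    excess = returns - risk_free
    return excess.mean() / excess.std()

def information_ratio(returns, benchmark):
    # Mean active return over tracking error.
    active = returns - benchmark
    return active.mean() / active.std()

def max_drawdown(returns):
    # Largest peak-to-trough decline of the cumulative wealth curve.
    wealth = np.cumprod(1.0 + returns)
    peaks = np.maximum.accumulate(wealth)
    return np.max(1.0 - wealth / peaks)

def turnover(weights):
    # Mean per-period sum of absolute portfolio-weight changes
    # (weights: T x N array of portfolio weights over T periods).
    return np.mean(np.sum(np.abs(np.diff(weights, axis=0)), axis=1))

def objective(returns, weights, alpha=0.5, beta=0.1):
    # Hypothetical combined reward: realized return penalized by
    # maximum drawdown (weight alpha) and turnover (weight beta).
    return returns.mean() - alpha * max_drawdown(returns) - beta * turnover(weights)
```

Sweeping `alpha` over a range and recording (maximum drawdown, return) pairs of the resulting strategies would trace out a frontier curve of the kind the abstract describes.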