Ruoyu Sun (Xi'an Jiaotong-Liverpool University, School of Mathematics and Physics, Department of Financial and Actuarial Mathematics), Angelos Stefanidis (Xi'an Jiaotong-Liverpool University Entrepreneur College), Zhengyong Jiang (Xi'an Jiaotong-Liverpool University Entrepreneur College), Jionglong Su (Xi'an Jiaotong-Liverpool University Entrepreneur College)
{"title":"将基于变压器的深度强化学习与 Black-Litterman 模型相结合,实现投资组合优化","authors":"Ruoyu SunXi'an Jiaotong-Liverpool University, School of Mathematics and Physics, Department of Financial and Actuarial Mathematics, Angelos StefanidisXi'an Jiaotong-Liverpool University Entrepreneur College, Zhengyong JiangXi'an Jiaotong-Liverpool University Entrepreneur College, Jionglong SuXi'an Jiaotong-Liverpool University Entrepreneur College","doi":"arxiv-2402.16609","DOIUrl":null,"url":null,"abstract":"As a model-free algorithm, deep reinforcement learning (DRL) agent learns and\nmakes decisions by interacting with the environment in an unsupervised way. In\nrecent years, DRL algorithms have been widely applied by scholars for portfolio\noptimization in consecutive trading periods, since the DRL agent can\ndynamically adapt to market changes and does not rely on the specification of\nthe joint dynamics across the assets. However, typical DRL agents for portfolio\noptimization cannot learn a policy that is aware of the dynamic correlation\nbetween portfolio asset returns. Since the dynamic correlations among portfolio\nassets are crucial in optimizing the portfolio, the lack of such knowledge\nmakes it difficult for the DRL agent to maximize the return per unit of risk,\nespecially when the target market permits short selling (i.e., the US stock\nmarket). In this research, we propose a hybrid portfolio optimization model\ncombining the DRL agent and the Black-Litterman (BL) model to enable the DRL\nagent to learn the dynamic correlation between the portfolio asset returns and\nimplement an efficacious long/short strategy based on the correlation.\nEssentially, the DRL agent is trained to learn the policy to apply the BL model\nto determine the target portfolio weights. To test our DRL agent, we construct\nthe portfolio based on all the Dow Jones Industrial Average constitute stocks.\nEmpirical results of the experiments conducted on real-world United States\nstock market data demonstrate that our DRL agent significantly outperforms\nvarious comparison portfolio choice strategies and alternative DRL frameworks\nby at least 42% in terms of accumulated return. In terms of the return per unit\nof risk, our DRL agent significantly outperforms various comparative portfolio\nchoice strategies and alternative strategies based on other machine learning\nframeworks.","PeriodicalId":501045,"journal":{"name":"arXiv - QuantFin - Portfolio Management","volume":"40 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Combining Transformer based Deep Reinforcement Learning with Black-Litterman Model for Portfolio Optimization\",\"authors\":\"Ruoyu SunXi'an Jiaotong-Liverpool University, School of Mathematics and Physics, Department of Financial and Actuarial Mathematics, Angelos StefanidisXi'an Jiaotong-Liverpool University Entrepreneur College, Zhengyong JiangXi'an Jiaotong-Liverpool University Entrepreneur College, Jionglong SuXi'an Jiaotong-Liverpool University Entrepreneur College\",\"doi\":\"arxiv-2402.16609\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"As a model-free algorithm, deep reinforcement learning (DRL) agent learns and\\nmakes decisions by interacting with the environment in an unsupervised way. 
In\\nrecent years, DRL algorithms have been widely applied by scholars for portfolio\\noptimization in consecutive trading periods, since the DRL agent can\\ndynamically adapt to market changes and does not rely on the specification of\\nthe joint dynamics across the assets. However, typical DRL agents for portfolio\\noptimization cannot learn a policy that is aware of the dynamic correlation\\nbetween portfolio asset returns. Since the dynamic correlations among portfolio\\nassets are crucial in optimizing the portfolio, the lack of such knowledge\\nmakes it difficult for the DRL agent to maximize the return per unit of risk,\\nespecially when the target market permits short selling (i.e., the US stock\\nmarket). In this research, we propose a hybrid portfolio optimization model\\ncombining the DRL agent and the Black-Litterman (BL) model to enable the DRL\\nagent to learn the dynamic correlation between the portfolio asset returns and\\nimplement an efficacious long/short strategy based on the correlation.\\nEssentially, the DRL agent is trained to learn the policy to apply the BL model\\nto determine the target portfolio weights. To test our DRL agent, we construct\\nthe portfolio based on all the Dow Jones Industrial Average constitute stocks.\\nEmpirical results of the experiments conducted on real-world United States\\nstock market data demonstrate that our DRL agent significantly outperforms\\nvarious comparison portfolio choice strategies and alternative DRL frameworks\\nby at least 42% in terms of accumulated return. In terms of the return per unit\\nof risk, our DRL agent significantly outperforms various comparative portfolio\\nchoice strategies and alternative strategies based on other machine learning\\nframeworks.\",\"PeriodicalId\":501045,\"journal\":{\"name\":\"arXiv - QuantFin - Portfolio Management\",\"volume\":\"40 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-02-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - QuantFin - Portfolio Management\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2402.16609\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - QuantFin - Portfolio Management","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2402.16609","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Combining Transformer based Deep Reinforcement Learning with Black-Litterman Model for Portfolio Optimization
As a model-free algorithm, a deep reinforcement learning (DRL) agent learns and
makes decisions by interacting with the environment in an unsupervised way. In
recent years, DRL algorithms have been widely applied by scholars for portfolio
optimization in consecutive trading periods, since the DRL agent can
dynamically adapt to market changes and does not rely on the specification of
the joint dynamics across the assets. However, typical DRL agents for portfolio
optimization cannot learn a policy that is aware of the dynamic correlation
between portfolio asset returns. Since the dynamic correlations among portfolio
assets are crucial in optimizing the portfolio, the lack of such knowledge
makes it difficult for the DRL agent to maximize the return per unit of risk,
especially when the target market permits short selling (e.g., the US stock
market). In this research, we propose a hybrid portfolio optimization model
combining the DRL agent and the Black-Litterman (BL) model to enable the DRL
agent to learn the dynamic correlation between the portfolio asset returns and
implement an effective long/short strategy based on the correlation.
Essentially, the DRL agent is trained to learn a policy that applies the BL model
to determine the target portfolio weights. To test our DRL agent, we construct
the portfolio from all the Dow Jones Industrial Average constituent stocks.
Empirical results of the experiments conducted on real-world United States
stock market data demonstrate that our DRL agent significantly outperforms
various comparative portfolio choice strategies and alternative DRL frameworks
by at least 42% in terms of accumulated return. In terms of the return per unit
of risk, our DRL agent significantly outperforms various comparative portfolio
choice strategies and alternative strategies based on other machine learning
frameworks.
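
To make the role of the BL model concrete, the sketch below (not the paper's implementation) shows the standard Black-Litterman computation that turns a set of views into posterior expected returns and unconstrained long/short weights; in the proposed framework, the views would be supplied by the transformer-based DRL agent's policy. The function name, the diagonal choice of the view-uncertainty matrix Omega, and the values of tau and delta are illustrative assumptions, not details taken from the paper.

```python
# Minimal Black-Litterman sketch: given a view matrix P and view returns Q
# (which, in the proposed framework, the DRL agent would output), compute the
# posterior expected returns and the implied long/short weights.
import numpy as np

def black_litterman_weights(Sigma, w_mkt, P, Q, tau=0.05, delta=2.5):
    """Posterior BL weights; short positions arise naturally from bearish views.

    Sigma : (n, n) covariance matrix of asset returns
    w_mkt : (n,)  market-capitalization weights (prior portfolio)
    P     : (k, n) view-picking matrix
    Q     : (k,)  expected returns of the k views
    """
    pi = delta * Sigma @ w_mkt                       # equilibrium (prior) returns
    Omega = np.diag(np.diag(tau * P @ Sigma @ P.T))  # view uncertainty (diagonal assumption)
    tau_Sigma_inv = np.linalg.inv(tau * Sigma)
    Omega_inv = np.linalg.inv(Omega)
    # Posterior mean: blend equilibrium returns with the views
    A = tau_Sigma_inv + P.T @ Omega_inv @ P
    b = tau_Sigma_inv @ pi + P.T @ Omega_inv @ Q
    mu_bl = np.linalg.solve(A, b)
    # Unconstrained mean-variance weights (long/short permitted)
    w = np.linalg.solve(delta * Sigma, mu_bl)
    return w, mu_bl

# Toy usage: 3 assets, one relative view "asset 0 outperforms asset 1 by 2%"
Sigma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.16]])
w_mkt = np.array([0.5, 0.3, 0.2])
P = np.array([[1.0, -1.0, 0.0]])
Q = np.array([0.02])
w, mu = black_litterman_weights(Sigma, w_mkt, P, Q)
print("posterior long/short weights:", np.round(w, 3))
```

In this reading, the agent's action at each trading period parameterizes the views fed to the BL model rather than the portfolio weights directly, and the BL posterior step is what converts those views into weights that may include short positions.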