{"title":"流动性随时间变化时优化执行的强化学习","authors":"Andrea Macrì, Fabrizio Lillo","doi":"arxiv-2402.12049","DOIUrl":null,"url":null,"abstract":"Optimal execution is an important problem faced by any trader. Most solutions\nare based on the assumption of constant market impact, while liquidity is known\nto be dynamic. Moreover, models with time-varying liquidity typically assume\nthat it is observable, despite the fact that, in reality, it is latent and hard\nto measure in real time. In this paper we show that the use of Double Deep\nQ-learning, a form of Reinforcement Learning based on neural networks, is able\nto learn optimal trading policies when liquidity is time-varying. Specifically,\nwe consider an Almgren-Chriss framework with temporary and permanent impact\nparameters following several deterministic and stochastic dynamics. Using\nextensive numerical experiments, we show that the trained algorithm learns the\noptimal policy when the analytical solution is available, and overcomes\nbenchmarks and approximated solutions when the solution is not available.","PeriodicalId":501478,"journal":{"name":"arXiv - QuantFin - Trading and Market Microstructure","volume":"35 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-02-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Reinforcement Learning for Optimal Execution when Liquidity is Time-Varying\",\"authors\":\"Andrea Macrì, Fabrizio Lillo\",\"doi\":\"arxiv-2402.12049\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Optimal execution is an important problem faced by any trader. Most solutions\\nare based on the assumption of constant market impact, while liquidity is known\\nto be dynamic. Moreover, models with time-varying liquidity typically assume\\nthat it is observable, despite the fact that, in reality, it is latent and hard\\nto measure in real time. In this paper we show that the use of Double Deep\\nQ-learning, a form of Reinforcement Learning based on neural networks, is able\\nto learn optimal trading policies when liquidity is time-varying. Specifically,\\nwe consider an Almgren-Chriss framework with temporary and permanent impact\\nparameters following several deterministic and stochastic dynamics. 
Using\\nextensive numerical experiments, we show that the trained algorithm learns the\\noptimal policy when the analytical solution is available, and overcomes\\nbenchmarks and approximated solutions when the solution is not available.\",\"PeriodicalId\":501478,\"journal\":{\"name\":\"arXiv - QuantFin - Trading and Market Microstructure\",\"volume\":\"35 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-02-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - QuantFin - Trading and Market Microstructure\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2402.12049\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - QuantFin - Trading and Market Microstructure","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2402.12049","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Reinforcement Learning for Optimal Execution when Liquidity is Time-Varying
Optimal execution is an important problem faced by any trader. Most solutions
are based on the assumption of constant market impact, while liquidity is known
to be dynamic. Moreover, models with time-varying liquidity typically assume
that it is observable, despite the fact that, in reality, it is latent and hard
to measure in real time. In this paper we show that Double Deep Q-learning, a
form of Reinforcement Learning based on neural networks, is able to learn
optimal trading policies when liquidity is time-varying. Specifically,
we consider an Almgren-Chriss framework with temporary and permanent impact
parameters following several deterministic and stochastic dynamics. Using
extensive numerical experiments, we show that the trained algorithm learns the
optimal policy when an analytical solution is available, and outperforms
benchmarks and approximate solutions when no analytical solution is available.
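To make the setting concrete, below is a minimal sketch (not the authors' implementation) of an Almgren-Chriss-style liquidation simulation in which the temporary impact parameter follows an assumed stochastic dynamic, evaluated here on a simple TWAP benchmark schedule. The specific impact dynamics, all parameter values, and the `simulate` helper are illustrative assumptions; the paper's actual dynamics and the Double Deep Q-learning agent are not reproduced.

```python
import numpy as np

# Illustrative Almgren-Chriss-style liquidation with a time-varying temporary
# impact parameter. Dynamics and parameter values are assumptions for the sketch.
rng = np.random.default_rng(0)

T, N = 1.0, 50             # trading horizon and number of intervals
dt = T / N
X0 = 1_000.0               # shares to liquidate
S0, sigma = 100.0, 0.5     # initial price and volatility
gamma = 2.5e-4             # permanent impact (held constant in this sketch)
eta0, kappa = 2.5e-3, 0.1  # initial temporary impact and its noise scale


def simulate(schedule):
    """Run one liquidation episode; return the implementation shortfall."""
    S, inventory, eta = S0, X0, eta0
    cost = 0.0
    for n_k in schedule:
        # Temporary impact parameter evolves stochastically (assumed dynamic);
        # in practice it is latent and not directly observed by the trader.
        eta = max(1e-5, eta * np.exp(kappa * np.sqrt(dt) * rng.standard_normal()))
        exec_price = S - eta * n_k / dt          # temporary impact on this trade
        cost += n_k * (S0 - exec_price)          # shortfall vs. arrival price
        # Price diffuses and carries the permanent impact of the trade.
        S += sigma * np.sqrt(dt) * rng.standard_normal() - gamma * n_k
        inventory -= n_k                         # remaining inventory (RL state)
    return cost


# Benchmark: a TWAP schedule trading the same quantity in every interval.
twap = np.full(N, X0 / N)
costs = [simulate(twap) for _ in range(1000)]
print(f"TWAP shortfall: mean={np.mean(costs):.2f}, std={np.std(costs):.2f}")
```

In a reinforcement-learning formulation, the remaining inventory and the time left would typically enter the agent's state, while the impact parameters remain latent and must be inferred indirectly from realized execution costs.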