Onflow: an online portfolio allocation algorithm
Gabriel Turinici, Pierre Brugiere
arXiv - QuantFin - Portfolio Management, published 2023-12-08
https://doi.org/arxiv-2312.05169
Citations: 0
Abstract
We introduce Onflow, a reinforcement learning technique that enables online
optimization of portfolio allocation policies based on gradient flows. We
devise dynamic allocations of an investment portfolio to maximize its expected
log return while taking into account transaction fees. The portfolio allocation
is parameterized through a softmax function, and at each time step, the
gradient flow method leads to an ordinary differential equation whose solutions
correspond to the updated allocations. This algorithm belongs to the large
class of stochastic optimization procedures; we measure its efficiency by
comparing our results to the theoretical values available in a log-normal
framework and to standard benchmarks from the 'old NYSE' dataset. For
log-normal assets, the strategy learned by Onflow, with transaction costs at
zero, mimics Markowitz's optimal portfolio and thus the best possible asset
allocation strategy. Numerical experiments from the 'old NYSE' dataset show
that Onflow leads to dynamic asset allocation strategies whose performance
is: a) comparable to benchmark strategies such as Cover's Universal Portfolio
or the "multiplicative updates" approach of Helmbold et al. when transaction
costs are zero, and b) better than previous procedures when transaction costs
are high.
Onflow can even remain efficient in regimes where other dynamic allocation
techniques no longer work. Therefore, as far as tested, Onflow appears to
be a promising dynamic portfolio management strategy that relies on observed
prices only, without any assumption on the distributions of the underlying
assets' returns. In particular, it could avoid model risk when building a
trading strategy.
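
The ingredients named in the abstract (a softmax-parameterized allocation, a log-return objective net of transaction fees, and an online gradient-based update) can be illustrated with a minimal sketch. This is not the authors' Onflow update: the abstract says Onflow obtains each new allocation by solving an ordinary differential equation from a gradient flow, whereas the sketch below substitutes a plain explicit gradient-ascent step. The proportional fee on turnover, the step size, and all function names (`softmax`, `reward`, `reward_grad`, `online_allocate`) are assumptions made for illustration only.

```python
import numpy as np


def softmax(theta):
    """Map unconstrained parameters to allocations that are positive and sum to 1."""
    z = np.exp(theta - np.max(theta))  # shift by the max for numerical stability
    return z / z.sum()


def reward(theta, prev_alloc, price_rel, fee):
    """One-period log return minus a proportional fee on turnover.

    ASSUMPTION: the fee model (fee * sum |pi - prev_alloc|) is a simplification
    chosen for this sketch; it is not taken from the paper.
    """
    pi = softmax(theta)
    turnover = np.abs(pi - prev_alloc).sum()
    return np.log(pi @ price_rel) - fee * turnover


def reward_grad(theta, prev_alloc, price_rel, fee):
    """Gradient of `reward` with respect to theta (chain rule through softmax)."""
    pi = softmax(theta)
    # Gradient with respect to the allocation pi (subgradient at the kink pi == prev).
    g_pi = price_rel / (pi @ price_rel) - fee * np.sign(pi - prev_alloc)
    # Softmax Jacobian is diag(pi) - pi pi^T, so J^T g = pi * (g - pi . g).
    return pi * (g_pi - pi @ g_pi)


def online_allocate(price_relatives, fee=0.0, step=0.1):
    """Run the online loop over a (T, K) array of price relatives.

    ASSUMPTION: an explicit gradient-ascent step stands in for the
    gradient-flow ODE described in the abstract.
    Returns the final allocation and the cumulative net log wealth.
    """
    _, n_assets = price_relatives.shape
    theta = np.zeros(n_assets)        # start from the uniform allocation
    prev = softmax(theta)
    log_wealth = 0.0
    for x_t in price_relatives:
        pi = softmax(theta)           # allocation chosen before x_t is observed
        log_wealth += reward(theta, prev, x_t, fee)
        theta = theta + step * reward_grad(theta, prev, x_t, fee)
        prev = pi
    return softmax(theta), log_wealth
```

Calling `online_allocate(x, fee=0.001)` on a `(T, K)` array `x` of price relatives (price at time t+1 divided by price at time t) returns the last allocation and the accumulated net log wealth. The softmax keeps the allocation on the simplex without any projection step, which is the role the abstract assigns to that parameterization.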